Appendix B

Simple matrices

Mathematicians also attempted to develop an algebra of vectors, but there was no natural definition of the product of two vectors that held in arbitrary dimensions. The first vector algebra that involved a noncommutative vector product (that is, v×w need not equal w×v) was proposed by Hermann Grassmann in his book Ausdehnungslehre (1844). Grassmann's text also introduced the product of a column matrix and a row matrix, which resulted in what is now called a simple or rank-one matrix. In the late 19th century the American mathematical physicist Willard Gibbs published his famous treatise on vector analysis. In that treatise Gibbs represented general matrices, which he called dyadics, as sums of simple matrices, which Gibbs called dyads. Later the physicist P. A. M. Dirac introduced the term "bra-ket" for what we now call the product of a "bra" (row) vector times a "ket" (column) vector and the term "ket-bra" for the product of a ket times a bra, resulting in what we now call a simple matrix, as above. Our convention of identifying column matrices and vectors was introduced by physicists in the 20th century.
    −Suddhendu Biswas [52, p.2]

B.1 Rank-1 matrix (dyad)

Any matrix formed from the unsigned outer product of two vectors,

    Ψ = uv^T ∈ R^{M×N}        (1783)

where u ∈ R^M and v ∈ R^N, is rank-1 and called a dyad. Conversely, any rank-1 matrix must have the form Ψ. [233, prob.1.4.1] Product −uv^T is a negative dyad. For matrix products AB^T, in general, we have

    R(AB^T) ⊆ R(A) ,   N(AB^T) ⊇ N(B^T)        (1784)

with equality when B = A^B.1 [374, §3.3, §3.6] or, respectively, when B is invertible and N(A) = 0. Yet for all nonzero dyads we have

    R(uv^T) = R(u) ,   N(uv^T) = N(v^T) ≡ v^⊥        (1785)

B.1 Proof. R(AA^T) ⊆ R(A) is obvious.
    R(AA^T) = {AA^T y | y ∈ R^m}
            ⊇ {AA^T y | A^T y ∈ R(A^T)} = R(A)   by (146)   ¨
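A brief numerical illustration of (1784) and (1785); a sketch in Matlab assuming only base functions and randomly chosen u and v:

% Range and nullspace of a dyad: R(u*v') is spanned by u, and
% everything orthogonal to v lies in N(u*v').
u = randn(4,1);  v = randn(3,1);
Psi = u*v';
rank(Psi)                      % 1
norm(Psi*null(v'))             % ~0 : v-perp is annihilated, (1785)
rank([Psi*randn(3,2) u])       % 1 : columns of Psi*X stay in R(u), (1785)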


Figure 183: The four fundamental subspaces [376, §3.6] for any dyad Ψ = uv^T ∈ R^{M×N}: R(v) ⊥ N(Ψ) and N(u^T) ⊥ R(Ψ); the diagram shows R^N = R(v) ⊞ N(uv^T) and N(u^T) ⊞ R(uv^T) = R^M. Ψ(x) ≜ uv^T x is a linear mapping from R^N to R^M. The map from R(v) to R(u) is bijective. [374, §3.1]

where dim v^⊥ = N−1. It is obvious that a dyad can be 0 only when u or v is 0;

    Ψ = uv^T = 0 ⇔ u = 0 or v = 0        (1786)

The matrix 2-norm for Ψ is equivalent to the Frobenius norm;

    ‖Ψ‖_2 = σ_1 = ‖uv^T‖_F = ‖uv^T‖_2 = ‖u‖ ‖v‖        (1787)

When u and v are normalized, the pseudoinverse is the transposed dyad. Otherwise,

    Ψ† = (uv^T)† = vu^T/(‖u‖² ‖v‖²)        (1788)
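A brief numerical check of (1787) and (1788); a sketch assuming randomly chosen u and v:

% Dyad norm and pseudoinverse.
u = randn(5,1);  v = randn(3,1);
Psi = u*v';
norm(Psi,2) - norm(u)*norm(v)                    % ~0 : (1787)
norm(Psi,'fro') - norm(u)*norm(v)                % ~0 : (1787)
norm(pinv(Psi) - v*u'/(norm(u)^2*norm(v)^2))     % ~0 : (1788)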

When dyad uv^T ∈ R^{N×N} is square, uv^T has at least N−1 0-eigenvalues and corresponding eigenvectors spanning v^⊥. The remaining eigenvector u spans the range of uv^T with corresponding eigenvalue

    λ = v^T u = tr(uv^T) ∈ R        (1789)

The determinant is a product of the eigenvalues; so, it is always true that

    det Ψ = det(uv^T) = 0        (1790)

When λ = 1, the square dyad is a nonorthogonal projector projecting on its range (Ψ² = Ψ, §E.6); a projector dyad. It is quite possible that u ∈ v^⊥, making the remaining eigenvalue instead 0;^B.2 λ = 0 together with the first N−1 0-eigenvalues; id est, it is possible uv^T were nonzero while all its eigenvalues are 0. The matrix

    [ 1 ; 1 ][ 1  −1 ] = [ 1  −1 ; 1  −1 ]        (1791)

for example, has two 0-eigenvalues. In other words, eigenvector u may simultaneously be a member of the nullspace and range of the dyad. The explanation is, simply, because u and v share the same dimension, dim u = M = dim v = N :

B.2 A dyad is not always diagonalizable (§A.5) because its eigenvectors are not necessarily independent.

Proof. Figure 183 shows the four fundamental subspaces for the dyad. Linear operator Ψ : R^N → R^M provides a map between vector spaces that remain distinct when M ≠ N;

    u ∈ R(uv^T)
    u ∈ N(uv^T) ⇔ v^T u = 0        (1792)
    R(uv^T) ∩ N(uv^T) = ∅                ¨
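A quick numerical confirmation of (1791) and (1792); a sketch assuming only base Matlab:

% The nonzero dyad (1791) has every eigenvalue 0: u lies in both the
% range and the nullspace of u*v' because v'*u = 0.
u = [1; 1];  v = [1; -1];
Psi = u*v'                 % [1 -1; 1 -1], nonzero
eig(Psi)'                  % both eigenvalues are 0
v'*u                       % 0 : u is in N(u*v'), (1792)
norm(Psi*v - (v'*v)*u)     % 0 : u is also in R(u*v')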

B.1.0.1 rank-1 modification

For A ∈ R^{M×N}, x ∈ R^N, y ∈ R^M, and y^T A x ≠ 0,^B.3 [238, §2.1]

    rank( A − A x y^T A/(y^T A x) ) = rank(A) − 1        (1794)

Given nonsingular matrix A ∈ R^{N×N} such that 1 + v^T A^{−1} u ≠ 0, [178, §4.11.2] [253, App.6] [463, §2.3 prob.16] (Sherman-Morrison-Woodbury)

    (A ± uv^T)^{−1} = A^{−1} ∓ A^{−1} u v^T A^{−1}/(1 ± v^T A^{−1} u)        (1795)
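A numerical sanity check of (1795), with the + sign; a sketch assuming a random, almost surely nonsingular A:

% Sherman-Morrison formula.
N = 5;
A = randn(N) + N*eye(N);
u = randn(N,1);  v = randn(N,1);
lhs = inv(A + u*v');
rhs = inv(A) - inv(A)*u*v'*inv(A)/(1 + v'*inv(A)*u);
norm(lhs - rhs)                % ~0 : (1795)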

B.1.0.2 dyad symmetry

In the specific circumstance that v = u, then uu^T ∈ R^{N×N} is symmetric, rank-1, and positive semidefinite having exactly N−1 0-eigenvalues. In fact, (Theorem A.3.1.0.7)

    uv^T ⪰ 0 ⇔ v = u        (1796)

and the remaining eigenvalue is almost always positive;

    λ = u^T u = tr(uu^T) > 0   unless u = 0        (1797)

When λ = 1, the dyad becomes an orthogonal projector. Matrix

    [ Ψ   u ; u^T   1 ]        (1798)

for example, is rank-1 positive semidefinite if and only if Ψ = uu^T.
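A numerical sketch of the symmetric case; when u is normalized (λ = 1), the dyad uu^T is an orthogonal projector and (1798) is positive semidefinite:

% Symmetric dyad as an orthogonal projector.
u = randn(4,1);  u = u/norm(u);
P = u*u';
norm(P*P - P)                % ~0 : idempotent
norm(P - P')                 % 0  : symmetric
sort(eig(P))'                % eigenvalues {0,0,0,1}
min(eig([P u; u' 1]))        % ~0 : (1798) is PSD when Psi = uu'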

B.1.1 Dyad independence

Now we consider a sum of dyads like (1783) as encountered in diagonalization and singular value decomposition:

    R( Σ_{i=1}^k s_i w_i^T ) = Σ_{i=1}^k R( s_i w_i^T ) = Σ_{i=1}^k R(s_i)  ⇐  w_i ∀i are l.i.        (1799)

B.3 This rank-1 modification formula has a Schur progenitor, in the symmetric case:

    minimize_c   c
    subject to   [ A   A x ; y^T A   c ] ⪰ 0        (1793)

has analytical solution by (1679b): c ≥ y^T A A† A x = y^T A x. Difference A − A x y^T A/(y^T A x) comes from (1679c). Rank modification is provable via Theorem A.4.0.1.3.

Range of summation is the vector sum of ranges.^B.4 (Theorem B.1.1.1.1) Under the assumption the dyads are linearly independent (l.i.), then vector sums are unique (p.631): for {w_i} l.i. and {s_i} l.i.

    R( Σ_{i=1}^k s_i w_i^T ) = R(s_1 w_1^T) ⊕ ... ⊕ R(s_k w_k^T) = R(s_1) ⊕ ... ⊕ R(s_k)        (1800)

B.1.1.0.1 Definition. Linearly independent dyads. [243, p.29 thm.11] [382, p.2]
The set of k dyads

    { s_i w_i^T | i = 1 ... k }        (1801)

where s_i ∈ C^M and w_i ∈ C^N, is said to be linearly independent iff

    rank( SW^T ≜ Σ_{i=1}^k s_i w_i^T ) = k        (1802)

where S ≜ [ s_1 ··· s_k ] ∈ C^{M×k} and W ≜ [ w_1 ··· w_k ] ∈ C^{N×k}.    △

Dyad independence does not preclude existence of a nullspace N(SW^T), as defined, nor does it imply SW^T were full-rank. In absence of assumption of independence, generally, rank SW^T ≤ k. Conversely, any rank-k matrix can be written in the form SW^T by singular value decomposition. (§A.6)
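A small numerical illustration of Definition B.1.1.0.1; a sketch with randomly chosen factors:

% Linearly independent dyads: rank(S*W') = k.
M = 6;  N = 5;  k = 3;
S = randn(M,k);  W = randn(N,k);   % almost surely full column rank
rank(S*W')                         % 3 : the k dyads s_i*w_i' are independent
W(:,3) = W(:,1);                   % make the w_i linearly dependent
rank(S*W')                         % 2 : dyads no longer independent (Theorem B.1.1.0.2)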

B.1.1.0.2 Theorem. Linearly independent (l.i.) dyads.
Vectors { s_i ∈ C^M, i = 1 ... k } are l.i. and vectors { w_i ∈ C^N, i = 1 ... k } are l.i. if and only if dyads { s_i w_i^T ∈ C^{M×N}, i = 1 ... k } are l.i.    ⋄

Proof. Linear independence of k dyads is identical to definition (1802).
(⇒) Suppose {s_i} and {w_i} are each linearly independent sets. Invoking Sylvester's rank inequality, [233, §0.4] [463, §2.4]

    rank S + rank W − k ≤ rank(SW^T) ≤ min{ rank S, rank W } (≤ k)        (1803)

Then k ≤ rank(SW^T) ≤ k which implies the dyads are independent.
(⇐) Conversely, suppose rank(SW^T) = k. Then

    k ≤ min{ rank S, rank W } ≤ k        (1804)

implying the vector sets are each independent.    ¨

B.1.1.1 Biorthogonality condition, Range and Nullspace of Sum

Dyads characterized by biorthogonality condition W^T S = I are independent; id est, for S ∈ C^{M×k} and W ∈ C^{N×k}, if W^T S = I then rank(SW^T) = k by the linearly independent dyads theorem because (confer §E.1.1)

    W^T S = I ⇒ rank S = rank W = k ≤ M = N        (1805)

To see that, we need only show: N(S) = 0 ⇔ ∃B such that BS = I.^B.5
(⇐) Assume BS = I. Then N(BS) = 0 = {x | BSx = 0} ⊇ N(S). (1784)

B.4 Move of range R(·) to inside summation admitted by linear independence of {w_i}.
B.5 Left inverse is not unique, in general.

Figure 184: The four fundamental subspaces [376, §3.6] for doublet Π = uv^T + vu^T ∈ S^N: R(Π) = R([ u v ]), N(Π) = v^⊥ ∩ u^⊥, and R^N = R([ u v ]) ⊞ N([ v^T ; u^T ]). Π(x) = (uv^T + vu^T)x is a linear bijective mapping from R([ u v ]) to R([ u v ]).

(⇒) If N(S) = 0 then S must be full-rank thin-or-square.

    ∴ ∃ A, B, C such that [ B ; C ][ S  A ] = I (id est, [ S  A ] is invertible) ⇒ BS = I.

Left inverse B is given as W^T here. Because of reciprocity with S, it immediately follows: N(W) = 0 ⇔ ∃S such that S^T W = I.    ¨

Dyads produced by diagonalization, for example, are independent because of their inherent

biorthogonality. (§A.5.0.3) The converse is generally false; id est, linearly independent dyads are not necessarily biorthogonal.
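A small numerical sketch of that biorthogonality; it assumes a random, almost surely diagonalizable matrix:

% Dyads from diagonalization A = S*Lambda*inv(S) are biorthogonal:
% with W' = inv(S), we have W'*S = I and A = sum_i lambda_i*s_i*w_i'.
A = randn(4);
[S, Lambda] = eig(A);
W = inv(S)';
norm(W'*S - eye(4))            % ~0 : biorthogonality
Asum = zeros(4);
for i = 1:4
    Asum = Asum + Lambda(i,i)*S(:,i)*W(:,i)';
end
norm(A - Asum)                 % ~0 : A recovered as a sum of k independent dyads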

B.1.1.1.1 Theorem. Nullspace and range of dyad sum.
Given a sum of dyads represented by SW^T where S ∈ C^{M×k} and W ∈ C^{N×k}

    N(SW^T) = N(W^T)  ⇐  ∃B such that BS = I
                                                        (1806)
    R(SW^T) = R(S)    ⇐  ∃Z such that W^T Z = I
                                                        ⋄

Proof. (⇒) N(SW^T) ⊇ N(W^T) and R(SW^T) ⊆ R(S) are obvious.
(⇐) Assume existence of a left inverse B (BS = I) and a right inverse Z (W^T Z = I).^B.6

    N(SW^T) = {x | SW^T x = 0} ⊆ {x | BSW^T x = 0} = N(W^T)        (1807)

    R(SW^T) = {SW^T x | x ∈ R^N} ⊇ {SW^T Z y | Z y ∈ R^N} = R(S)        (1808)
                                                                            ¨

B.2 Doublet

Consider a sum of two linearly independent square dyads, one a transposition of the other:

    Π = uv^T + vu^T = [ u v ][ v^T ; u^T ] = SW^T ∈ S^N        (1809)

where u, v ∈ R^N. Like the dyad, a doublet can be 0 only when u or v is 0;

    Π = uv^T + vu^T = 0 ⇔ u = 0 or v = 0        (1810)

By assumption of independence, a nonzero doublet has two nonzero eigenvalues

    λ_1 ≜ u^T v + ‖uv^T‖ ,   λ_2 ≜ u^T v − ‖uv^T‖        (1811)

B.6 By counterexample, the theorem's converse cannot be true; e.g, S = W = [ 1 0 ].

Figure 185: v^T u = 1/ζ. The four fundamental subspaces [376, §3.6] for elementary matrix E as a linear mapping E(x) = (I − uv^T/(v^T u)) x: N(E) = R(u), R(E) = N(v^T), R^N = N(u^T) ⊞ N(E) = R(v) ⊞ R(E).

where λ_1 > 0 > λ_2, with corresponding eigenvectors

    x_1 ≜ u/‖u‖ + v/‖v‖ ,   x_2 ≜ u/‖u‖ − v/‖v‖        (1812)

spanning the doublet range. Eigenvalue λ_1 cannot be 0 unless u and v have opposing directions, but that is antithetical since then the dyads would no longer be independent. Eigenvalue λ_2 is 0 if and only if u and v share the same direction, again antithetical. Generally we have λ_1 > 0 and λ_2 < 0, so Π is indefinite. By the nullspace and range of dyad sum theorem, doublet Π has N−2 zero-eigenvalues remaining and corresponding eigenvectors spanning N([ v^T ; u^T ]). We therefore have

    R(Π) = R([ u v ]) ,   N(Π) = v^⊥ ∩ u^⊥        (1813)

of respective dimension 2 and N−2. (Figure 184)
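A numerical check of the doublet spectrum (1811)-(1813); a sketch with randomly chosen u and v:

% Doublet Pi = u*v' + v*u' : eigenvalues u'*v +/- norm(u)*norm(v),
% the rest are 0; rank is 2.
N = 5;
u = randn(N,1);  v = randn(N,1);
Pi = u*v' + v*u';
e = sort(eig(Pi));
[e(1) e(end)]                                        % lambda_2 and lambda_1
[u'*v - norm(u)*norm(v), u'*v + norm(u)*norm(v)]     % matches (1811)
rank(Pi)                                             % 2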

B.3 Elementary matrix

A matrix of the form

    E = I − ζ uv^T ∈ R^{N×N}        (1814)

where ζ ∈ R is finite and u, v ∈ R^N, is called an elementary matrix or a rank-1 modification of the Identity. [235] Any elementary matrix in R^{N×N} has N−1 eigenvalues equal to 1 corresponding to real eigenvectors that span v^⊥. The remaining eigenvalue

    λ = 1 − ζ v^T u        (1815)

corresponds to eigenvector u.^B.7 From [253, App.7.A.26] the determinant:

    det E = 1 − tr(ζ uv^T) = λ        (1816)

If λ ≠ 0 then E is invertible; [178] (confer §B.1.0.1)

    E^{−1} = I + (ζ/λ) uv^T        (1817)

Eigenvectors corresponding to 0 eigenvalues belong to N(E), and the number of 0 eigenvalues must be at least dim N(E) which, here, can be at most one.

B.7 Elementary matrix E is not always diagonalizable because eigenvector u need not be independent of the others; id est, u ∈ v^⊥ is possible.
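A brief numerical check of (1815)-(1817); a sketch with random data:

% Elementary matrix E = I - zeta*u*v'.
N = 4;  zeta = 0.7;
u = randn(N,1);  v = randn(N,1);
E = eye(N) - zeta*u*v';
lambda = 1 - zeta*v'*u;
det(E) - lambda                                   % ~0 : (1816)
norm(inv(E) - (eye(N) + (zeta/lambda)*u*v'))      % ~0 : (1817)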

(§A.7.3.0.1) The nullspace exists, therefore, when λ = 0; id est, when v^T u = 1/ζ; rather, whenever u belongs to hyperplane {z ∈ R^N | v^T z = 1/ζ}. Then (when λ = 0) elementary matrix E is a nonorthogonal projector projecting on its range (E² = E, §E.1) and N(E) = R(u); eigenvector u spans the nullspace when it exists. By conservation of dimension, dim R(E) = N − dim N(E). It is apparent from (1814) that v^⊥ ⊆ R(E), but dim v^⊥ = N−1. Hence R(E) ≡ v^⊥ when the nullspace exists, and the remaining eigenvectors span it. In summary, when a nontrivial nullspace of E exists,

    R(E) = N(v^T) ,   N(E) = R(u) ,   v^T u = 1/ζ        (1818)

illustrated in Figure 185, which is opposite to the assignment of subspaces for a dyad (Figure 183). Otherwise, R(E) = R^N.
When E = E^T, the spectral norm is

    ‖E‖_2 = max{ 1, |λ| }        (1819)

B.3.1 Householder matrix

An elementary matrix is called a Householder matrix when it has the defining form, for

nonzero vector u [185, §5.1.2] [178, §4.10.1] [374, §7.3] [233, §2.2]

    H = I − 2 uu^T/(u^T u) ∈ S^N        (1820)

which is a symmetric orthogonal (reflection) matrix (H^{−1} = H^T = H, §B.5.3). Vector u is normal to an N−1-dimensional subspace u^⊥ through which this particular H effects reflection; e.g, Hu^⊥ = u^⊥ while Hu = −u.
Matrix H has N−1 orthonormal eigenvectors spanning that reflecting subspace u^⊥ with corresponding eigenvalues equal to 1. The remaining eigenvector u has corresponding eigenvalue −1; so

    det H = −1        (1821)

Due to symmetry of H, the matrix 2-norm (the spectral norm) is equal to the largest eigenvalue-magnitude. A Householder matrix is thus characterized,

    H = H^T ,   H^{−1} = H^T ,   ‖H‖_2 = 1 ,   H ⋡ 0        (1822)

For example, the permutation matrix

    Ξ = [ 1 0 0 ; 0 0 1 ; 0 1 0 ]        (1823)

is a Householder matrix having u = [ 0  1  −1 ]^T/√2. Not all permutation matrices are Householder matrices, although all permutation matrices are orthogonal matrices (§B.5.2, Ξ^T Ξ = I) [374, §3.4] because they are made by permuting rows and columns of the Identity matrix. Neither are all symmetric permutation matrices Householder matrices; e.g,

    Ξ = [ 0 0 0 1 ; 0 0 1 0 ; 0 1 0 0 ; 1 0 0 0 ]        (1920)

is not a Householder matrix.
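A short Matlab sketch of (1820)-(1823), assuming only base functions:

% Householder matrix: symmetric, orthogonal, reflects u to -u, det = -1.
u = [0; 1; -1]/sqrt(2);
H = eye(3) - 2*(u*u')/(u'*u);
H                          % the permutation matrix (1823)
norm(H - H')               % 0   : symmetric
norm(H'*H - eye(3))        % ~0  : orthogonal
H*u + u                    % ~0  : H*u = -u
det(H)                     % -1  : (1821)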

B.4 Auxiliary V-matrices

B.4.1 Auxiliary projector matrix V

It is convenient to define a matrix V that arises naturally as a consequence of translating geometric center α_c (§5.5.1.0.1) of some list X to the origin. In place of X − α_c 1^T we may write XV as in (1135) where

    V = I − (1/N) 11^T ∈ S^N        (1071)

is an elementary matrix called the geometric centering matrix.
Any elementary matrix in R^{N×N} has N−1 eigenvalues equal to 1. For the particular elementary matrix V, the N th eigenvalue equals 0. The number of 0 eigenvalues must equal dim N(V) = 1, by the 0 eigenvalues theorem (§A.7.3.0.1), because V = V^T is diagonalizable. Because

    V 1 = 0        (1824)

the nullspace N(V) = R(1) is spanned by the eigenvector 1. The remaining eigenvectors span R(V) ≡ 1^⊥ = N(1^T) that has dimension N−1. Because

    V² = V        (1825)

and V^T = V, elementary matrix V is also a projection matrix (§E.3) projecting orthogonally on its range N(1^T) which is a hyperplane containing the origin in R^N

    V = I − 1(1^T 1)^{−1} 1^T        (1826)

The {0, 1} eigenvalues also indicate that diagonalizable V is a projection matrix. [463, §4.1 thm.4.1] Symmetry of V denotes orthogonal projection; from (2120),

    V² = V ,   V^T = V ,   V† = V ,   ‖V‖_2 = 1 ,   V ⪰ 0        (1827)

Matrix V is also circulant [197].
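A numerical check of (1071) and (1824)-(1827); a sketch for N = 5 with a random list X:

% Geometric centering matrix: symmetric orthogonal projector onto N(1^T).
N = 5;
V = eye(N) - ones(N)/N;
norm(V*ones(N,1))          % ~0 : V*1 = 0, (1824)
norm(V*V - V)              % ~0 : idempotent, (1825)
norm(pinv(V) - V)          % ~0 : V is its own pseudoinverse, (1827)
norm(V,2)                  % 1
X = randn(3,N);
norm(X*V - (X - mean(X,2)*ones(1,N)))   % ~0 : XV translates the geometric center to the origin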

B.4.1.0.1 Example. Relationship of Auxiliary to Householder matrix.
Let H ∈ S^N be a Householder matrix (1820) defined by

    u = [ 1 ; ... ; 1 ; 1 + √N ] ∈ R^N        (1828)

Then we have [180, §2]

    V = H [ I  0 ; 0^T  0 ] H        (1829)

Let D ∈ S^N_h and define

    −HDH ≜ [ A  b ; b^T  c ]        (1830)

where b is a vector. Then because H is nonsingular (§A.3.1.0.5) [215, §3]

    −VDV = H [ A  0 ; 0^T  0 ] H ⪰ 0 ⇔ A ⪰ 0        (1831)

and affine dimension is r = rank A when D is a Euclidean distance matrix.    □

B.4.2 Schoenberg auxiliary matrix V_N

1. V_N = (1/√2) [ −1^T ; I ] ∈ R^{N×(N−1)}        (1055)

2. V_N^T 1 = 0

3. I − e_1 1^T = [ 0  √2 V_N ]

4. [ 0  √2 V_N ] V_N = V_N

5. [ 0  √2 V_N ] V = V

6. V [ 0  √2 V_N ] = [ 0  √2 V_N ]

7. [ 0  √2 V_N ][ 0  √2 V_N ] = [ 0  √2 V_N ]

8. [ 0  √2 V_N ]† = [ 0  0^T ; 0  I ] V

9. [ 0  √2 V_N ]† V = [ 0  √2 V_N ]†

10. [ 0  √2 V_N ][ 0  √2 V_N ]† = V

11. [ 0  √2 V_N ]† [ 0  √2 V_N ] = [ 0  0^T ; 0  I ]

12. [ 0  √2 V_N ][ 0  0^T ; 0  I ] = [ 0  √2 V_N ]

13. [ 0  0^T ; 0  I ][ 0  √2 V_N ] = [ 0  0^T ; 0  I ]

14. [ V_N  (1/√2)1 ]^{−1} = [ V_N† ; (√2/N) 1^T ]

15. V_N† = √2 [ −(1/N)1    I − (1/N)11^T ] ∈ R^{(N−1)×N} ,   I − (1/N)11^T ∈ S^{N−1}

16. V_N† 1 = 0

17. V_N† V_N = I ,   V_N^T V_N = (1/2)(I + 11^T) ∈ S^{N−1}

18. V^T = V = V_N V_N† = I − (1/N)11^T ∈ S^N

19. −V_N† (11^T − I) V_N = I ,   11^T − I ∈ EDM^N

20. D = [d_ij] ∈ S^N_h        (1073)
    tr(−VDV) = tr(−VD) = tr(−V_N† D V_N) = (1/N) 1^T D 1 = (1/N) tr(11^T D) = (1/N) Σ_{i,j} d_ij

Any elementary matrix E ∈ S^N of the particular form

    E = k_1 I − k_2 11^T        (1832)

where k_1, k_2 ∈ R, will make tr(−ED) proportional to Σ d_ij.^B.8

B.8 If k_1 is 1 − ρ while k_2 equals −ρ, then all eigenvalues of E for −1/(N−1) < ρ < 1 are guaranteed positive and therefore E is guaranteed positive definite. [343]

21. D = [d_ij] ∈ S^N
    tr(−VDV) = (1/N) Σ_{i≠j} d_ij − ((N−1)/N) Σ_i d_ii = (1/N) 1^T D 1 − tr D

22. D = [d_ij] ∈ S^N_h
    tr(−V_N^T D V_N) = Σ_j d_1j

23. For Y ∈ S^N
    V (Y − δ(Y 1)) V = Y − δ(Y 1)

B.4.3 Orthonormal auxiliary matrix V_W

Thin matrix

    V_W ≜ [ −1^T/√N ; I − 11^T/(N+√N) ] ∈ R^{N×(N−1)}        (1833)

(each entry of its first row is −1/√N; its remaining rows hold 1 − 1/(N+√N) on the diagonal and −1/(N+√N) off the diagonal) has R(V_W) = N(1^T) and orthonormal columns. [7] We defined three auxiliary V-matrices: V, V_N (1055), and V_W, sharing some attributes listed in Table B.4.4. For example, V can be expressed

    V = V_W V_W^T = V_N V_N†        (1834)

but V_W^T V_W = I means V is an orthogonal projector (2117) and

    V_W† = V_W^T ,   ‖V_W‖_2 = 1 ,   V_W^T 1 = 0        (1835)

B.4.4 Auxiliary V-matrix Table

          dim V         rank V    R(V)      N(V^T)    V^T V               V V^T                          V V†

  V       N × N         N−1       N(1^T)    R(1)      V                   V                              V

  V_N     N × (N−1)     N−1       N(1^T)    R(1)      (1/2)(I + 11^T)     (1/2)[ N−1  −1^T ; −1  I ]     V

  V_W     N × (N−1)     N−1       N(1^T)    R(1)      I                   V                              V
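A numerical check of several of these identities; a sketch for N = 4:

% Auxiliary V-matrices: V (centering), V_N (Schoenberg), V_W (orthonormal).
N = 4;
V  = eye(N) - ones(N)/N;
VN = [-ones(1,N-1); eye(N-1)]/sqrt(2);
VW = [-ones(1,N-1)/sqrt(N); eye(N-1) - ones(N-1)/(N+sqrt(N))];
norm(VW'*VW - eye(N-1))                     % ~0 : orthonormal columns, (1835)
norm(V - VW*VW')                            % ~0 : (1834)
norm(V - VN*pinv(VN))                       % ~0 : (1834)
norm(VN'*VN - (eye(N-1)+ones(N-1))/2)       % ~0 : property 17
[norm(V'*ones(N,1)) norm(VN'*ones(N,1)) norm(VW'*ones(N,1))]   % ~0 : N(V^T) = R(1), as in the table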

B.4.5 More auxiliary matrices

Mathar shows [295, §2] that any elementary matrix (§B.3) of the form

    V_S = I − b 1^T ∈ R^{N×N}        (1836)

such that b^T 1 = 1 (confer [190, §2]), is an auxiliary V-matrix having

    R(V_S^T) = N(b^T) ,   R(V_S) = N(1^T)
                                                    (1837)
    N(V_S) = R(b) ,       N(V_S^T) = R(1)

Given X ∈ R^{n×N}, the choice b = (1/N)1 (V_S = V) minimizes ‖X(I − b 1^T)‖_F. [192, §3.2.1]

B.5 Orthomatrices

B.5.1 Orthonormal matrix

Property Q^T Q = I completely defines orthonormal matrix Q ∈ R^{n×k} (k ≤ n); a full-rank thin-or-square matrix characterized by nonexpansivity (2121)

    ‖Q^T x‖_2 ≤ ‖x‖_2  ∀x ∈ R^n ,   ‖Qy‖_2 = ‖y‖_2  ∀y ∈ R^k        (1838)

and preservation of vector inner-product

    ⟨Qy, Qz⟩ = ⟨y, z⟩        (1839)
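A quick numerical sketch of (1838) and (1839); it assumes Matlab's orth() to produce a thin orthonormal matrix:

% Orthonormal matrix: Q'*Q = I; norms and inner products are preserved.
Q = orth(randn(5,3));                % 5x3 with orthonormal columns
x = randn(5,1);  y = randn(3,1);  z = randn(3,1);
norm(Q'*Q - eye(3))                  % ~0
norm(Q*y) - norm(y)                  % ~0 : isometry on R^k, (1838)
norm(x) - norm(Q'*x)                 % >= 0 : nonexpansive, (1838)
(Q*y)'*(Q*z) - y'*z                  % ~0 : (1839)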

B.5.2 Orthogonal matrix & vector rotation

An orthogonal matrix is a square orthonormal matrix. Property Q^{−1} = Q^T completely

defines orthogonal matrix Q ∈ R^{n×n} employed to effect vector rotation; [374, §2.6, §3.4]

∈ n § [376, §6.5] [233, 2.1] for any x R ∈ Qx = x (1840) k k2 k k2 In other words, the 2-norm is orthogonally . Any antisymmetric matrix constructs an orthogonal matrix; id est, for A = AT − 1 Q = (I + A)− (I A) (1841) − A is a complex generalization of orthogonal matrix; conjugate 1 H B.9 defines it: U − = U . An orthogonal matrix is simply a real unitary matrix. Orthogonal matrix Q is a further characterized by norm:

    Q^{−1} = Q^T ,   ‖Q‖_2 = 1 ,   ‖Q‖_F² = n        (1842)

Applying this characterization to Q^T, we see that it too is an orthogonal matrix because transpose inverse equals inverse transpose. Hence the rows and columns of Q each

respectively form an orthonormal set. Normalcy guarantees diagonalization (§A.5.1.0.1). So, for Q ≜ SΛS^H,

    SΛ^{−1}S^H = S^*ΛS^T = SΛ^*S^H ,   ‖δ(Λ)‖_∞ = 1 ,   1^T |δ(Λ)| = n        (1843)

characterizes an orthogonal matrix in terms of eigenvalues and eigenvectors. All permutation matrices Ξ, for example, are nonnegative orthogonal matrices; and vice versa. Product or transpose of any permutation matrices remains a permutator. Any product of permutation matrix with orthogonal matrix remains orthogonal. In fact, any product AQ of orthogonal matrices A and Q remains orthogonal by definition. Given any other dimensionally compatible orthogonal matrix U, the mapping g(A) = U^T A Q is a bijection on the domain of orthogonal matrices (a nonconvex manifold of dimension n(n−1)/2 [56]). [273, §2.1] [274]

B.9 Orthogonal and unitary matrices are called unitary linear operators.
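A numerical sketch of the Cayley construction (1841); it assumes a random antisymmetric A:

% Antisymmetric A gives orthogonal Q = inv(I+A)*(I-A).
n = 4;
B = randn(n);
A = B - B';                       % A = -A'
Q = (eye(n) + A)\(eye(n) - A);
norm(Q'*Q - eye(n))               % ~0 : orthogonal
abs(eig(Q))'                      % every eigenvalue has magnitude 1, (1845)
det(Q)                            % det Q = +-1 in general; the Cayley transform yields +1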

The largest magnitude entry of an orthogonal matrix is 1; for each and every j ∈ 1 ... n

    ‖Q(j, :)‖_∞ ≤ 1 ,   ‖Q(:, j)‖_∞ ≤ 1        (1844)

Each and every eigenvalue of a (real) orthogonal matrix has magnitude 1 (Λ^{−1} = Λ^*)

    λ(Q) ∈ C^n ,   |λ(Q)| = 1        (1845)

but only the Identity matrix can be simultaneously orthogonal and positive definite. Orthogonal matrices have complex eigenvalues in conjugate pairs: so det Q = ±1.

B.5.3 Reflection

A matrix for pointwise reflection is defined by imposing symmetry upon the orthogonal matrix; id est, a reflection matrix is completely defined by Q^{−1} = Q^T = Q. The reflection matrix is a symmetric orthogonal matrix, and vice versa, characterized:

    Q = Q^T ,   Q^{−1} = Q^T ,   ‖Q‖_2 = 1        (1846)

The Householder matrix (§B.3.1) is an example of symmetric orthogonal (reflection) matrix. Reflection matrices have eigenvalues equal to ±1 and so det Q = ±1. It is natural to expect a relationship between reflection and projection matrices because all projection matrices have eigenvalues belonging to {0, 1}. In fact, any reflection matrix Q is related to some orthogonal projector P by [235, §1 prob.44]

    Q = I − 2P        (1847)

Yet P is, generally, neither orthogonal nor invertible. (§E.3.2)

    λ(Q) ∈ R^n ,   |λ(Q)| = 1        (1848)

Whereas P connotes projection on R(P), here Q connotes reflection with respect to R(P)^⊥.
Matrix I − 2(I − P) = 2P − I represents antireflection; id est, reflection about R(P).
Every orthogonal matrix can be expressed as the product of a rotation and a reflection. The collection of all orthogonal matrices of particular dimension forms a nonconvex set; topologically, it is instead referred to as a manifold.

B.5.3.0.1 Example. Pythagorean sum by antireflection in R².
Figure 186 illustrates a process for determining magnitude of vector p_0 = [ x y ]^T ∈ R²_+. The given point p_0 is assumed to be a member of quadrant I with y ≤ x.^B.10 The idea is to rotate p_0 into alignment with the x axis; its length then becomes equivalent to the rotated x coordinate. Rotation is accomplished by iteration (i index):
First, p_0 is projected on the x axis; [ x 0 ]^T. Vector u_0 = [ x y/2 ]^T is a bisector of difference p_0 − [ x 0 ]^T, translated to projection [ x 0 ]^T, and is the perpendicular bisector of chord p_0 p_1. Point p_1 is the antireflection of p_0;

    p_1 = ( 2 u_0 u_0^T/(u_0^T u_0) − I ) p_0        (1849)

B.10 p_0 belongs to the monotone nonnegative cone K_{M+} ⊂ R²_+.

Figure 186: [145] First two of three iterations required, in absence of square root function, to calculate √(x² + y²) to 20 significant digits (presuming infinite precision). Point p_0 = [ x ; y ] is given. u_0 = [ x ; y/2 ] is a bisector of difference [ 0 y ]^T, translated to [ x 0 ]^T, and is the perpendicular bisector of chord p_0 p_1; similarly, for u_1. [255, §16/4] Assuming that Moler & Morrison condition [302] |y| ≤ |x| holds, iterate p_3 (not shown) is always on quarter circle close to x axis. (p_0 violates Moler & Morrison condition, for illustration. Cartesian axes not drawn.)

In the second iteration, Figure 186 illustrates the operation on p_1 and construction of p_2. Because p_2 is the antireflection of p_1, each must be equidistant from the perpendicular bisector through u_1 of chord p_1 p_2.
A third iteration completes the process. Say p_i = [ x_i y_i ]^T with [ x_0 y_0 ]^T ≜ [ x y ]^T the given point. It is known that the particular selection of bisector u_i = [ x_i y_i/2 ]^T diminishes total required number of iterations to three for 20 significant digits ([302] presuming exact arithmetic). The entire sequence of operations is illustrated programmatically by this Matlab subroutine to calculate magnitude of any vector in R² by antireflection:

function z = pythag2(x,y) %z(1)=sqrt(x^2+y^2), z(2)=y coordinate of p3
if ~[x;y]
   z = [x;y];
else
   z = sort(abs([x;y]),'descend');
   for i=1:3
      u = [z(1); z(2)/2];
      z = 2*u*(u'*z)/(u'*u) - z;
   end
end
end

Pythagorean sum sans square root was first published in 1983 by Moler & Morrison [302] who did not describe antireflection.    □

B.5.3.0.2 Exercise. Pythagorean sum: ‖x ∈ R^n‖ sans square root sans sort.
Matlab subroutine pythag2() calculates magnitude of any vector in R². By constraining perpendicular bisector angle ∠u_i ≤ π/4, show that sorting condition |y| ≤ |x| is obviated:

function z = pythag2d(x,y) %z(1)=sqrt(x^2+y^2), z(2)=y coordinate of p4
if ~[x;y]
   z = [x;y];
elseif ~x
   z = [abs(y); 0];
else
   z = abs([x;y]);
   for i=1:4
      u = [z(1); min(z(1), z(2)/2)];
      z = 2*u*(u'*z)/(u'*u) - z;
   end
end
end

Elimination of sort() incurred an extra iteration in R². Antireflection admits simple modifications that enable calculation of Euclidean norm for any vector in R^n. Show that Euclidean norm may be calculated with only a small number of iterations in absence of square root and sorting functions:

function z = pythag3d(x) %z(1)=||x||, z(2:end)=coordinates of p13
if ~x
   z = x;
elseif ~x(1)
   z = [pythag3d(x(2:end)); 0];
else
   z = abs(x);
   for i=1:13
      u = [z(1); min(z(1), z(2:end)/2)];
      z = 2*u*(u'*z)/(u'*u) - z;
   end
end
end

Projection is on the first nonzero coordinate axis, as in §B.5.3.0.1, then antireflection is about a vector ui normal to a hyperplane. Demonstrate that number of required iterations

does not grow linearly with n ; contrary to [302, §3], growth is much slower. H

B.5.4 Rotation of range and rowspace

Given orthogonal matrix Q, column vectors of a matrix X are simultaneously rotated about the origin via product QX. In three dimensions (X ∈ R^{3×N}), precise meaning of rotation is best illustrated in Figure 187 where a gimbal aids visualization of what is

achievable; mathematically, (§5.5.2.0.1)

    Q = [ cos θ  0  −sin θ ; 0  1  0 ; sin θ  0  cos θ ] [ 1  0  0 ; 0  cos ψ  −sin ψ ; 0  sin ψ  cos ψ ] [ cos φ  −sin φ  0 ; sin φ  cos φ  0 ; 0  0  1 ]        (1850)
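A numerical sketch of (1850), with arbitrarily chosen angles:

% Product of three elementary rotations is orthogonal with det = +1.
theta = 0.3;  psi = -1.1;  phi = 2.0;
Qy = [cos(theta) 0 -sin(theta); 0 1 0; sin(theta) 0 cos(theta)];
Qx = [1 0 0; 0 cos(psi) -sin(psi); 0 sin(psi) cos(psi)];
Qz = [cos(phi) -sin(phi) 0; sin(phi) cos(phi) 0; 0 0 1];
Q = Qy*Qx*Qz;
norm(Q'*Q - eye(3))                 % ~0 : orthogonal
det(Q)                              % 1  : pure rotation
x = randn(3,1);
norm(Q*x) - norm(x)                 % ~0 : lengths preserved, (1840)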

Figure 187: Gimbal: a mechanism imparting three degrees of dimensional freedom to a Euclidean body suspended at its center. Each ring is free to rotate about one axis. (Drawing by courtesy of The MathWorks Inc.)

B.5.4.0.1 Example. One axis of revolution.
Partition n+1-dimensional Euclidean space R^{n+1} ≜ [ R^n ; R ], and define an n-dimensional subspace

    R ≜ { λ ∈ R^{n+1} | 1^T λ = 0 }        (1851)

(a hyperplane through the origin). We want an orthogonal matrix that rotates a list in the columns of matrix X ∈ R^{(n+1)×N} through the dihedral angle between R^n and R (§2.4.3)

    ∠(R^n, R) = arccos( ⟨e_{n+1}, 1⟩ / (‖e_{n+1}‖ ‖1‖) ) = arccos( 1/√(n+1) )  radians        (1852)

The vertex-description of the nonnegative orthant in R^{n+1} is

    { [ e_1 e_2 ··· e_{n+1} ] a | a ⪰ 0 } = { a ⪰ 0 } = R^{n+1}_+ ⊂ R^{n+1}        (1853)

Consider rotation of these vertices via orthogonal matrix

    Q ≜ [ (1/√(n+1)) 1   Ξ V_W ] Ξ ∈ R^{(n+1)×(n+1)}        (1854)

where permutation matrix Ξ ∈ S^{n+1} is defined in (1920) and where V_W ∈ R^{(n+1)×n} is the orthonormal auxiliary matrix defined in §B.4.3. This particular orthogonal matrix is selected because it rotates any point in subspace R^n about one axis of revolution onto R; e.g, rotation Q e_{n+1} aligns the last standard basis vector with subspace normal R^⊥ = 1. The rotated vectors remaining are orthonormal spanning R.    □

Another interpretation of product QX is rotation/reflection of R(X). Rotation of X as in QXQ^T is a simultaneous rotation/reflection of range and rowspace.^B.11

Proof. Any matrix can be expressed as a singular value decomposition X = UΣW^T (1734) where δ²(Σ) = Σ, R(U) ⊇ R(X), and R(W) ⊇ R(X^T).    ¨

B.11 Product Q^T A Q can be regarded as coordinate transformation; e.g, given y = Ax : R^n → R^n and orthogonal Q, the transformation Qy = AQx is a rotation/reflection of range and rowspace (145) of matrix A where Qy ∈ R(A) and Qx ∈ R(A^T) (146).

Figure 188: 10 × 10 arrow matrix. Twenty eight nonzero (nz) entries indicated.

B.5.5 Matrix rotation

Orthogonal matrices are also employed to rotate/reflect other matrices like vectors: [185, §12.4.1] Given orthogonal matrix Q, the product Q^T A will rotate A ∈ R^{n×n} in the Euclidean sense in R^{n²} because Frobenius' norm is orthogonally invariant (§2.2.1);

    ‖Q^T A‖_F = √( tr(A^T Q Q^T A) ) = ‖A‖_F        (1855)

(likewise for AQ). Were A symmetric, such a rotation would depart from S^n. One remedy is to instead form product Q^T A Q because

    ‖Q^T A Q‖_F = √( tr(Q^T A^T Q Q^T A Q) ) = ‖A‖_F        (1856)

By §A.1.1 no.33,

    vec(Q^T A Q) = (Q ⊗ Q)^T vec A        (1857)

which is a rotation of the vectorized A matrix because Kronecker product of any orthogonal

matrices remains orthogonal; e.g, by §A.1.1 no.43,

    (Q ⊗ Q)^T (Q ⊗ Q) = I        (1858)

Matrix A is orthogonally equivalent to B if B = S^T A S for some orthogonal matrix S. Every square matrix, for example, is orthogonally equivalent to a matrix having equal

entries along the main diagonal. [233, §2.2, prob.3]
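A numerical sketch of (1855)-(1858); it assumes a random A and an orthogonal Q obtained from a QR factorization:

% Frobenius norm is orthogonally invariant; vec(Q'*A*Q) = kron(Q,Q)'*vec(A).
n = 4;
A = randn(n);
[Q,~] = qr(randn(n));                         % random orthogonal matrix
norm(Q'*A,'fro') - norm(A,'fro')              % ~0 : (1855)
norm(Q'*A*Q,'fro') - norm(A,'fro')            % ~0 : (1856)
norm(reshape(Q'*A*Q,[],1) - kron(Q,Q)'*A(:))  % ~0 : (1857)
norm(kron(Q,Q)'*kron(Q,Q) - eye(n^2))         % ~0 : (1858)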

B.6 Arrow matrix

Consider a partitioned symmetric n×n-dimensional matrix A that has arrow [330] (or arrowhead [373]) form constituted by vectors a, b ∈ R^{n−1} and real scalar c :

    A ≜ [ δ(a)  b ; b^T  c ] ∈ S^n        (1859)

Figure 188 illustrates sparsity pattern of an arrow matrix. Embedding of δ(a) makes relative sparsity increasing with dimension. Because an arrow matrix is a kind of bordered matrix, eigenvalues of δ(a) and A are interlaced;

    λ_n ≤ (Ξ^T a)_{n−1} ≤ λ_{n−1} ≤ (Ξ^T a)_{n−2} ≤ ··· ≤ (Ξ^T a)_1 ≤ λ_1        (1860)

[374, §6.4] [233, §4.3] [370, §IV.4.1] denoting nonincreasingly ordered eigenvalues of A by vector λ ∈ R^n, and those of δ(a) by Ξ^T a ∈ R^{n−1} where Ξ is a permutation matrix arranging a into nonincreasing order: δ(a) = Ξ δ(Ξ^T a) Ξ^T (§A.5.1.2).

B.6.1 positive semidefinite arrow matrix

i) Nonnegative main diagonal a ⪰ 0 insures n−1 nonnegative eigenvalues in (1860).

Positive semidefiniteness is left determined by smallest eigenvalue λn :

    A ⪰ 0 ⇔ a ⪰ 0 ,  b^T( I − δ(a)δ(a)† ) = 0 ,  c − b^T δ(a)† b ≥ 0
                                                                            (1861)
          ⇔ c ≥ 0 ,  b(1 − c c†) = 0 ,  δ(a) − c† b b^T ⪰ 0

Schur complement condition (§A.4) b^T( I − δ(a)δ(a)† ) = 0 is most simply a requirement for

ii) a zero entry in vector b wherever there is a corresponding zero entry in vector a.

In other words, vector b can reside anywhere in a Cartesian subspace of R^{n−1} that is determined solely by indices of the nonzero entries in vector a.

iii) c ≥ b^T δ(a)† b provides a tight lower bound for scalar c.

As shown in §3.5.1, b^T δ(a)† b is simultaneously convex in vectors a and b.
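A numerical sketch of conditions i)-iii); it assumes a small arrow matrix with one zero diagonal entry:

% PSD arrow matrix (1859): A = [diag(a) b; b' c] is PSD iff a >= 0,
% b vanishes wherever a does, and c >= b'*pinv(diag(a))*b.
a = [2; 0; 1];
b = [1; 0; -1];              % zero where a is zero, condition ii)
c = b'*pinv(diag(a))*b;      % tight lower bound, condition iii)
A = [diag(a) b; b' c];
min(eig(A))                  % ~0 : PSD, on the boundary
A(4,4) = c - 0.1;            % violate the bound on c
min(eig(A))                  % < 0 : no longer PSD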