Chapter 3 Adjoint Representations and the Derivative Of
Total Page:16
File Type:pdf, Size:1020Kb
Chapter 3 Adjoint Representations and the Derivative of exp 3.1 The Adjoint Representations Ad and ad Given any two vector spaces E and F ,recallthatthe vector space of all linear maps from E to F is denoted by Hom(E,F). The vector space of all invertible linear maps from E to itself is a group denoted GL(E). When E = Rn,weoftendenoteGL(Rn)byGL(n, R) (and if E = Cn,weoftendenoteGL(Cn)byGL(n, C)). The vector space Mn(R)ofalln n matrices is also ⇥ denoted by gl(n, R)(andMn(C)bygl(n, C)). 121 122 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp Then, GL(gl(n, R)) is the vector space of all invertible linear maps from gl(n, R)=Mn(R)toitself. For any matrix A MA(R)(orA MA(C)), define the 2 2 maps LA :Mn(R) Mn(R)andRA :Mn(R) Mn(R) by ! ! LA(B)=AB, RA(B)=BA, for all B Mn(R). 2 Observe that LA RB = RB LA for all A, B Mn(R). ◦ ◦ 2 For any matrix A GL(n, R), let 2 AdA :Mn(R) Mn(R)(conjugationbyA) ! be given by 1 AdA(B)=ABA− for all B Mn(R). 2 Observe that AdA = LA R 1 and that AdA is an ◦ A− invertible linear map with inverse Ad 1. A− 3.1. ADJOINT REPRESENTATIONS Ad AND ad 123 The restriction of AdA to invertible matrices B GL(n, R) yields the map 2 AdA : GL(n, R) GL(n, R) ! also given by 1 AdA(B)=ABA− for all B GL(n, R). 2 This time, observe that AdA is a group homomorphism (with respect to multiplication), since 1 AdA(BC)=ABCA− 1 1 = ABA− ACA− = AdA(B)AdA(C). In fact, AdA is a group isomorphism (because its inverse is Ad 1). A− 124 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp Beware that AdA is not alinearmaponGL(n, R)be- cause GL(n, R)isnotavectorspace! However, GL(n, R)isanopensubsetofMn(R), because it is the complement of the set of singular matrices A Mn(R) det(A)=0 , { 2 | } aclosedset,sinceitistheinverseimageoftheclosedset 0 by the determinant function, which is continuous. { } 3.1. ADJOINT REPRESENTATIONS Ad AND ad 125 Since GL(n, R)isanopensubsetofMn(R), for every B GL(n, R), there is an open ball B(B,⌘) GL(n, R) 2 ✓ such that B + X B(B,⌘)forallX Mn(R)with X <⌘,soAd (2B + X)iswelldefinedand2 k k A AdA(B + X) AdA(B) − 1 1 1 = A(B + X)A− ABA− = AXA− , − which shows that d(AdA)B exists and is given by 1 d(AdA)B(X)=AXA− , for all X Mn(R). 2 126 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp In particular, for B = I,weseethatthederivative d(AdA)I of AdA at I is a linear map of gl(n, R)=Mn(R) denoted by Ad(A)orAdA (or Ad A), and given by 1 AdA(X)=AXA− for all X gl(n, R). 2 The inverse of AdA is AdA 1,soAdA GL(gl(n, R)). − 2 Note that Ad =Ad Ad , AB A ◦ B so the map A Ad is a group homomorphism denoted 7! A Ad: GL(n, R) GL(gl(n, R)). ! The homomorphism Ad is called the adjoint representa- tion of GL(n, R). We also would like to compute the derivative d(Ad)I of Ad at I. 3.1. ADJOINT REPRESENTATIONS Ad AND ad 127 For all X, Y Mn(R), with X small enough we have 2 k k I + X GL(n, R), and 2 AdI+X(Y ) AdI(Y ) (XY YX) 2 − − 1− =(YX XY X)(I + X)− . − Then, if we let 2 1 (YX XY X)(I + X)− ✏(X, Y )= − , X k k we proved that for X small enough k k Ad (Y ) Ad (Y )=(XY YX)+✏(X, Y ) X , I+X − I − k k 1 with ✏(X, Y ) 2 X Y (I + X)− ,andwith ✏(X, Yk)linearinkY . k kk k 128 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp Let adX : gl(n, R) gl(n, R)bethelinearmapgivenby ! ad (Y )=XY YX =[X, Y ], X − and ad be the linear map ad: gl(n, R) Hom(gl(n, R), gl(n, R)) ! given by ad(X)=adX. We also define ✏X : gl(n, R) gl(n, R)asthelinearmap given by ! ✏X(Y )=✏(X, Y ). If ✏ is the operator norm of ✏ ,wehave k Xk X 1 ✏X =max ✏(X, Y ) 2 X (I + X)− . k k Y =1 k k k k k k 3.1. ADJOINT REPRESENTATIONS Ad AND ad 129 Then, the equation Ad (Y ) Ad (Y )=(XY YX)+✏(X, Y ) X , I+X − I − k k which holds for all Y ,yields Ad Ad =ad + ✏ X , I+X − I X X k k 1 and because ✏ 2 X (I + X)− ,wehave k Xk k k limX 0 ✏X =0,whichshowsthatd(Ad)I(X)=adX; 7! that is, d(Ad)I =ad. The notation ad(X)(oradX)isalsousedinsteadadX. 130 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp The map ad is a linear map ad: gl(n, R) Hom(gl(n, R), gl(n, R)) ! called the adjoint representation of gl(n, R). One will check that ad([X, Y ]) = ad(X)ad(Y ) ad(Y )ad(X) − =[ad(X), ad(Y )], the Lie bracket on linear maps on gl(n, R). 3.1. ADJOINT REPRESENTATIONS Ad AND ad 131 This means that ad is a Lie algebra homomorphism. It can be checked that this property is equivalent to the following identity known as the Jacobi identity: [X, [Y, Z]] + [Z, [X, Y ]] + [Y, [Z, X]] = 0, for all X, Y, Z gl(n, R). 2 Note that ad = L R . X X − X Finally, we prove a formula relating Ad and ad through the exponential. 132 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp Proposition 3.1. For any X Mn(R)=gl(n, R), we have 2 1 (ad )k adX X Ad X = e = ; e k! Xk=0 that is, X X ad e Ye− = e X Y 1 1 = Y +[X, Y ]+ [X, [X, Y ]] + [X, [X, [X, Y ]]] 2! 3! + ··· for all X, Y Mn(R) 2 3.2. THE DERIVATIVE OF exp 133 3.2 The Derivative of exp It is also possible to find a formula for the derivative d expA of the exponential map at A,butthisisabit tricky. It can be shown that 1 ( 1)k d(exp) = eA − (ad )k, A (k +1)! A Xk=0 so 1 1 d(exp) (B)=eA B [A, B]+ [A, [A, B]] A − 2! 3! ✓ 1 [A, [A, [A, B]]] + . −4! ··· ◆ 134 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp It is customary to write ad id e− A − adA for the power series 1 ( 1)k − (ad )k, (k +1)! A Xk=0 and the formula for the derivative of exp is usually stated as ad id e− A d(exp) = eA − . A ad ✓ A ◆ 3.2. THE DERIVATIVE OF exp 135 The formula for the exponential tells us when the deriva- tive d(exp)A is invertible. Indeed, it is easy to see that if the eigenvalues of the ma- trix X are λ1,...,λn,thentheeigenvaluesofthematrix X k id e− 1 ( 1) − = − Xk X (k +1)! Xk=0 are λ 1 e− j − if λj =0,and1ifλj =0. λj 6 X id e− It follows that the matrix −X is invertible i↵no λj if of the form k2⇡i for some k Z 0 ,sod(exp)A is 2 −{ } invertible i↵no eigenvalue of adA is of the form k2⇡i for some k Z 0 . 2 −{ } 136 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp However, it can also be shown that if the eigenvalues of A are λ1,...,λn,thentheeigenvaluesofadA are the λi λj, with 1 i, j n. − In conclusion, d(exp)A is invertible i↵for all i, j we have λi λj = k2⇡i, k Z 0 . ( ) − 6 2 −{ } ⇤ This suggests defining the following subset (n)ofMn(R). E The set (n)consistsofallmatricesA Mn(R)whose E 2 eigenvalue λ + iµ of A (λ, µ R)lieinthehorizontal strip determined by the condition2 ⇡<µ<⇡. − Then, it is clear that the matrices in (n)satisfythe condition ( ), so d(exp) is invertible forE all A (n). ⇤ A 2E By the inverse function theorem, the exponential map is alocaldi↵eomorphismbetween (n)andexp( (n)). E E 3.2. THE DERIVATIVE OF exp 137 Remarkably, more is true: the exponential map is di↵eo- morphism between (n)andexp( (n)) (in particular, it is a bijection). E E This takes quite a bit of work to be proved. For example, see Mnemn´eand Testard [40]. We have the following result. Theorem 3.2. The restriction of the exponential map to (n) is a di↵eomorphism of (n) onto its image exp(E (n)). Furthermore, exp( (nE)) consists of all in- vertibleE matrices that have noE real negative eigenval- ues; it is an open subset of GL(n, R); it contains the open ball B(I,1) = A GL(n, R) A I < 1 , for every matrix norm{ 2 on n n matrices.|k − k } kk ⇥ Theorem 3.2 has some practical applications because there are algorithms for finding a real log of a matrix with no real negative eigenvalues; for more on applications of The- orem 3.2 to medical imaging, see Chapter ??. 138 CHAPTER 3. ADJOINT REPRESENTATIONS AND THE DERIVATIVE OF exp.