
Chapter 3

Adjoint Representations and the Derivative of exp

3.1 The Adjoint Representations Ad and ad

Given any two vector spaces $E$ and $F$, recall that the vector space of all linear maps from $E$ to $F$ is denoted by $\mathrm{Hom}(E, F)$.

The set of all invertible linear maps from $E$ to itself is a group (under composition) denoted $\mathrm{GL}(E)$.

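When $E = \mathbb{R}^n$, we often denote $\mathrm{GL}(\mathbb{R}^n)$ by $\mathrm{GL}(n, \mathbb{R})$ (and if $E = \mathbb{C}^n$, we often denote $\mathrm{GL}(\mathbb{C}^n)$ by $\mathrm{GL}(n, \mathbb{C})$).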

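The vector space $\mathrm{M}_n(\mathbb{R})$ of all $n \times n$ matrices is also denoted by $\mathfrak{gl}(n, \mathbb{R})$ (and $\mathrm{M}_n(\mathbb{C})$ by $\mathfrak{gl}(n, \mathbb{C})$).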

Then, $\mathrm{GL}(\mathfrak{gl}(n, \mathbb{R}))$ is the group of all invertible linear maps from $\mathfrak{gl}(n, \mathbb{R}) = \mathrm{M}_n(\mathbb{R})$ to itself.

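For any matrix $A \in \mathrm{M}_n(\mathbb{R})$ (or $A \in \mathrm{M}_n(\mathbb{C})$), define the maps $L_A \colon \mathrm{M}_n(\mathbb{R}) \to \mathrm{M}_n(\mathbb{R})$ and $R_A \colon \mathrm{M}_n(\mathbb{R}) \to \mathrm{M}_n(\mathbb{R})$ by

$$L_A(B) = AB, \qquad R_A(B) = BA, \qquad \text{for all } B \in \mathrm{M}_n(\mathbb{R}).$$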

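Observe that $L_A \circ R_B = R_B \circ L_A$ for all $A, B \in \mathrm{M}_n(\mathbb{R})$, since $A(CB) = (AC)B$ for every $C \in \mathrm{M}_n(\mathbb{R})$.

For any matrix $A \in \mathrm{GL}(n, \mathbb{R})$, let

$$\mathrm{Ad}_A \colon \mathrm{M}_n(\mathbb{R}) \to \mathrm{M}_n(\mathbb{R}) \qquad \text{(conjugation by } A\text{)}$$

be given by

$$\mathrm{Ad}_A(B) = ABA^{-1} \qquad \text{for all } B \in \mathrm{M}_n(\mathbb{R}).$$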

Observe that $\mathrm{Ad}_A = L_A \circ R_{A^{-1}}$ and that $\mathrm{Ad}_A$ is an invertible linear map with inverse $\mathrm{Ad}_{A^{-1}}$.

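The restriction of $\mathrm{Ad}_A$ to invertible matrices $B \in \mathrm{GL}(n, \mathbb{R})$ yields the map

$$\mathrm{Ad}_A \colon \mathrm{GL}(n, \mathbb{R}) \to \mathrm{GL}(n, \mathbb{R})$$

also given by

$$\mathrm{Ad}_A(B) = ABA^{-1} \qquad \text{for all } B \in \mathrm{GL}(n, \mathbb{R}).$$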

This time, observe that $\mathrm{Ad}_A$ is a group homomorphism (with respect to multiplication), since

$$\mathrm{Ad}_A(BC) = ABCA^{-1} = ABA^{-1}ACA^{-1} = \mathrm{Ad}_A(B)\,\mathrm{Ad}_A(C).$$

In fact, $\mathrm{Ad}_A$ is a group isomorphism (because its inverse is $\mathrm{Ad}_{A^{-1}}$).
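The identities above are easy to sanity-check numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `conj` and the sample matrices are illustrative choices, not part of the text.

```python
import numpy as np

# Three explicit invertible 3x3 matrices (determinants 3, 1 and 2).
A = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 0.0], [3.0, 0.0, 1.0]])
C = np.array([[1.0, 0.0, 1.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]])

def conj(A, B):
    """Conjugation Ad_A(B) = A B A^{-1} (hypothetical helper name)."""
    return A @ B @ np.linalg.inv(A)

# Group homomorphism: Ad_A(BC) = Ad_A(B) Ad_A(C).
hom_ok = np.allclose(conj(A, B @ C), conj(A, B) @ conj(A, C))
# Ad_{A^{-1}} is the inverse of Ad_A.
inv_ok = np.allclose(conj(np.linalg.inv(A), conj(A, B)), B)
# L_A and R_B commute: L_A(R_B(C)) = A(CB) equals R_B(L_A(C)) = (AC)B.
comm_ok = np.allclose(A @ (C @ B), (A @ C) @ B)

print(hom_ok, inv_ok, comm_ok)
```

A floating-point check on a few matrices is of course not a proof; it only guards against sign or ordering mistakes when implementing these maps.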

Beware that $\mathrm{Ad}_A$ is not a linear map on $\mathrm{GL}(n, \mathbb{R})$, because $\mathrm{GL}(n, \mathbb{R})$ is not a vector space!

However, $\mathrm{GL}(n, \mathbb{R})$ is an open subset of $\mathrm{M}_n(\mathbb{R})$, because it is the complement of the set of singular matrices

$$\{A \in \mathrm{M}_n(\mathbb{R}) \mid \det(A) = 0\},$$

a closed set, since it is the inverse image of the closed set $\{0\}$ by the determinant function, which is continuous.

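Since $\mathrm{GL}(n, \mathbb{R})$ is an open subset of $\mathrm{M}_n(\mathbb{R})$, for every $B \in \mathrm{GL}(n, \mathbb{R})$, there is an open ball $B(B, \eta) \subseteq \mathrm{GL}(n, \mathbb{R})$ such that $B + X \in B(B, \eta)$ for all $X \in \mathrm{M}_n(\mathbb{R})$ with $\|X\| < \eta$, so $\mathrm{Ad}_A(B + X)$ is well defined and

$$\mathrm{Ad}_A(B + X) - \mathrm{Ad}_A(B) = A(B + X)A^{-1} - ABA^{-1} = AXA^{-1}.$$

The difference is exactly linear in $X$, which shows that $d(\mathrm{Ad}_A)_B$ exists and is given by

$$d(\mathrm{Ad}_A)_B(X) = AXA^{-1}, \qquad \text{for all } X \in \mathrm{M}_n(\mathbb{R}).$$

In particular, for $B = I$, we see that the derivative $d(\mathrm{Ad}_A)_I$ of $\mathrm{Ad}_A$ at $I$ is a linear map of $\mathfrak{gl}(n, \mathbb{R}) = \mathrm{M}_n(\mathbb{R})$ denoted by $\mathrm{Ad}(A)$ or $\mathrm{Ad}_A$ (or $\mathrm{Ad}\, A$), and given by

$$\mathrm{Ad}_A(X) = AXA^{-1} \qquad \text{for all } X \in \mathfrak{gl}(n, \mathbb{R}).$$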

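The inverse of $\mathrm{Ad}_A$ is $\mathrm{Ad}_{A^{-1}}$, so $\mathrm{Ad}_A \in \mathrm{GL}(\mathfrak{gl}(n, \mathbb{R}))$. Note that

$$\mathrm{Ad}_{AB} = \mathrm{Ad}_A \circ \mathrm{Ad}_B,$$

so the map $A \mapsto \mathrm{Ad}_A$ is a group homomorphism denoted

$$\mathrm{Ad} \colon \mathrm{GL}(n, \mathbb{R}) \to \mathrm{GL}(\mathfrak{gl}(n, \mathbb{R})).$$

The homomorphism $\mathrm{Ad}$ is called the adjoint representation of $\mathrm{GL}(n, \mathbb{R})$.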
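Since $\mathrm{Ad}_A$ is a linear map on the $n^2$-dimensional space $\mathfrak{gl}(n, \mathbb{R})$, it can be written as an $n^2 \times n^2$ matrix. The sketch below assumes NumPy and the column-stacking convention $\mathrm{vec}(AXB) = (B^\top \otimes A)\,\mathrm{vec}(X)$; the names `vec` and `Ad_matrix` are illustrative and not from the text.

```python
import numpy as np

def vec(X):
    """Stack the columns of X into one vector (column-major order)."""
    return X.flatten(order="F")

def Ad_matrix(A):
    """Matrix of Ad_A on column-stacked matrices: (A^{-1})^T kron A."""
    return np.kron(np.linalg.inv(A).T, A)

A = np.array([[2.0, 1.0], [1.0, 1.0]])   # det = 1
B = np.array([[1.0, 2.0], [0.0, 1.0]])   # det = 1
X = np.array([[0.5, -1.0], [3.0, 2.0]])

# The matrix really represents X -> A X A^{-1}.
acts_ok = np.allclose(Ad_matrix(A) @ vec(X), vec(A @ X @ np.linalg.inv(A)))
# Homomorphism property Ad_{AB} = Ad_A o Ad_B.
hom_ok = np.allclose(Ad_matrix(A @ B), Ad_matrix(A) @ Ad_matrix(B))
# Ad_{A^{-1}} is the inverse of Ad_A, so Ad_A lies in GL(gl(n, R)).
inv_ok = np.allclose(Ad_matrix(np.linalg.inv(A)), np.linalg.inv(Ad_matrix(A)))

print(acts_ok, hom_ok, inv_ok)
```

With a row-major flattening the Kronecker factors would appear in the other order, so the convention should be fixed once and used consistently.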

We would also like to compute the derivative $d(\mathrm{Ad})_I$ of $\mathrm{Ad}$ at $I$.

For all $X, Y \in \mathrm{M}_n(\mathbb{R})$, with $\|X\|$ small enough we have $I + X \in \mathrm{GL}(n, \mathbb{R})$, and

$$\mathrm{Ad}_{I+X}(Y) - \mathrm{Ad}_I(Y) - (XY - YX) = (YX^2 - XYX)(I + X)^{-1}.$$

Then, if we let (for $X \neq 0$)

$$\epsilon(X, Y) = \frac{(YX^2 - XYX)(I + X)^{-1}}{\|X\|},$$

we have proved that for $\|X\|$ small enough

$$\mathrm{Ad}_{I+X}(Y) - \mathrm{Ad}_I(Y) = (XY - YX) + \epsilon(X, Y)\,\|X\|,$$

with $\|\epsilon(X, Y)\| \le 2\,\|X\|\,\|Y\|\,\|(I + X)^{-1}\|$ (for a submultiplicative matrix norm), and with $\epsilon(X, Y)$ linear in $Y$.

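Let $\mathrm{ad}_X \colon \mathfrak{gl}(n, \mathbb{R}) \to \mathfrak{gl}(n, \mathbb{R})$ be the linear map given by

$$\mathrm{ad}_X(Y) = XY - YX = [X, Y],$$

and $\mathrm{ad}$ be the linear map

$$\mathrm{ad} \colon \mathfrak{gl}(n, \mathbb{R}) \to \mathrm{Hom}(\mathfrak{gl}(n, \mathbb{R}), \mathfrak{gl}(n, \mathbb{R}))$$

given by

$$\mathrm{ad}(X) = \mathrm{ad}_X.$$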

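We also define $\epsilon_X \colon \mathfrak{gl}(n, \mathbb{R}) \to \mathfrak{gl}(n, \mathbb{R})$ as the linear map given by

$$\epsilon_X(Y) = \epsilon(X, Y).$$

If $\|\epsilon_X\|$ is the operator norm of $\epsilon_X$, we have

$$\|\epsilon_X\| = \max_{\|Y\| = 1} \|\epsilon(X, Y)\| \le 2\,\|X\|\,\|(I + X)^{-1}\|.$$

Then, the equation

$$\mathrm{Ad}_{I+X}(Y) - \mathrm{Ad}_I(Y) = (XY - YX) + \epsilon(X, Y)\,\|X\|,$$

which holds for all $Y$, yields

$$\mathrm{Ad}_{I+X} - \mathrm{Ad}_I = \mathrm{ad}_X + \epsilon_X\,\|X\|,$$

and because $\|\epsilon_X\| \le 2\,\|X\|\,\|(I + X)^{-1}\|$, we have $\lim_{X \to 0} \epsilon_X = 0$, which shows that $d(\mathrm{Ad})_I(X) = \mathrm{ad}_X$; that is,

$$d(\mathrm{Ad})_I = \mathrm{ad}.$$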
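The remainder estimate behind $d(\mathrm{Ad})_I = \mathrm{ad}$ can be observed numerically. The sketch below assumes NumPy and uses the spectral norm as the (submultiplicative) matrix norm; the matrices `X`, `Y` and the step sizes are arbitrary illustrative choices.

```python
import numpy as np

def Ad(A, Y):
    """Ad_A(Y) = A Y A^{-1}."""
    return A @ Y @ np.linalg.inv(A)

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def norm(M):
    """Spectral norm, which is submultiplicative."""
    return np.linalg.norm(M, 2)

X = np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 2.0], [0.5, 0.0, 1.0]])
Y = np.array([[1.0, 0.0, 3.0], [2.0, -1.0, 0.0], [0.0, 1.0, 1.0]])
I = np.eye(3)

remainders, bounds, identity_ok = [], [], []
for t in [1e-1, 1e-2, 1e-3]:
    tX = t * X
    inv = np.linalg.inv(I + tX)
    # Remainder Ad_{I+tX}(Y) - Ad_I(Y) - [tX, Y].
    R = Ad(I + tX, Y) - Y - bracket(tX, Y)
    # Closed form of the remainder: (Y X^2 - X Y X)(I + X)^{-1} with X := tX.
    closed = (Y @ tX @ tX - tX @ Y @ tX) @ inv
    identity_ok.append(np.allclose(R, closed, rtol=1e-6, atol=1e-12))
    remainders.append(norm(R))
    # Bound 2 ||X||^2 ||Y|| ||(I + X)^{-1}|| on the remainder.
    bounds.append(2 * norm(tX) ** 2 * norm(Y) * norm(inv))

for r, b in zip(remainders, bounds):
    print(f"remainder = {r:.3e}, bound = {b:.3e}")
```

The remainder shrinks quadratically in the step size, which is what it means for $\mathrm{ad}_X$ to be the linear part of $X \mapsto \mathrm{Ad}_{I+X}$ at $X = 0$.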

The notation $\mathrm{ad}(X)$ (or $\mathrm{ad}\, X$) is also used instead of $\mathrm{ad}_X$.

The map $\mathrm{ad}$ is a linear map

$$\mathrm{ad} \colon \mathfrak{gl}(n, \mathbb{R}) \to \mathrm{Hom}(\mathfrak{gl}(n, \mathbb{R}), \mathfrak{gl}(n, \mathbb{R}))$$

called the adjoint representation of $\mathfrak{gl}(n, \mathbb{R})$.

One will check that

$$\mathrm{ad}([X, Y]) = \mathrm{ad}(X)\,\mathrm{ad}(Y) - \mathrm{ad}(Y)\,\mathrm{ad}(X) = [\mathrm{ad}(X), \mathrm{ad}(Y)],$$

the Lie bracket on linear maps on $\mathfrak{gl}(n, \mathbb{R})$.

This means that $\mathrm{ad}$ is a Lie algebra homomorphism. It can be checked that this property is equivalent to the following identity known as the Jacobi identity:

$$[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0, \qquad \text{for all } X, Y, Z \in \mathfrak{gl}(n, \mathbb{R}).$$

Note that $\mathrm{ad}_X = L_X - R_X$.

Finally, we state a formula relating $\mathrm{Ad}$ and $\mathrm{ad}$ through the exponential.

Proposition 3.1. For any $X \in \mathrm{M}_n(\mathbb{R}) = \mathfrak{gl}(n, \mathbb{R})$, we have

$$\mathrm{Ad}_{e^X} = e^{\mathrm{ad}_X} = \sum_{k=0}^{\infty} \frac{(\mathrm{ad}_X)^k}{k!};$$

that is,

$$e^X Y e^{-X} = e^{\mathrm{ad}_X} Y = Y + [X, Y] + \frac{1}{2!}[X, [X, Y]] + \frac{1}{3!}[X, [X, [X, Y]]] + \cdots$$

for all $X, Y \in \mathrm{M}_n(\mathbb{R})$.

3.2 The Derivative of exp

It is also possible to find a formula for the derivative $d\exp_A$ of the exponential map at $A$, but this is a bit tricky.

It can be shown that

$$d(\exp)_A = e^A \sum_{k=0}^{\infty} \frac{(-1)^k}{(k+1)!} (\mathrm{ad}_A)^k,$$

so

$$d(\exp)_A(B) = e^A \left( B - \frac{1}{2!}[A, B] + \frac{1}{3!}[A, [A, B]] - \frac{1}{4!}[A, [A, [A, B]]] + \cdots \right).$$

It is customary to write

$$\frac{\mathrm{id} - e^{-\mathrm{ad}_A}}{\mathrm{ad}_A}$$

for the power series

$$\sum_{k=0}^{\infty} \frac{(-1)^k}{(k+1)!} (\mathrm{ad}_A)^k,$$

and the formula for the derivative of exp is usually stated as

$$d(\exp)_A = e^A \left( \frac{\mathrm{id} - e^{-\mathrm{ad}_A}}{\mathrm{ad}_A} \right).$$

This formula for the derivative of the exponential tells us when $d(\exp)_A$ is invertible.

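Indeed, it is easy to see that if the eigenvalues of the matrix $X$ are $\lambda_1, \ldots, \lambda_n$, then the eigenvalues of the matrix

$$\frac{\mathrm{id} - e^{-X}}{X} = \sum_{k=0}^{\infty} \frac{(-1)^k}{(k+1)!} X^k$$

are

$$\frac{1 - e^{-\lambda_j}}{\lambda_j} \ \text{ if } \lambda_j \neq 0, \quad \text{and } 1 \text{ if } \lambda_j = 0.$$

Since $1 - e^{-\lambda_j} = 0$ exactly when $\lambda_j \in 2\pi i\,\mathbb{Z}$, it follows that the matrix $\frac{\mathrm{id} - e^{-X}}{X}$ is invertible iff no $\lambda_j$ is of the form $k2\pi i$ for some $k \in \mathbb{Z} - \{0\}$. Because $e^A$ is always invertible, $d(\exp)_A$ is invertible iff no eigenvalue of $\mathrm{ad}_A$ is of the form $k2\pi i$ for some $k \in \mathbb{Z} - \{0\}$.

However, it can also be shown that if the eigenvalues of $A$ are $\lambda_1, \ldots, \lambda_n$, then the eigenvalues of $\mathrm{ad}_A$ are the $\lambda_i - \lambda_j$, with $1 \le i, j \le n$.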
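The claim about the eigenvalues of $\mathrm{ad}_A$ can be checked on a small example. The sketch below assumes NumPy and represents $\mathrm{ad}_A = L_A - R_A$ as the $n^2 \times n^2$ matrix $I \otimes A - A^\top \otimes I$ acting on column-stacked matrices; the helper name `ad_matrix` and the test matrix are illustrative.

```python
import numpy as np

def ad_matrix(A):
    """Matrix of ad_A = L_A - R_A on column-stacked n x n matrices."""
    n = A.shape[0]
    I = np.eye(n)
    return np.kron(I, A) - np.kron(A.T, I)

# Upper triangular, so the eigenvalues are the diagonal entries 1, 3, -2.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, -2.0]])
X = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0],
              [6.0, 7.0, 8.0]])

# The matrix really represents X -> AX - XA.
acts_ok = np.allclose(ad_matrix(A) @ X.flatten(order="F"),
                      (A @ X - X @ A).flatten(order="F"))

lam = np.linalg.eigvals(A)
expected = (lam[:, None] - lam[None, :]).ravel()   # all lambda_i - lambda_j
computed = np.linalg.eigvals(ad_matrix(A))

# Compare the two lists as sets, up to rounding (multiplicities not checked).
dist = np.abs(computed[:, None] - expected[None, :])
same_set = bool(dist.min(axis=1).max() < 1e-8 and dist.min(axis=0).max() < 1e-8)

print(acts_ok, same_set)
```

Note that $0 = \lambda_i - \lambda_i$ always occurs, with multiplicity at least $n$; this is harmless for invertibility because the series takes the value $1$ at the eigenvalue $0$.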

In conclusion, $d(\exp)_A$ is invertible iff for all $i, j$ we have

$$\lambda_i - \lambda_j \neq k2\pi i, \qquad k \in \mathbb{Z} - \{0\}. \qquad (*)$$
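Condition $(*)$ can be illustrated numerically. The sketch below assumes NumPy, builds the matrix of $d(\exp)_A$ from the truncated power series, compares it with a central finite difference, and then evaluates it at $A = \pi J$, whose eigenvalues $\pm \pi i$ differ by $2\pi i$. The helpers `expm_taylor`, `ad_matrix` and `dexp_matrix` are illustrative; a plain truncated Taylor series is used for the exponential and is only adequate for matrices of small norm.

```python
import numpy as np

def expm_taylor(M, terms=60):
    """Truncated Taylor series for exp(M); only suitable for small-norm M."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

def ad_matrix(A):
    """Matrix of ad_A on column-stacked matrices."""
    n = A.shape[0]
    return np.kron(np.eye(n), A) - np.kron(A.T, np.eye(n))

def dexp_matrix(A, terms=60):
    """Matrix of e^A * sum_k (-1)^k / (k+1)! * (ad_A)^k, series truncated."""
    n = A.shape[0]
    adA = ad_matrix(A)
    series = np.zeros((n * n, n * n))
    power = np.eye(n * n)        # holds (ad_A)^k
    coeff = 1.0                  # holds (-1)^k / (k+1)!
    for k in range(terms):
        series += coeff * power
        power = power @ adA
        coeff *= -1.0 / (k + 2)
    return np.kron(np.eye(n), expm_taylor(A)) @ series   # left-multiply by e^A

# A has eigenvalues 0.2 +/- 0.49i (approximately), so (*) holds.
A = np.array([[0.3, -0.5], [0.5, 0.1]])
B = np.array([[1.0, 2.0], [-1.0, 0.5]])
h = 1e-5
finite_diff = (expm_taylor(A + h * B) - expm_taylor(A - h * B)) / (2 * h)
formula = (dexp_matrix(A) @ B.flatten(order="F")).reshape((2, 2), order="F")
fd_ok = np.allclose(finite_diff, formula, atol=1e-6)

# A_bad = pi * J has eigenvalues +/- pi*i, which violates (*).
J = np.array([[0.0, -1.0], [1.0, 0.0]])
A_bad = np.pi * J
sv_good = np.linalg.svd(dexp_matrix(A), compute_uv=False).min()
sv_bad = np.linalg.svd(dexp_matrix(A_bad), compute_uv=False).min()

print(fd_ok, sv_good, sv_bad)
```

In this example the smallest singular value at $\pi J$ is zero up to rounding, consistent with $d(\exp)_{\pi J}$ being singular, while it stays well away from zero at the first matrix.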

This suggests defining the following subset $\mathcal{E}(n)$ of $\mathrm{M}_n(\mathbb{R})$.

The set $\mathcal{E}(n)$ consists of all matrices $A \in \mathrm{M}_n(\mathbb{R})$ whose eigenvalues $\lambda + i\mu$ ($\lambda, \mu \in \mathbb{R}$) lie in the horizontal strip determined by the condition $-\pi < \mu < \pi$.

Then, the matrices in $\mathcal{E}(n)$ satisfy the condition $(*)$, since the imaginary parts of any two of their eigenvalues differ by less than $2\pi$ in absolute value, so $d(\exp)_A$ is invertible for all $A \in \mathcal{E}(n)$.

By the inverse function theorem, the exponential map is a local diffeomorphism between $\mathcal{E}(n)$ and $\exp(\mathcal{E}(n))$.

Remarkably, more is true: the exponential map is a diffeomorphism between $\mathcal{E}(n)$ and $\exp(\mathcal{E}(n))$ (in particular, it is a bijection).

This takes quite a bit of work to prove. For example, see Mneimné and Testard [40]. We have the following result.

Theorem 3.2. The restriction of the exponential map to $\mathcal{E}(n)$ is a diffeomorphism of $\mathcal{E}(n)$ onto its image $\exp(\mathcal{E}(n))$. Furthermore, $\exp(\mathcal{E}(n))$ consists of all invertible matrices that have no real negative eigenvalues; it is an open subset of $\mathrm{GL}(n, \mathbb{R})$; it contains the open ball $B(I, 1) = \{A \in \mathrm{GL}(n, \mathbb{R}) \mid \|A - I\| < 1\}$, for every matrix norm $\|\ \|$ on $n \times n$ matrices.

Theorem 3.2 has some practical applications because there are algorithms for finding a real log of a matrix with no real negative eigenvalues; for more on applications of Theorem 3.2 to medical imaging, see Chapter ??.
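For matrices in the open ball $B(I, 1)$, a real logarithm can be computed with the classical series $\log(A) = \sum_{k \ge 1} (-1)^{k+1} (A - I)^k / k$. The sketch below assumes NumPy and is only an illustration on a small example; it is not one of the algorithms alluded to above, and the helper names `expm_taylor` and `logm_series` are hypothetical.

```python
import numpy as np

def expm_taylor(M, terms=60):
    """Truncated Taylor series for exp(M); only suitable for small-norm M."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

def logm_series(A, terms=200):
    """Truncated series sum_{k>=1} (-1)^(k+1) (A - I)^k / k, for ||A - I|| < 1."""
    n = A.shape[0]
    E = A - np.eye(n)
    if np.linalg.norm(E, 2) >= 1.0:
        raise ValueError("this sketch only handles matrices with ||A - I|| < 1")
    result = np.zeros((n, n))
    power = np.eye(n)
    for k in range(1, terms + 1):
        power = power @ E                       # (A - I)^k
        result += ((-1) ** (k + 1)) * power / k
    return result

A = np.array([[1.2, 0.3], [-0.1, 0.9]])         # ||A - I|| is about 0.39
L = logm_series(A)

# exp(log(A)) should reproduce A.
roundtrip_ok = np.allclose(expm_taylor(L), A)
# The eigenvalues of the computed log should lie in the strip -pi < Im < pi.
in_strip = bool(np.all(np.abs(np.linalg.eigvals(L).imag) < np.pi))

print(roundtrip_ok, in_strip)
```

The series converges slowly when $\|A - I\|$ is close to $1$ and does not apply outside the ball, so for general matrices with no real negative eigenvalues a dedicated library routine is the more reasonable choice.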