The Exponential Map
1
The exponential map
1.1 Vector fields and one-parameter groups of linear transformations1
A linear transformation X of the n-dimensional real or complex vector space E can be thought of as a vector field: X associates to each point p in E, the vector X(p). As an aid to one’s imagination one may thinkof a fluid in motion, so that the velocity of the fluid particles passing through the point p is always X(p); the vector field is then the current of the flow and the paths of the fluid particles are the trajectories. Of course, this type of flow is very special, in that the current X(p)atp is independent of time (stationary) and depends linearly on p. Figure 1 shows the trajectories of such flows in the plane. (We shall see later, that in appropriate linear coordinates every plane flow corresponds to exactly one of these pictures.) Consider the trajectory of the fluid particle that passes a certain point p0 at time τ = τ0.Ifp(τ) is its position at time τ, then its velocity at this time is p (τ)=dp(τ)/dτ. Since the current at p(τ) is supposed to be X(p(τ)), one finds that p(τ) satisfies the differential equation
p (τ)=X(p(τ)) (1) together with the initial condition
p(0) = p0. (2)
The solution of (1) and (2) may be obtained as follows. Try for p(τ)apower series in τ: ∞ k p(τ)= τ pk k=0
1Refer to the appendix to §1.1 for an explanation of the notation as necessary.
1 2 1. The exponential map
with coefficients pk ∈ E to be determined. Ignoring convergence question for the moment, eqn (1) becomes ∞ ∞ k−1 k k kτ pk = τ X (pk), k=1 k=0 which leads to 1 p = X(p ). k+1 k +1 k
These equations determine pk inductively in terms of p0: 1 p = X(p ). k k! 0 We write the solution p(τ) thus found in the form
p(τ) = exp(τX)p0, (3) where the exponential of a matrix X is defined by ∞ 1 exp X = Xk. k! k=0 This exponential function is a fundamental concept, which underlies just about everything to follow. First of all, the exponential series converges in norm for all matrices X, and (3) represents a genuine solution of (1) and (2). (See the appendix to this section.) We shall see momentarily that it is in fact the only differentiable solution, but first we shall prove some basic properties of the matrix exponential: Proposition 1. (a) For any matrix X, d exp τX = X exp(τX) = exp(τX)X, dτ and a(τ) = exp τX is the unique differentiable solution of a (τ)=Xa(τ),a(0)=1 (and also of a (τ)=a(τ)X, a(0)=1). (b) For any two commuting matrices X, Y , exp X exp Y = exp(X + Y ). (c) exp X is invertible for any matrix X and (exp X)−1 = exp(−X). (d) For any matrix X and scalars σ, τ exp(σ + τ)X = exp(σX) exp(τX), and a(τ) = exp τX is the unique differentiable solution of a(σ + τ)=a(σ)a(τ), a(0)=1, a (0) = X. 1.1. Vector fields and one-parameter groups 3
Proof. (a) Because of the norm-convergence of the power series for exp τX (appendix), we can differentiate it term-by-term:
d d ∞ τ j ∞ τ j−1 (exp τX)= Xj = Xj, dτ dτ j! (j − 1)! j=0 j=1 and this is X exp(τX) = exp(τX)X, as one sees by factoring out X from this series, either to the left or to the right. To prove the second assertion of (a), assume a(τ) satisfies
a (τ)=X(a(τ)),a(0)=1.
Differentiating according to the rule (ab) = a b + ab (appendix) we get d d d (exp(−τX)a(τ)) = exp −τX a(τ) + exp(−τX) a(τ) dτ dτ dτ = exp(−τX)(−X)a(τ) + exp(−τX)Xa(τ)=0.
So exp(−τX)a(τ) is independent of τ, and equals 1 for τ = 0, hence it equals 1 identically. The assertion will now follow from (c). (b) As for scalar series, the product of two norm-convergent matrix series can be computed by forming all possible products of terms of the first series with terms of the second and then summing in any order. If we apply this recipe to exp X exp Y , we get, ∞ Xj ∞ Y k ∞ XjY k exp X exp Y = = . (4) j! k! j!k! j=0 k=0 j,k=0
On the other hand, assuming X and Y commute, we can rearrange and collect terms while expanding (1/m!)(X + Y )m to get ∞ 1 ∞ 1 m! ∞ XjY k exp(X + Y )= (X + Y )m = XjY k = . m! m! j!k! j!k! m=0 m=0 j+k=m j,k=0 (5) Comparison of (4) and (5) gives (b). (c) This follows from (b) with Y = −X. (d) The first assertion comes from (b) with X replaced by σX and Y by τX. To prove the second assertion, assume a(τ) has the indicated property. Then