Faculty of Science and Engineering, Mathematics and Applied Mathematics
On Yoshida’s Method For Constructing Explicit Symplectic Integrators For Separable Hamiltonian Systems
Bachelor’s Project Mathematics
July 2019
Student: J.C. Pim
First supervisor: Dr. M. Seri
Second assessor: Dr. A.E. Sterk
Abstract

We describe Yoshida's method for separable Hamiltonians $H = T(p) + V(q)$. Hamiltonians of this form occur frequently in classical mechanics, for example in the n-body problem, as well as in other fields. The symplectic integrators constructed are explicit, reversible, of arbitrary even order, and have bounded energy growth. We give an introduction to Hamiltonian mechanics, symplectic geometry, and Lie theory. We compare the performance of these integrators to more commonly used methods such as Runge-Kutta, using the ideal pendulum and the Kepler problem as examples.
Contents

1 Introduction
2 Preliminaries And Prerequisites
2.1 Manifolds, Vector Fields, And Differential Forms
2.2 The Matrix Exponential
2.3 The Vector Field Exponential
3 Hamiltonian Mechanics
3.1 Motivating Examples
4 Symplectic Geometry
4.1 Symplectic Vector Spaces
4.2 Symplectic Manifolds
5 Lie Theory And BCH Formula
6 Symplectic Integrators
6.1 Separable Hamiltonians
6.2 Reversible Hamiltonians
6.3 Yoshida's Method For Separable Hamiltonians
6.4 Properties Of The Yoshida Symplectic Integrators
6.5 Backward Error Analysis
7 Numerical Simulations
7.1 The Ideal Pendulum
7.2 The Kepler Problem
8 Conclusion
References
1 Introduction
Many interesting phenomena in science can be modelled by Hamiltonian systems, for example the n-body problem, oscillators, problems in molecular dynamics, and models of electric circuits [11]. In general these are defined by systems of differential equations that cannot be solved analytically, and very often numerical methods are needed in order to study their behaviour. This gives rise to some issues: many of these systems have invariants such as conservation of energy or angular momentum, which numerical methods do not necessarily preserve, sometimes with drastic consequences for the evolution of the computed solution. In the case of chaotic systems, like the n-body problem, this is worsened by the fact that the perturbations introduced by numerical methods may lead to wildly different solutions over long time scales. This is undesirable in many applications, such as modelling the trajectories of spacecraft or other small bodies, Hamiltonian Monte Carlo methods, and the physics-based animation used in the production of video games and movies [2].

A way to mitigate these problems is to use high-order numerical methods with a sufficiently small step size, but this can be extremely computationally expensive. Moreover, techniques such as adaptive time-stepping may not preserve important features of the system's dynamics. For example, in neuronal dynamics [3] the model's limit cycles play an important role in neuronal spiking, yet these are not well preserved by commonly used "Euler-like" methods. Hence, we would like numerical methods which preserve at least some of the system's invariants. We can achieve this by exploiting the underlying geometry of Hamiltonian systems, called symplectic geometry. The natural space for Hamiltonian dynamics, the so-called phase space, is endowed with a canonical differential 2-form, called the symplectic form, which the flow of the Hamiltonian system preserves.
This can be used to derive numerical methods, called symplectic integrators, which themselves preserve the symplectic form. In this bachelor's project, we aim to describe Yoshida's method for constructing explicit symplectic integrators for separable Hamiltonian systems. The class of integrators we construct are reversible, of arbitrary even order, and have bounded energy growth. The construction makes use of Lie algebras and the Baker-Campbell-Hausdorff (BCH) formula. However, we will first describe the basic theory of Hamiltonian systems and their geometric properties using symplectic geometry. We will study some examples of Hamiltonian systems, including the ideal pendulum and the Kepler problem, which will act as our main motivation for developing symplectic integrators. Then, we will give an introduction to Lie theory, before explaining Yoshida's construction and showing some of its important properties. This will include a backward error analysis showing that this class of integrators has bounded energy growth. Finally, we will implement and compare some integrators from this class to more commonly used methods, such as Runge-Kutta, by applying them to our motivating examples.
2 Preliminaries And Prerequisites
We will briefly describe and recall some of the theory used in later sections. We discuss some concepts from differential geometry such as manifolds, vector fields and differential forms, which will be useful later. We then review some facts about the matrix exponential, and give some intuition for the vector field exponential.
2.1 Manifolds, Vector Fields, And Differential Forms

In later sections of this thesis we will make use of many concepts from differential geometry, especially in Section 4 and Section 5. The details of what follows can be found in any book on differential geometry, such as [14]. We will mainly use, though not discuss in detail: manifolds, smooth maps between manifolds, vector fields, and differential forms on manifolds. In general, it will be sufficient to consider manifolds as open subsets of, or as surfaces in, $\mathbb{R}^n$. When we say smooth it is safe to read this as: of class $C^\infty$, though many results also hold for $C^k$ with $k \geq 1$.

We will write $x^1, \ldots, x^m$ for the local coordinates on a smooth manifold $M$ of dimension $m$. We use $\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^m}$ as the basis of the tangent space $T_pM$ for $p \in M$, and $dx^1, \ldots, dx^m$ as the basis of the cotangent space $T_p^*M$. We denote the differential of a smooth map $F \colon M \to N$ between $M$ and another manifold $N$ at $p \in M$ by $F_* \colon T_pM \to T_{F(p)}N$. For a tangent vector $X_p \in T_pM$ we define it pointwise as $F_*(X_p)f = X_p(f \circ F)$, where $f$ is a smooth function on $N$. It can also be computed using a smooth curve $c(t)$ on $M$ with $c(0) = p$ and $c'(0) = X_p$ as $F_*(X_p) = \frac{d}{dt}\big|_{t=0}(F \circ c)(t)$.

A vector field $X = \sum_{i=1}^m a_i \frac{\partial}{\partial x^i}$ is called smooth if its coefficients $a_i$ are smooth functions on $M$. Smooth vector fields have a flow $\Phi \colon I \times M \to M$, where $I$ is an interval.

Definition 1 A smooth vector field on a manifold is called complete if its flow is defined for all time.
That is, if the vector field's flow $\Phi_t$ has $I = \mathbb{R}$. We will often make use of complete vector fields, as they simplify the theory in places. We interpret vector fields as differential operators, and thus they can be applied to smooth functions.

Similarly, a differential 1-form or covector field on $M$ is given by $\eta = \sum_{i=1}^m a_i\, dx^i$ and is again called smooth if the $a_i$ are smooth. Differential $k$-forms for $k > 1$ can be written as a sum of wedge products of the basis 1-forms $dx^1, \ldots, dx^m$, for example $dx^1 \wedge dx^2 + 5\, dx^2 \wedge dx^3$. The 0-forms are functions on $M$. We use $\eta_p$ for $p \in M$ to denote it as a covector in $T_p^*M$. We can compute the application of a $k$-form $dx^1 \wedge \cdots \wedge dx^k$ to $k$ tangent vectors $v_1, \ldots, v_k$ as
$$ dx^1 \wedge \cdots \wedge dx^k(v_1, \ldots, v_k) = \det \begin{pmatrix} dx^1(v_1) & \cdots & dx^1(v_k) \\ \vdots & & \vdots \\ dx^k(v_1) & \cdots & dx^k(v_k) \end{pmatrix}. $$
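Since each basis 1-form $dx^i$ simply picks out the $i$-th component of a vector in $\mathbb{R}^m$, the determinant formula above is easy to check numerically. The following short Python sketch is ours, for illustration only (the helper name `wedge_apply` does not come from the text):

```python
import numpy as np

def wedge_apply(vectors, k):
    """Apply dx^1 ^ ... ^ dx^k to tangent vectors v_1, ..., v_k in R^m.
    The entry dx^i(v_j) is just the i-th component of v_j, so the value
    is the determinant of the top k x k block of the column matrix."""
    V = np.column_stack(vectors)   # columns are v_1, ..., v_k
    return np.linalg.det(V[:k, :])

v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([3.0, 1.0, 5.0])

# In R^3: dx^1 ^ dx^2 (v1, v2) = det [[1, 3], [0, 1]] = 1.
assert np.isclose(wedge_apply([v1, v2], k=2), 1.0)

# Antisymmetry: swapping the arguments flips the sign.
assert np.isclose(wedge_apply([v2, v1], k=2), -1.0)
```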
The exterior derivative $d$ transforms $k$-forms into $(k+1)$-forms. The exterior derivative of a smooth function $f$ is given by $df = \frac{\partial f}{\partial x^1}dx^1 + \cdots + \frac{\partial f}{\partial x^m}dx^m$. Since covectors "consume" tangent vectors, we can see that the differential $f_*$ and the exterior derivative $df$ of a function $f$ are in fact the same. A $k$-form $\eta$ is called closed if $d\eta = 0$, while it is called exact if there exists a $(k-1)$-form $\mu$ s.t. $d\mu = \eta$. Finally, the pullback of a $k$-form $\eta$ at $p \in M$ by a smooth map $F$ is given by $F^*\eta_p(v_1, \ldots, v_k) = \eta_{F(p)}(F_*(v_1), \ldots, F_*(v_k))$ for $v_1, \ldots, v_k \in T_pM$. We will also make use of the following results.

Theorem 1 (Regular Level Set Theorem [14, Theorem 9.9, pg. 105]) Let $N$ and $M$ be smooth manifolds of dimension $n$ and $m$ respectively. Let $F \colon N \to M$ be a smooth map. If $c \in M$ is a regular value of $F$ s.t. $F^{-1}(c) \neq \emptyset$, then $F^{-1}(c)$ is a regular submanifold of $N$ of dimension $n - m$.
Theorem 2 ([14, Theorem 11.15, pg. 124]) Let $f \colon N \to M$ be a smooth map of manifolds, and let $f(N) \subset S \subset M$. If $S$ is a regular submanifold of $M$, then the map $\tilde{f} \colon N \to S$ induced by the inclusion map $i \colon S \to M$ (so that $i \circ \tilde{f} = f$) is smooth.
Theorem 3 ([14, Proposition 14.3, pg. 151]) A vector field $X$ on a manifold $M$ is smooth if and only if $Xf \in C^\infty(M)$ for all $f \in C^\infty(M)$.
2.2 The Matrix Exponential

We will briefly review some facts about the matrix exponential which will be useful when we discuss Lie theory and the BCH formula. Proofs of these facts can be found in books on ordinary differential equations such as [1, Chapter 3], or in [14, Sections 15.3 and 15.4].
Definition 2 For $X \in \mathbb{R}^{n \times n}$ the matrix exponential $e^X$ or $\exp(X)$ is defined by
$$ e^X = I + X + \frac{1}{2!}X^2 + \frac{1}{3!}X^3 + \cdots. $$
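As a sanity check, the defining series can be summed directly. The Python sketch below is our own illustration (the helper `expm_series` is not from the text; library routines such as `scipy.linalg.expm` use scaling-and-squaring instead, which is more robust). For the $2 \times 2$ generator of rotations the series has a well-known closed form, which the sketch recovers:

```python
import numpy as np

def expm_series(X, terms=30):
    """Matrix exponential by summing the defining power series (Definition 2).
    Adequate for small matrices of moderate norm; not production code."""
    out = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k        # after this line, term == X^k / k!
        out = out + term
    return out

# Closed form: exp([[0, -t], [t, 0]]) is the rotation matrix by angle t.
t = 0.7
X = np.array([[0.0, -t], [t, 0.0]])
R = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])
assert np.allclose(expm_series(X), R)
```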
It is well defined, as the series converges absolutely for all $X \in \mathbb{R}^{n \times n}$.

Theorem 4 For $X, Y \in \mathbb{R}^{n \times n}$, if $X$ and $Y$ commute, then $e^Xe^Y = e^{X+Y}$.

It is natural to wonder whether a converse of this statement holds, that is, whether we can always find $W$ s.t. $e^Xe^Y = e^W$. This is answered by the BCH formula (Theorem 17).

Theorem 5 The trace of a matrix satisfies:
• For $X, Y \in \mathbb{R}^{n \times n}$, $\operatorname{tr}(XY) = \operatorname{tr}(YX)$.
• For $X, Y \in \mathbb{R}^{n \times n}$ s.t. $\det Y \neq 0$, $\operatorname{tr}(X) = \operatorname{tr}(YXY^{-1})$.
• For $X \in \mathbb{R}^{n \times n}$, $\det e^X = e^{\operatorname{tr} X}$.
Theorem 6 For $X \in \mathbb{R}^{n \times n}$, $\frac{d}{dt}e^{tX} = Xe^{tX} = e^{tX}X$.
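Theorems 5 and 6 lend themselves to a quick numerical check. The sketch below is ours and only illustrative: it sums a truncated series for $e^X$ (a hypothetical helper, not the text's notation) and compares $\det e^X$ with $e^{\operatorname{tr} X}$, and a finite-difference derivative of $e^{tX}$ with $Xe^{tX}$ and $e^{tX}X$:

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated defining series for e^X; fine for these small examples."""
    out, term = np.eye(X.shape[0]), np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))

# Theorem 5 (third item): det(e^X) = e^{tr X}.
assert np.allclose(np.linalg.det(expm_series(X)), np.exp(np.trace(X)))

# Theorem 6: d/dt e^{tX} = X e^{tX} = e^{tX} X, via a central difference.
t, h = 0.5, 1e-6
deriv = (expm_series((t + h) * X) - expm_series((t - h) * X)) / (2 * h)
assert np.allclose(deriv, X @ expm_series(t * X), atol=1e-5)
assert np.allclose(deriv, expm_series(t * X) @ X, atol=1e-5)
```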
This is interesting because, for real matrices $X$, we have $\operatorname{tr} X \in \mathbb{R}$ and $e^x > 0$ for all $x \in \mathbb{R}$, hence by the above theorem $\det e^X \neq 0$. We fruitfully exploit this method of constructing non-singular matrices from arbitrary real matrices, as it allows us to explicitly construct curves in $GL(n, \mathbb{R})$. Using these we can more easily compute the differentials of smooth maps on $GL(n, \mathbb{R})$, which will be used in Section 5. Consider the curve
$$ c \colon \mathbb{R} \to GL(n, \mathbb{R}); \quad t \mapsto Ae^{tX}, \qquad \text{for } X \in \mathbb{R}^{n \times n} \text{ and } A \in GL(n, \mathbb{R}). $$
It has initial point $c(0) = A \in GL(n, \mathbb{R})$ and initial "velocity"
$$ c'(0) = \frac{d}{dt}\Big|_{t=0} Ae^{tX} = AX \in \mathbb{R}^{n \times n} = T_AGL(n, \mathbb{R}). $$
To show the utility of such curves, we compute the differential of the determinant map $\det \colon GL(n, \mathbb{R}) \to \mathbb{R}$ at $A \in GL(n, \mathbb{R})$. Since $GL(n, \mathbb{R})$ is an open subset of $\mathbb{R}^{n \times n}$, we can identify $T_AGL(n, \mathbb{R})$ with $\mathbb{R}^{n \times n}$, see Example 5. Consider the curve $c(t) = Ae^{tA^{-1}X}$ for $X \in \mathbb{R}^{n \times n}$. By the above, $c(0) = A$ and $c'(0) = AA^{-1}X = X$. Then
$$ \det{}_*(X) = \frac{d}{dt}\Big|_{t=0} \det\big(Ae^{tA^{-1}X}\big) = \det(A)\,\frac{d}{dt}\Big|_{t=0} \det e^{tA^{-1}X} = \det(A)\,\frac{d}{dt}\Big|_{t=0} e^{t\operatorname{tr}(A^{-1}X)} = \det(A)\operatorname{tr}\big(A^{-1}X\big). $$
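The formula $\det{}_*(X) = \det(A)\operatorname{tr}(A^{-1}X)$ can be verified against a finite-difference approximation of the directional derivative of $\det$. The following sketch is ours, with an arbitrary illustrative choice of $A$ and $X$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))   # near I, hence invertible
X = rng.standard_normal((3, 3))

# Directional derivative of det at A in direction X, by a central difference.
h = 1e-6
numeric = (np.linalg.det(A + h * X) - np.linalg.det(A - h * X)) / (2 * h)

# The formula derived above: det_*(X) = det(A) tr(A^{-1} X).
exact = np.linalg.det(A) * np.trace(np.linalg.solve(A, X))

assert abs(numeric - exact) < 1e-6
```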
2.3 The Vector Field Exponential

We will give a brief introduction to the vector field exponential. We will show that it is a generalization of the matrix exponential and solves the differential equation induced by the vector field. Recall that the flow of the scalar equation $\dot{y}(t) = ay(t)$, $a \in \mathbb{R}$, is given by $\Phi_t(u) = e^{ta}u$. Notice that Theorem 6 implies the matrix exponential $e^{tA}$ satisfies the equation $\dot{y}(t) = Ay(t)$ where $A \in \mathbb{R}^{n \times n}$, thus the flow of such a system is given by $\Phi_t(u) = e^{tA}u$.

Consider a more general case: let $U \subset \mathbb{R}^n$ be open, and let $\bar{X} \colon U \to \mathbb{R}^n$ be a complete smooth vector field. We can interpret $\bar{X}$ as the derivation given
pointwise by $X_p = \sum_{i=1}^n \bar{X}_i(p)\frac{\partial}{\partial x^i}\big|_p$, where the $\bar{X}_i$ are smooth functions on $U$. Thus $\bar{X}$ also induces a linear operator $X$ on the space of smooth functions on $U$. Of course, $\bar{X}$ and $X$ are the same object, and we will use $X$ for both in later sections, but distinguishing them will be somewhat helpful here. Now, $\bar{X}$ also induces the differential equation $\dot{y}(t) = \bar{X}(y(t))$. Comparing $\bar{X}$ and $X$ we see
$$ \bar{X}(y(t)) = \sum_{i=1}^n \bar{X}_i(y(t))\,e_i \quad \longleftrightarrow \quad X = \sum_{i=1}^n \bar{X}_i\frac{\partial}{\partial x^i}, $$
where $\{e_1, \ldots, e_n\}$ is a basis of $\mathbb{R}^n$.
By assumption the vector field $\bar{X}$ is complete, thus it has a flow $\Phi_t$ which is a diffeomorphism of $U$ for each $t \in \mathbb{R}$, and which is smooth since $\bar{X}$ is smooth. Recall that the flow has the properties $\Phi_0 = \operatorname{id}_U$, $\Phi_t \circ \Phi_s = \Phi_{t+s}$, and $\frac{d}{dt}\Phi_t(u) = \bar{X}(\Phi_t(u))$. Assume that $X$, and hence $\bar{X}$, are $C^\infty$, and that for each $p \in U$, $\Phi_t(p)$ is real analytic in $t$ in a neighbourhood of $0$. Hence, there is some $r > 0$ s.t.
$$ \Phi_t(p) = \sum_{k=0}^\infty \frac{t^k}{k!}\Phi_0^{(k)}(p) \quad \text{for all } t \in (-r, r). $$
By the chain rule we have that
$$ \Phi_t^{(0)}(p) = \operatorname{id}_U(\Phi_t(p)) = \Phi_t(p) \quad \text{and} \quad \Phi_t^{(1)}(p) = \frac{d}{dt}\Phi_t(p) = \bar{X}(\Phi_t(p)), $$
where $\Phi_t^i$ and $\bar{X}_i$ denote the $i$-th components of $\Phi_t$ and $\bar{X}$ respectively. Now assume for $k \geq 1$ that $\Phi_t^{(k)}(p) = (X^{k-1}\bar{X})(\Phi_t(p))$, this being the linear operator $X$ applied $(k-1)$ times to the function $\bar{X}$. Then
$$ \Phi_t^{(k+1)}(p) = \frac{d}{dt}\Phi_t^{(k)}(p) = \frac{d}{dt}(X^{k-1}\bar{X})(\Phi_t(p)) = \sum_{i=1}^n \frac{\partial}{\partial x^i}\Big|_{\Phi_t(p)}(X^{k-1}\bar{X})\,\frac{d}{dt}\Phi_t^i(p) = \sum_{i=1}^n \bar{X}_i(\Phi_t(p))\frac{\partial}{\partial x^i}\Big|_{\Phi_t(p)}(X^{k-1}\bar{X}) = (X^k\bar{X})(\Phi_t(p)). $$
Hence by induction, we have that $\Phi_t^{(k+1)}(p) = (X^k\bar{X})(\Phi_t(p))$, and in particular that $\Phi_0^{(k+1)}(p) = (X^k\bar{X})(p)$. Therefore,
$$ \Phi_t(p) = p + \sum_{k=1}^\infty \frac{t^k}{k!}(X^{k-1}\bar{X})(p) =: e^{tX}p. $$
This has the same form as the series form of the matrix and scalar exponentials, since $X$ and $\bar{X}$ are interpretations of the same vector field. Thus we call $e^{tX} := \Phi_t$ the vector field exponential.

When $\Phi_t$ is not analytic we can no longer expand it as an infinite series. We can however still approximate it with a Taylor polynomial. Assuming $\Phi_t(p)$ is of class $C^\ell$, this approximation along with the above computations gives that for each integer $\ell \geq n \geq 1$, there exists some function $R_n \colon \mathbb{R} \to \mathbb{R}^n$ s.t.
$$ \Phi_t(p) = \sum_{k=0}^n \frac{t^k}{k!}\Phi_0^{(k)}(p) + R_n(t) = p + \sum_{k=1}^n \frac{t^k}{k!}(X^{k-1}\bar{X})(p) + R_n(t) $$
with $\lim_{t \to 0} R_n(t)/t^n = 0$. This expansion, along with $\Phi_t^{(n+1)}(p) = (X^n\bar{X})(\Phi_t(p))$, is enough for us to write $e^{tX} := \Phi_t$.
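The Lie series $e^{tX}p = p + \sum_{k \geq 1} \frac{t^k}{k!}(X^{k-1}\bar{X})(p)$ can be tested on a concrete one-dimensional field. For $\dot{y} = y^2$ the exact flow is $\Phi_t(p) = p/(1 - tp)$, so the partial sums of the series should converge to it for $|tp| < 1$. The sketch below is ours: it represents polynomial functions of $y$ by coefficient arrays, so applying the derivation $X = y^2\frac{d}{dy}$ is exact polynomial algebra (all names are illustrative):

```python
import numpy as np

def apply_X(coeffs):
    """The derivation X = y^2 d/dy on a polynomial given by its
    coefficients in ascending powers of y: X(f) = y^2 * f'."""
    deriv = np.polynomial.polynomial.polyder(coeffs)
    return np.concatenate(([0.0, 0.0], deriv))   # multiplying f' by y^2

def lie_series_flow(t, p, order=20):
    """Partial sum  p + sum_{k=1}^{order} t^k/k! (X^{k-1} Xbar)(p)."""
    xbar = np.array([0.0, 0.0, 1.0])             # Xbar(y) = y^2
    total, term, fact = p, xbar, 1.0
    for k in range(1, order + 1):
        fact *= k
        total += t ** k / fact * np.polynomial.polynomial.polyval(p, term)
        term = apply_X(term)
    return total

# Exact flow of ydot = y^2 is Phi_t(p) = p / (1 - t p), for |t p| < 1.
t, p = 0.4, 0.5
assert np.isclose(lie_series_flow(t, p), p / (1 - t * p), atol=1e-6)
```

Here one can check by hand that $(X^{k-1}\bar{X})(y) = k!\,y^{k+1}$, so the Lie series is the geometric series $p\sum_k (tp)^k$, matching the exact flow.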
3 Hamiltonian Mechanics
In this section we aim to give an introduction to Hamiltonian mechanics in the familiar setting of $\mathbb{R}^{2n}$. We will consider only autonomous Hamiltonians, i.e. models of energy-conserving systems. We will study some examples of Hamiltonian systems, such as the ideal pendulum and the Kepler problem. These examples will act both as motivation for the abstraction to symplectic manifolds, which we discuss in Section 4, and as test cases for the numerical integrators we construct in Section 6. We will not yet prove any properties of the flows of Hamiltonian systems, such as the conservation of energy. We instead choose to leave them for later, when we can employ the Lie derivative and other tools of differential geometry.

It is well known that energy is an important concept in physics. The idea of Hamiltonian mechanics is to use the total energy of a system to define the dynamics of that system. For such energy-conserving systems, Hamiltonian mechanics is an equivalent reformulation of classical Newtonian mechanics, where the dynamics are directly derived from Newton's formula $F = ma$. It is in some sense "dual" to the Lagrangian formulation through the conjugate momentum and the Legendre transform. However, for dissipative systems the situation is more complicated. We will neither discuss these reformulations nor show how they are equivalent (the details can be found in books on classical mechanics, or in [6, Chapter VI]).
Definition 3 Let $M \subset \mathbb{R}^{2n}$ be open, and $H \in C^2(M, \mathbb{R})$. The system
$$ \dot{x} = X_H(x) := J\nabla H(x), \quad \text{where } J = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix} \in \mathbb{R}^{2n \times 2n}, $$
is called the Hamiltonian differential equation, while $X_H$ is called the Hamiltonian vector field, and $H$ is the Hamiltonian.
Giving $\mathbb{R}^{2n} = \mathbb{R}^n \times \mathbb{R}^n$ the coordinates $(p, q) = (p_1, \ldots, p_n, q_1, \ldots, q_n)$, the Hamiltonian differential equations take the form
$$ \dot{p} = -\frac{\partial H}{\partial q} \quad \text{and} \quad \dot{q} = \frac{\partial H}{\partial p}, \tag{1} $$
where $\dot{p} = \frac{dp}{dt}$ and $\frac{\partial H}{\partial p} = \big[\frac{\partial H}{\partial p_1}, \cdots, \frac{\partial H}{\partial p_n}\big]^T$. This "twisting", where the time derivative of one variable depends on the partial derivatives of $H$ w.r.t. the other variable, is where the special structure of Hamiltonian dynamics comes from. We call $p$ the momentum and $q$ the position. The Hamiltonian is also called the total energy of the system, while $n$ is the number of degrees of freedom, and $M$ is the phase space.

Hamiltonians are often of the form $H(p, q) = T(p) + V(q)$, where $T$ is called the kinetic energy and $V$ the potential energy. Such a Hamiltonian is called a separable Hamiltonian. The Hamiltonian vector field for these splits as $X_H = X_T + X_V$, with $X_T$ depending only on $p$ and $X_V$ only on $q$. This is precisely the property we will exploit in Section 6 to construct symplectic integrators.
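For a separable Hamiltonian, equations (1) reduce to $\dot{p} = -\nabla V(q)$ and $\dot{q} = \nabla T(p)$. The following Python sketch (ours; the helper name `hamiltonian_rhs` and the harmonic-oscillator test case are illustrative assumptions, not from the text) builds the right-hand side from the two gradients:

```python
import numpy as np

def hamiltonian_rhs(grad_T, grad_V):
    """Right-hand side of Hamilton's equations (1) for a separable
    Hamiltonian H(p, q) = T(p) + V(q):  pdot = -dV/dq,  qdot = dT/dp."""
    def rhs(p, q):
        return -grad_V(q), grad_T(p)
    return rhs

# Illustrative case: harmonic oscillator H = p^2/2 + q^2/2.
rhs = hamiltonian_rhs(grad_T=lambda p: p, grad_V=lambda q: q)
pdot, qdot = rhs(np.array([1.0]), np.array([2.0]))
assert np.allclose(pdot, [-2.0]) and np.allclose(qdot, [1.0])
```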
The Hamiltonian vector field (1) can be written as the derivation
$$ X_H = \sum_{i=1}^n -\frac{\partial H}{\partial q_i}\frac{\partial}{\partial p_i} + \sum_{i=1}^n \frac{\partial H}{\partial p_i}\frac{\partial}{\partial q_i}. $$
We see that the flow of a Hamiltonian vector field $X_H$ is given formally by the vector field exponential $\exp(tX_H)$. With the initial condition $(p_0, q_0) \in M$, the formal solution of the Hamiltonian equations becomes $\exp(tX_H)(p_0, q_0)$.

Given a vector field $X$, how do we determine whether it is a Hamiltonian vector field? We know that for a Hamiltonian $H$, $X_H = J\nabla H$, thus $\nabla H = -JX_H$ (using $J^{-1} = -J$). Determining whether a vector field is the gradient of some function is relatively easy, at least for open and simply connected subsets of $\mathbb{R}^n$.

Theorem 7 Let $M \subset \mathbb{R}^n$ be open and simply connected, and $f \in C^1(M, \mathbb{R}^n)$. Then $f$ is a gradient vector field if and only if the Jacobian of $f$ is symmetric.
Proof To the vector field $f = (f_1, \ldots, f_n) \colon M \to \mathbb{R}^n$ we associate the differential 1-form $\omega_f = f_1\,dx^1 + \cdots + f_n\,dx^n$. Notice that under this association $f$ is a gradient vector field if and only if $\omega_f$ is exact. Since $M$ is open and simply connected, Poincaré's Lemma [14, Corollary 27.13, pg. 300] implies that $\omega_f$ is exact if and only if $\omega_f$ is closed. We compute $d\omega_f$:
$$ d\omega_f = \sum_{i=1}^n df_i \wedge dx^i = \sum_{i=1}^n \sum_{j=1}^n \frac{\partial f_i}{\partial x^j}\,dx^j \wedge dx^i = \sum_{1 \leq i < j \leq n} \left( \frac{\partial f_j}{\partial x^i} - \frac{\partial f_i}{\partial x^j} \right) dx^i \wedge dx^j. $$
It is clear that $d\omega_f = 0 \iff \frac{\partial f_j}{\partial x^i} = \frac{\partial f_i}{\partial x^j}$ for $i, j = 1, \ldots, n$. Therefore $f$ is a gradient vector field $\iff$ $\omega_f$ is closed $\iff$ the Jacobian of $f$ is symmetric.

3.1 Motivating Examples

We study some examples of Hamiltonian systems.

Example 1 (Linear Hamiltonian Systems) A linear vector field on $\mathbb{R}^{2n}$ is given by $f(x) = Bx$ for $B \in \mathbb{R}^{2n \times 2n}$. $f$ will be a Hamiltonian vector field if $-JBx$ is a gradient vector field. By Theorem 7, this is precisely when its Jacobian $-JB$ is symmetric. Hence, we want to find a function $H$ s.t. $\nabla H(x) = -JBx$. Recall that for $A \in \mathbb{R}^{n \times n}$ we have
$$ \nabla\big(x^TAx\big) = A^Tx + Ax = (A + A^T)x, $$
where $A + A^T$ is symmetric. Thus, if $-JB$ is symmetric, $f$ is the Hamiltonian vector field generated by
$$ H \colon \mathbb{R}^{2n} \to \mathbb{R}; \quad x \mapsto H_0 + \frac{1}{2}x^T(-JB)x, $$
where $H_0 \in \mathbb{R}$. Therefore, all linear Hamiltonian systems are generated by Hamiltonians of the form
$$ H \colon \mathbb{R}^{2n} \to \mathbb{R}; \quad x \mapsto H_0 + \frac{1}{2}x^TAx, \tag{2} $$
where $A \in \operatorname{Sym}(2n, \mathbb{R}) := \{ X \in \mathbb{R}^{2n \times 2n} \mid X^T = X \}$ and $H_0 \in \mathbb{R}$. However, these Hamiltonians will not, in general, be separable. An easy computation shows that for a linear Hamiltonian to be separable, $A$ must be of the form
$$ \begin{pmatrix} T & D \\ -D^T & V \end{pmatrix}, $$
where $T, V, D \in \mathbb{R}^{n \times n}$ and $D$ is skew-symmetric. Notice (2) generates Hamiltonian systems of the form
$$ \dot{x} = JAx, \tag{3} $$
whose system matrices $JA$ have the property
$$ (JA)^TJ + J(JA) = A^TJ^TJ + J^2A = AI_{2n} + (-I_{2n})A = A - A = 0. $$
We know the flow of (3) is given by the vector field exponential, which reduces to the matrix exponential since the vector field is linear. Thus the flow is
$$ \mathbb{R} \times \mathbb{R}^{2n} \to \mathbb{R}^{2n}; \quad (t, x) \mapsto e^{tJA}x. $$

Example 2 (Ideal Pendulum) A non-linear Hamiltonian system whose dynamics are still relatively easy to describe is the ideal pendulum (see Figure 1). Here, $q$ is the angular displacement and $p$ is the momentum, which equals the velocity $\dot{q}$ in this case since $m = 1$.
Its Hamiltonian is given by
$$ H(p, q) = \frac{1}{2}p^2 - \cos q, $$
which is clearly separable and generates the dynamics
$$ \dot{p} = -\frac{\partial H}{\partial q} = -\sin q \quad \text{and} \quad \dot{q} = \frac{\partial H}{\partial p} = p. $$
We can see that these dynamics are the same as those derived from Newton's equation $F = ma$, which in this case gives $\ddot{q} = -\sin q$. The Hamiltonian is constant along solution curves of the system, thus they lie in the energy shells $H^{-1}(c)$ for $c \in \mathbb{R}$. Examining the energy surface in Figure 1, we can clearly see that the equilibrium points are given by $(0, k\pi)$ for $k \in \mathbb{Z}$. The stable and unstable equilibria are the basins and saddle points respectively of this surface (i.e. $(0, 2k\pi)$ and $(0, (2k+1)\pi)$ for $k \in \mathbb{Z}$). These can be analytically determined by computing the eigenvalues of the linearized system around these points. We see that the dynamics change at $H = 1$: $q$ undergoes periodic motion when $H < 1$ and monotonic motion when $H > 1$.

Figure 1: Ideal pendulum of length $l = 1$ and mass $m = 1$ (left). Energy surface of the ideal pendulum with solution curves for $E = 2$, $E = 1$, and $E = 0$ (right).

Example 3 (The Kepler Problem) Consider two bodies which are attracted to each other gravitationally, for example a sun and a planet. If we denote their positions by $q_s$ and $q$, and their momenta by $p_s$ and $p$, their Hamiltonian is
$$ H(p, p_s, q, q_s) = \frac{\|p_s\|^2}{2M} + \frac{\|p\|^2}{2m} - \frac{GmM}{\|q - q_s\|}, $$
where $M$ and $m$ are the masses of the sun and planet, $G$ is the gravitational constant, and $p, p_s, q, q_s \in \mathbb{R}^3$. The norm is the Euclidean 2-norm. This Hamiltonian generates the dynamics
$$ \dot{p} = -\frac{\partial H}{\partial q} = -GmM\frac{q - q_s}{\|q - q_s\|^3}, \qquad \dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m}, $$
and
$$ \dot{p}_s = -\frac{\partial H}{\partial q_s} = GmM\frac{q - q_s}{\|q - q_s\|^3}, \qquad \dot{q}_s = \frac{\partial H}{\partial p_s} = \frac{p_s}{M}. $$
Using heliocentric coordinates $Q = q - q_s$ we get that $\dot{Q} = p/m - p_s/M$, and
$$ \ddot{Q} = \frac{\dot{p}}{m} - \frac{\dot{p}_s}{M} = -G(M + m)\frac{Q}{\|Q\|^3}. $$
We also have
$$ \frac{d}{dt}\big(Q \times \dot{Q}\big) = \dot{Q} \times \dot{Q} + Q \times \ddot{Q} = -\frac{G(M + m)}{\|Q\|^3}(Q \times Q) = 0. $$
This implies $Q(t)$ is constrained to the plane $\big\{ x \in \mathbb{R}^3 \mid \big(Q(0) \times \dot{Q}(0)\big)^Tx = 0 \big\}$ for all $t \in \mathbb{R}$.
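The conservation of $Q \times \dot{Q}$, and hence of the orbital plane, can be observed numerically. The sketch below is ours: it integrates $\ddot{Q} = -\mu Q/\|Q\|^3$ in $\mathbb{R}^3$ with a classical Runge-Kutta step (the value $\mu = 1$ and the initial data are illustrative choices, not from the text) and checks that the angular momentum vector stays put:

```python
import numpy as np

mu = 1.0  # stands in for G(M + m) in suitable units (illustrative value)

def accel(Q):
    return -mu * Q / np.linalg.norm(Q) ** 3

def rk4_step(Q, V, h):
    """One classical RK4 step for the second-order system Qddot = accel(Q)."""
    k1q, k1v = V, accel(Q)
    k2q, k2v = V + h / 2 * k1v, accel(Q + h / 2 * k1q)
    k3q, k3v = V + h / 2 * k2v, accel(Q + h / 2 * k2q)
    k4q, k4v = V + h * k3v, accel(Q + h * k3q)
    Qn = Q + h / 6 * (k1q + 2 * k2q + 2 * k3q + k4q)
    Vn = V + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return Qn, Vn

Q, V = np.array([1.0, 0.0, 0.2]), np.array([0.0, 1.0, 0.1])
L0 = np.cross(Q, V)                  # Q x Qdot at t = 0
for _ in range(2000):
    Q, V = rk4_step(Q, V, 1e-3)

# d/dt (Q x Qdot) = 0: the orbital plane is (numerically) unchanged.
assert np.allclose(np.cross(Q, V), L0, atol=1e-8)
```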
Therefore, we can choose a basis of the plane, consider $Q \in \mathbb{R}^2$, and take a suitable rescaling of units, to get that the simplified dynamics are given by
$$ \dot{P} = \ddot{Q} = -\frac{Q}{\|Q\|^3}, \qquad \dot{Q} =: P. \tag{4} $$
These dynamics are generated by the separable Hamiltonian
$$ H(P, Q) = \frac{\|P\|^2}{2} - \frac{1}{\|Q\|}. \tag{5} $$
A quick computation shows that $\frac{d}{dt}L = 0$, where $L$ is the angular momentum $L(p_1, p_2, q_1, q_2) = q_1p_2 - q_2p_1$, with $P = (p_1, p_2)$ and $Q = (q_1, q_2)$. Hence both $H$ and $L$ are conserved along solution curves of (4). Thus, a solution $(P, Q)$ of (4) with initial point $(P(0), Q(0)) = (P_0, Q_0)$ has $H(P(t), Q(t)) = H(P_0, Q_0) = H_0 \in \mathbb{R}$ and $L(P(t), Q(t)) = L(P_0, Q_0) = L_0 \in \mathbb{R}$.

In polar coordinates $(q_1, q_2) = (r\cos\theta, r\sin\theta)$, we have
$$ (p_1, p_2) = (\dot{q}_1, \dot{q}_2) = \big(\dot{r}\cos\theta - r\dot{\theta}\sin\theta,\; \dot{r}\sin\theta + r\dot{\theta}\cos\theta\big), $$
which gives the expressions
$$ H_0 = \frac{1}{2}\big(\dot{r}^2 + r^2\dot{\theta}^2\big) - \frac{1}{r} \quad \text{and} \quad L_0 = r^2\dot{\theta}. $$
Considering $r$ as a function of $\theta$, we see $\dot{r} = \frac{\partial r}{\partial \theta}\dot{\theta}$ by the chain rule, and
$$ H_0 = \frac{L_0^2}{2r^4}\left( \left(\frac{\partial r}{\partial \theta}\right)^2 + r^2 \right) - \frac{1}{r}. $$
From this we get
$$ \left(\frac{\partial r}{\partial \theta}\right)^2 = \frac{2r^4H_0 + 2r^3 - r^2L_0^2}{L_0^2} \implies \frac{\partial r}{\partial \theta} = \pm\frac{r\sqrt{2r^2H_0 + 2r - L_0^2}}{L_0}. $$
By separation of variables, and further computations, it is possible to see
$$ \theta - \theta_0 = \int d\theta = \int \frac{L_0}{r\sqrt{2r^2H_0 + 2r - L_0^2}}\,dr = \int \frac{L_0^2}{r\sqrt{e^2r^2 - (L_0^2 - r)^2}}\,dr = \arccos\left( \frac{1}{e}\left( \frac{L_0^2}{r} - 1 \right) \right), $$
where $e := \sqrt{1 + 2H_0L_0^2}$ is called the eccentricity. Therefore,
$$ r(\theta) = \frac{L_0^2}{1 + e\cos(\theta - \theta_0)}, $$
where $\theta_0$ is determined by the initial conditions $r(\theta(0))$ and $\theta(0)$. These are in turn specified by $q_1(0)$ and $q_2(0)$ using $q_1(0) = r(\theta(0))\cos(\theta(0))$ and $q_2(0) = r(\theta(0))\sin(\theta(0))$. Therefore we can explicitly compute
$$ \theta_0 = \theta(0) - \arccos\left( \frac{1}{e}\left( \frac{L_0^2}{r(\theta(0))} - 1 \right) \right). $$
For $L_0 \neq 0$, $r(\theta)$ traces out a conic section, which depending on $H_0$ gives the following types of trajectories for the Kepler problem:
• For $H_0 < 0$, we have $e \in [0, 1)$, which gives ellipses in general, and a circle if $e = 0$.
• For $H_0 = 0$, we have $e = 1$, which gives a parabola.
• For $H_0 > 0$, we have $e \in (1, \infty)$, which gives hyperbolas.
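The eccentricity formula and the conic $r(\theta)$ can be evaluated directly from initial data. The Python sketch below is ours (function names are illustrative); it reproduces the elliptic example that follows, with $P_0 = (0, 1.35)$ and $Q_0 = (1, 0)$:

```python
import numpy as np

def eccentricity(H0, L0):
    """e = sqrt(1 + 2 H0 L0^2), as derived above."""
    return np.sqrt(1 + 2 * H0 * L0 ** 2)

def r(theta, H0, L0, theta0=0.0):
    """The conic section r(theta) = L0^2 / (1 + e cos(theta - theta0))."""
    return L0 ** 2 / (1 + eccentricity(H0, L0) * np.cos(theta - theta0))

# Initial data P0 = (0, 1.35), Q0 = (1, 0) from the elliptic example.
P0, Q0 = np.array([0.0, 1.35]), np.array([1.0, 0.0])
H0 = np.dot(P0, P0) / 2 - 1 / np.linalg.norm(Q0)   # Hamiltonian (5)
L0 = Q0[0] * P0[1] - Q0[1] * P0[0]                 # angular momentum

assert np.isclose(H0, -0.08875)
assert np.isclose(eccentricity(H0, L0), 0.8225, atol=1e-4)
assert eccentricity(H0, L0) < 1      # H0 < 0: a bound, elliptic orbit
```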
For example, the trajectory with $P_0 = (0, 1.35)$, $Q_0 = (1, 0)$ is an ellipse, while that with $P_0 = (0, 1.5)$, $Q_0 = (1, 0)$ is a hyperbola (see Figure 2).

Figure 2: Ellipse generated with $r(\theta)$ using $H_0 = -0.08875$, $L_0 = 1.35$, $e = 0.8225$, and $\theta_0 = 0$ (left). Hyperbola generated with $r(\theta)$ using $H_0 = 0.125$, $L_0 = 1.5$, $e = 1.25$, and $\theta_0 = 0$ (right).

4 Symplectic Geometry

To gain greater insight into the behaviour and properties of Hamiltonian systems, we will take a more abstract perspective by considering their phase spaces as manifolds, rather than just as open subsets of $\mathbb{R}^{2n}$. This allows us to employ the tools of differential geometry, which more clearly display the geometric structure behind these properties. We equip the phase space with an additional structure, a differential 2-form, which defines the dynamics that a Hamiltonian generates. Further explanation of why symplectic geometry is the natural setting of classical mechanics can be found in Cohn's essay [5]. To begin, however, we will study symplectic vector spaces, which will make the jump to abstract symplectic manifolds easier.

4.1 Symplectic Vector Spaces

Let $V$ be a real vector space of dimension $n$.

Definition 4 A function $\omega \colon V \times V \to \mathbb{R}$ is called bilinear or a 2-form if $\omega$ is linear in both of its arguments.
• $\omega$ is anti-symmetric if for all $a, b \in V$, $\omega(a, b) = -\omega(b, a)$.
• $\omega$ is non-degenerate if for all non-zero $a \in V$, $\exists b \in V$ s.t. $\omega(a, b) \neq 0$.
• $\omega$ is a symplectic form if it is anti-symmetric and non-degenerate. Then the pair $(V, \omega)$ is called a symplectic vector space.

A symmetric and positive definite 2-form on $V$ is an inner product on $V$. We can think of a symplectic form as an anti-symmetric analogue of an inner product. We can also see that an anti-symmetric 2-form on $V$ is a differential 2-form on $V$, since $T_pV \cong V$ for $p \in V$.

Definition 5 Let $\{v_1, \ldots, v_n\}$ be a basis of $V$. The matrix representation $J \in \mathbb{R}^{n \times n}$ of a 2-form $\omega$ w.r.t. that basis is given by $(J)_{ij} = \omega(v_i, v_j)$ for $i, j = 1, \ldots, n$.
The rank of $\omega$ is the rank of its matrix representation $J$. The matrix representation of an anti-symmetric 2-form can be put in a standard form by constructing a suitable basis of the vector space.

Theorem 8 (Darboux Theorem: Linear Case) Let $\omega$ be an anti-symmetric 2-form of rank $r$. Then $r = 2m$ for some $m \geq 0$, and there exists a basis of $V$ such that the matrix representation of $\omega$ has the form
$$ J = \begin{pmatrix} 0 & -I_m & 0 \\ I_m & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. $$

Proof Assume that $\omega \neq 0$, otherwise we are done. So there exist $\hat{v}_1, \hat{v}_{m+1} \in V$ s.t. $c_1 := \omega(\hat{v}_{m+1}, \hat{v}_1) \neq 0$. Let $v_1 := \hat{v}_1/c_1$ and $v_{m+1} := \hat{v}_{m+1}$. Since $\omega$ is anti-symmetric, $\omega(v_1, v_1) = \omega(v_{m+1}, v_{m+1}) = 0$ and $\omega(v_1, v_{m+1}) = -\omega(v_{m+1}, v_1) = -1$. Let $U_1 = \operatorname{span}\{v_1, v_{m+1}\} \subset V$ and $V_2 := \{ p \in V \mid \omega(p, q) = 0 \;\forall q \in U_1 \}$. Notice that $U_1 \cap V_2 = \{0\}$, and define $y := x + \omega(v_1, x)v_{m+1} - \omega(v_{m+1}, x)v_1$ for $x \in V$. Then
$$ \omega(v_1, y) = \omega(v_1, x) + \omega(v_1, x)\omega(v_1, v_{m+1}) - \omega(v_{m+1}, x)\omega(v_1, v_1) = \omega(v_1, x) - \omega(v_1, x) = 0, $$
and similarly $\omega(v_{m+1}, y) = 0$. Thus $y \in V_2$, which implies $U_1 + V_2 = V$. We can now consider $\omega|_{V_2}$ and repeat this process using $V_2$ instead of $V$. By induction this gives us a basis $\{v_1, \ldots, v_{2m}\}$ for a subspace $V'$ of $V$ of dimension $2m$. If $\dim V = n > 2m$, we can extend this basis for $V'$ with $e_1, \ldots, e_{n-2m} \in V$ to get a basis of $V$. Since $\omega$ has rank $2m$, we have that $\omega(e_k, v) = 0$ for all $v \in V$. With respect to the basis $\{v_1, \ldots, v_{2m}, e_1, \ldots, e_{n-2m}\}$ the matrix representation of $\omega$ is given by
$$ (J)_{ij} = \begin{cases} 1, & \text{if } i = m + j \\ -1, & \text{if } j = m + i \\ 0, & \text{otherwise} \end{cases} $$
for $i, j = 1, \ldots, 2m$, and $(J)_{ij} = 0$ for $i$ or $j > 2m$.

We can use this theorem to derive two useful facts about symplectic vector spaces.

Corollary 1 Symplectic vector spaces are even dimensional.

Proof Let $(V, \omega)$ be a symplectic vector space. $\omega$ is non-degenerate, hence by Theorem 8 $\dim V = \operatorname{rank}\omega = 2m$ for some $m \geq 0$.

Corollary 2 A symplectic vector space $(V, \omega)$ has a basis $\{p_1, \ldots, p_m, q_1, \ldots, q_m\}$ s.t. $\omega$ has the standard form $\sum_{i=1}^m dp^i \wedge dq^i$.
Proof From $\dim V = 2m$ and the proof of Theorem 8, setting $p_i := v_i$ and $q_i := v_{m+i}$ gives us the required basis. Looking at $J$, the matrix representation of $\omega$ w.r.t. this basis, shows us that $\omega$ is of the form $\sum_{i=1}^m dp^i \wedge dq^i$.

We can use this corollary to construct a volume form on a symplectic vector space. By the above results it suffices to show that on a symplectic vector space $(V, \omega)$ of dimension $2m$, the 2-form $\omega$ defines a volume form $\omega^m = \omega \wedge \cdots \wedge \omega$, i.e. $\omega^m$ is a $2m$-form s.t. $\omega^m \neq 0$. Notice that
$$ \omega^m = \left( \sum_{i=1}^m dp^i \wedge dq^i \right) \wedge \cdots \wedge \left( \sum_{j=1}^m dp^j \wedge dq^j \right) = m!\,dp^1 \wedge dq^1 \wedge \cdots \wedge dp^m \wedge dq^m. $$
Hence $\omega^m \neq 0$, so it is a volume form on $V$.

Just as isometries between inner product spaces preserve the inner product, we define maps which preserve the symplectic structure of a vector space. Let $W$ be another finite dimensional $\mathbb{R}$-vector space.

Definition 6 Let $\eta$ be a 2-form on $W$, and $f \in L(V, W)$ a linear map from $V$ to $W$. The map $f$ is called symplectic if $f$ preserves the symplectic form, i.e. if for all $a, b \in V$, $(f^*\eta)(a, b) = \eta(f(a), f(b)) = \omega(a, b)$.

Thus the symplectic maps $f \in L(V, V)$ are those which satisfy $f^*\omega = \omega$. If we choose a basis of $V$ s.t. the matrix representation of $\omega$ has the form $J = \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}$, and let $A$ be the matrix representation of $f$, this condition becomes $A^TJA = J$. The set of all symplectic maps on $V$ forms a group under composition, called the symplectic group. It is shown to be a Lie group in Example 6.

4.2 Symplectic Manifolds

We will now generalise the concept of a symplectic vector space to that of a symplectic manifold. We will see how the phase spaces of Hamiltonian systems form symplectic manifolds, and are thus the natural space for Hamiltonian dynamics. Using the Lie derivative, an operation intrinsic to manifolds, we will show two important properties of the Hamiltonian flow: it preserves the Hamiltonian function, and it preserves the symplectic form on the manifold.
We begin by generalising symplectic forms from vector spaces to manifolds.

Definition 7 A differential 2-form $\omega$ on a smooth manifold $M$ is called a symplectic form if $\omega$ is closed, and for each $p \in M$, $\omega_p$ is a symplectic form on $T_pM$. The pair $(M, \omega)$ is called a symplectic manifold.

A symplectic manifold $(M, \omega)$ is even dimensional. This is due to $\omega_p$ being a symplectic form on $T_pM$ for $p \in M$: recall that $\dim T_pM = \dim M$, and that $T_pM$ is even dimensional by Corollary 1. A symplectic form $\omega$ on a manifold $M$ of dimension $2m$ also defines a volume form on $M$, by a similar argument: for $p \in M$, $\omega_p^m$ is a volume form on $T_pM$ by the discussion following Corollary 2. Therefore $\omega^m \neq 0$, implying $\omega^m$ is a volume form and that symplectic manifolds are orientable.

Consider a smooth manifold $N$ of dimension $n$. The tangent and cotangent bundles of $N$, $TN$ and $T^*N$ respectively, are both manifolds of dimension $2n$. Thus both satisfy the even dimensionality requirement and could be equipped with a symplectic form. The cotangent bundle is used as the phase space in the Hamiltonian formalism, while the Lagrangian formalism uses the tangent bundle. Therefore, in the rest of this section we will consider symplectic manifolds of the form $M := T^*N$. We write $q = (q^1, \ldots, q^n)$ for the local coordinates of $N$. This induces local coordinates $(p, q) = (p_1, \ldots, p_n, q^1, \ldots, q^n)$ on $M = T^*N$, where $p \in T_q^*N$ is written as $p = p_1\,dq^1 + \cdots + p_n\,dq^n$. With these coordinates we can equip $M$ with the symplectic form
$$ \omega_0 = \sum_{i=1}^n dp^i \wedge dq^i. $$
This can be derived in a coordinate-free way as the exterior derivative of the Liouville form (see [14, Example 17.4, pg. 193] or [9, Section 10.1, pg. 218]). Hence, it is called the canonical symplectic form. We now define the interior multiplication of a differential form, and use it to define the dynamics generated by a Hamiltonian, as well as the Lie derivative.

Definition 8 Let $X$ be a smooth vector field on a manifold $N$.
For $k \geq 2$, the interior multiplication or contraction of a $k$-form $\omega$ with $X$ is the $(k-1)$-form defined by $\iota_X\omega(X_2, \ldots, X_k) := \omega(X, X_2, \ldots, X_k)$. We define $\iota_X\omega = \omega(X)$ for a 1-form $\omega$, and $\iota_X\omega = 0$ for a 0-form $\omega$, i.e. a function.

Lemma 1 Let $X$ be a smooth vector field on a manifold $N$. Then $\iota_X \circ \iota_X = 0$.

Proof Let $\omega$ be a $k$-form on $N$. Then
$$ \iota_X \circ \iota_X\omega(X_3, \ldots, X_k) = \iota_X\omega(X, X_3, \ldots, X_k) = \omega(X, X, X_3, \ldots, X_k) = 0, $$
as $\omega$ is alternating.

Definition 9 The vector field $X_H$ defined by $\iota_{X_H}\omega := \omega(X_H, \cdot) = dH$, for a smooth function $H \colon M \to \mathbb{R}$ on a symplectic manifold $(M, \omega)$, is called the Hamiltonian vector field generated by $H$. The triple $(M, \omega, H)$ is called a Hamiltonian system.

This definition extends our previous definition of Hamiltonian systems on $\mathbb{R}^{2n}$.

Example 4 Consider the symplectic manifold $(M, \omega) = (T^*\mathbb{R}^n, \omega_0)$ with coordinates $(p, q)$, and the vector field
$$ X = \sum_{i=1}^n X_{p_i}\frac{\partial}{\partial p_i} + \sum_{i=1}^n X_{q_i}\frac{\partial}{\partial q_i}. $$
Given a Hamiltonian $H \colon M \to \mathbb{R}$, we determine the coefficients of $X$ s.t. it is the Hamiltonian vector field generated by $H$. Now,
$$ \iota_X\omega_0 = \sum_{i=1}^n dp^i \wedge dq^i(X, \cdot) = \sum_{i=1}^n X_{q_i}\,dp^i - X_{p_i}\,dq^i, $$
and
$$ dH = \sum_{i=1}^n \frac{\partial H}{\partial p_i}dp^i + \frac{\partial H}{\partial q_i}dq^i. $$
We can now solve $\iota_X\omega_0 = dH$ by comparing coefficients to see that
$$ X_{p_i} = -\frac{\partial H}{\partial q_i} \quad \text{and} \quad X_{q_i} = \frac{\partial H}{\partial p_i} $$
for $i = 1, \ldots, n$. This agrees with our previous definition of the Hamiltonian vector field (see Definition 3).

Another operation intrinsic to manifolds, and related to interior multiplication, is the Lie derivative of a differential form. Geometrically, it measures how the form changes along the vector field.

Definition 10 Let $X$ be a smooth vector field on a manifold $N$. The Lie derivative of a differential $k$-form $\omega$ along $X$ is the $k$-form defined by the Cartan homotopy formula $\mathcal{L}_X\omega := (\iota_Xd + d\iota_X)\omega$. It is well defined, since for a $k$-form $\omega$ the exterior derivative raises it to a $(k+1)$-form while interior multiplication lowers it again to a $k$-form.
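The coordinate computation of Example 4 can be checked numerically for the pendulum Hamiltonian on $T^*\mathbb{R}$. The sketch below is ours and follows the expansion of $\iota_X\omega_0$ written in Example 4 (sign conventions for contractions vary between texts, so the `contract` helper is an assumption tied to that expansion):

```python
import numpy as np

# Ideal pendulum on T*R: H(p, q) = p^2/2 - cos q, coordinates x = (p, q).
def grad_H(x):
    p, q = x
    return np.array([p, np.sin(q)])       # (dH/dp, dH/dq)

def X_H(x):
    dHdp, dHdq = grad_H(x)
    return np.array([-dHdq, dHdp])        # X_p = -dH/dq, X_q = dH/dp

def contract(X, v):
    """iota_X omega_0 applied to v = (v_p, v_q), using the expansion
    from Example 4: iota_X omega_0 = X_q dp - X_p dq."""
    return X[1] * v[0] - X[0] * v[1]

# Definition 9: iota_{X_H} omega_0 = dH, checked at a point on test vectors.
x = np.array([0.8, 1.1])
rng = np.random.default_rng(2)
for _ in range(5):
    v = rng.standard_normal(2)
    assert np.isclose(contract(X_H(x), v), grad_H(x) @ v)
```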
A smooth function $f: N \to \mathbb{R}$ is a 0-form, thus its Lie derivative along $X$ is
$$\mathcal{L}_X f = \iota_X df + d \iota_X f = df(X) = Xf,$$
which is the directional derivative of $f$ along $X$. Furthermore, the Lie derivative along $X$ interacts nicely with the time derivative of the flow of $X$.

Theorem 9 ([9, Theorem B.34, pg. 521]) Let $X$ be a smooth vector field on a manifold $N$, and let $\Phi: \mathbb{R} \times N \to N$ be the flow generated by $X$. Then for each $t \in \mathbb{R}$, every differential form $\omega$ on $N$ satisfies
$$\frac{d}{dt} (\Phi_t^* \omega) = \Phi_t^* \mathcal{L}_X \omega.$$
For more details about the Lie derivative or interior multiplication see [14, Section 20] or [9, Section B.5].

We have now built up enough machinery to prove two properties of the Hamiltonian flow.

Theorem 10 The flow $\Phi: \mathbb{R} \times M \to M$ of a Hamiltonian system $(M, \omega, H)$ preserves $H$.

Proof
$$\frac{d}{dt} H(\Phi_t) = \frac{d}{dt} \Phi_t^* H = \Phi_t^* \mathcal{L}_{X_H} H = 0,$$
since
$$\mathcal{L}_{X_H} H = \iota_{X_H} dH + d \iota_{X_H} H = \iota_{X_H} \iota_{X_H} \omega + 0 = 0$$
by Lemma 1, using $dH = \iota_{X_H} \omega$ and that $\iota_{X_H} H = 0$ as $H$ is a 0-form.

Just as we have symplectic maps between symplectic vector spaces, we also define structure preserving maps between symplectic manifolds.

Definition 11 Let $(M, \omega)$ and $(N, \eta)$ be symplectic manifolds of the same dimension. A smooth map $F: M \to N$ is called a symplectic map if $F^* \eta = \omega$. If $F$ is also a diffeomorphism, it is called a symplectomorphism.

In fact, $F^* \eta^k = \omega^k$ for each $k \in \{1, \ldots, \frac{1}{2} \dim M\}$, since the pullback distributes over the wedge product, i.e. $F^*(\alpha \wedge \beta) = F^* \alpha \wedge F^* \beta$.

Theorem 11 Let $(M, \omega, H)$ be a Hamiltonian system, and let $\Phi: \mathbb{R} \times M \to M$ be the flow generated by the Hamiltonian vector field $X_H$. Then for each $t \in \mathbb{R}$, $\Phi_t$ is symplectic.

Proof $\Phi_0 = \mathrm{id}_M$, which implies $\Phi_0^* \omega = \mathrm{id}_M^* \omega = \omega$. Further, using Theorem 9,
$$\frac{d}{dt} \Phi_t^* \omega = \Phi_t^* \mathcal{L}_{X_H} \omega = \Phi_t^* (\iota_{X_H} d\omega + d \iota_{X_H} \omega) = 0,$$
since $\omega$ is closed and $d \iota_{X_H} \omega = d\, dH = 0$.

This means that the flow of a Hamiltonian system not only preserves the symplectic form, but also its exterior powers $\omega^k$ for $k \in \{1, \ldots, \frac{1}{2} \dim M\}$. In particular it preserves the volume form, thus preserving phase space volume.
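Theorems 10 and 11 can be checked numerically in a simple case (a sketch for illustration, not part of the thesis): for the harmonic oscillator $H = (q^2 + p^2)/2$ on $T^*\mathbb{R}$, Hamilton's equations $\dot{q} = p$, $\dot{p} = -q$ have as exact flow a rotation of the $(q, p)$ plane, so both energy conservation and symplecticity can be verified directly.

```python
import numpy as np

# Exact flow of H = (q^2 + p^2)/2 in coordinates (q, p):
# q' = p, p' = -q, so Phi_t rotates the (q, p) plane.
def flow(t):
    return np.array([[np.cos(t), np.sin(t)],
                     [-np.sin(t), np.cos(t)]])

J = np.array([[0.0, -1.0], [1.0, 0.0]])
Phi = flow(0.7)

# Theorem 11: the flow is symplectic, Phi^T J Phi = J (here 2n = 2).
print(np.allclose(Phi.T @ J @ Phi, J))  # True

# Theorem 10: H is constant along the flow.
z0 = np.array([1.3, -0.4])              # initial point (q0, p0)
z1 = Phi @ z0
H = lambda z: 0.5 * (z[0] ** 2 + z[1] ** 2)
print(np.isclose(H(z0), H(z1)))         # True
```

For $2n = 2$ the condition $\Phi^T J \Phi = J$ is equivalent to $\det \Phi = 1$, which makes the volume-preservation statement above visible as well.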
5 Lie Theory And BCH Formula

Lie groups are manifolds with a group structure on them. Such manifolds are homogeneous in the sense that, locally, the manifold looks the same around any point. Thus we can study the manifold by examining a neighbourhood of the identity element. The tangent space at the identity of a Lie group carries a canonical bracket operation, transferred from the space of vector fields on the Lie group which are "compatible" with the group structure. This tangent space together with this bracket is the Lie algebra of that Lie group. Much of the information about the group is encoded in its Lie algebra, which is easier to study as it is a vector space. Some of this encoding of the group's structure in its Lie algebra can be seen in the BCH formula.

In this section, we will describe the basics of Lie groups and Lie algebras before discussing the BCH formula. We will use $M$ to denote a plain manifold, while $G$ will be used for Lie groups. We will use lower-case Gothic (Fraktur) letters for the Lie algebra of a Lie group, that is, we write $\mathfrak{g}$ for the Lie algebra of the Lie group $G$. More about Lie theory can be found in books such as [14, Chapter 4] or [8].

Definition 12 A Lie group is a smooth manifold $G$ which is also a group, such that the group operations
$$\mu: G \times G \to G;\ (a, b) \mapsto ab \quad \text{and} \quad \iota: G \to G;\ a \mapsto a^{-1}$$
are smooth.

Example 5 ($GL(n, \mathbb{R})$ is a Lie group) By definition,
$$GL(n, \mathbb{R}) = \{ X \in \mathbb{R}^{n \times n} \mid \det X \neq 0 \} = \det{}^{-1}(\mathbb{R} \setminus \{0\}).$$
As vector spaces, $\mathbb{R}^{n \times n}$ is isomorphic to $\mathbb{R}^{n^2}$, hence we give $\mathbb{R}^{n \times n}$ the topology of $\mathbb{R}^{n^2}$. This allows us to investigate the smoothness of the determinant function $\det: \mathbb{R}^{n \times n} \to \mathbb{R}$.

Given $A \in \mathbb{R}^{n \times n}$ with entries $a_{ij}$ for $i, j = 1, \ldots, n$, let $M_{ij}(A)$ be the $(n-1) \times (n-1)$ matrix obtained from $A$ by deleting the $i$-th row and $j$-th column of $A$. From linear algebra we know that the determinant of $A$ can be defined in terms of the determinants of these submatrices, i.e.
$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det M_{ij}(A), \quad \text{for any } i = 1, \ldots, n.$$
From this it is clear that $\det A$ is a polynomial in the entries $a_{ij}$ of $A$. Polynomials are smooth functions of their variables, thus $\det$ is smooth.

The set $\mathbb{R} \setminus \{0\}$ is an open subset of $\mathbb{R}$, thus $\det^{-1}(\mathbb{R} \setminus \{0\})$ is an open subset of $\mathbb{R}^{n \times n}$. Therefore $GL(n, \mathbb{R})$ is an $n^2$-manifold.

For $A, B \in GL(n, \mathbb{R})$ the entries of $AB$ are given by $(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$. Hence the multiplication map
$$\mu: GL(n, \mathbb{R}) \times GL(n, \mathbb{R}) \to GL(n, \mathbb{R});\ (A, B) \mapsto AB$$
is smooth, since each entry of $AB$ is a polynomial in the $a_{ij}$ and $b_{ij}$. Similarly, Cramer's rule gives that the entries of $A^{-1}$ are
$$(A^{-1})_{ij} = \frac{1}{\det A} \cdot (-1)^{i+j} \det M_{ji}(A).$$
The determinant is smooth and non-vanishing on $GL(n, \mathbb{R})$, thus the inverse map
$$\iota: GL(n, \mathbb{R}) \to GL(n, \mathbb{R});\ A \mapsto A^{-1}$$
is smooth.

Recall that the determinant satisfies $\det(AB) = \det(A) \det(B)$ and $\det(A^{-1}) = \det(A)^{-1}$. Thus $GL(n, \mathbb{R})$ is closed under $\mu$ and $\iota$. Therefore $GL(n, \mathbb{R})$, with matrix multiplication and inversion, is a Lie group of dimension $n^2$.

Example 6 ($Sp(2n, \mathbb{R})$ is a Lie group) The real symplectic group $Sp(2n)$ is given by
$$Sp(2n) = Sp(2n, \mathbb{R}) = \{ A \in GL(2n, \mathbb{R}) \mid A^T J A = J \},$$
where $J = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}$. Let $\mathrm{Alt}(2n)$ be the real vector space of skew-symmetric $2n \times 2n$ matrices, $A = -A^T$. Note $\dim \mathrm{Alt}(2n) = 2n^2 - n$, thus $\mathrm{Alt}(2n) \cong \mathbb{R}^{2n^2 - n}$. Consider the map
$$F: GL(2n, \mathbb{R}) \to \mathrm{Alt}(2n);\ A \mapsto A^T J A.$$
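The defining condition $A^T J A = J$, and the fact that $F(A) = A^T J A$ is always skew-symmetric (so that $F$ indeed maps into $\mathrm{Alt}(2n)$), are easy to check numerically. The following sketch (an illustration, not part of the thesis) uses numpy with $n = 2$; the block-shear matrix built from a symmetric block $S$ is a standard example of a symplectic matrix.

```python
import numpy as np

n = 2
I = np.eye(n)
Z = np.zeros((n, n))
J = np.block([[Z, -I], [I, Z]])  # the matrix J from Example 6

# Membership test for Sp(2n, R): A^T J A = J.
def is_symplectic(A, tol=1e-12):
    return np.allclose(A.T @ J @ A, J, atol=tol)

# A symplectic shear: for symmetric S, the block matrix [[I, S], [0, I]]
# satisfies A^T J A = J.
S = np.array([[2.0, 1.0], [1.0, 3.0]])
A = np.block([[I, S], [Z, I]])
print(is_symplectic(A))        # True

# F(B) = B^T J B is skew-symmetric for any B, since (B^T J B)^T = -B^T J B.
B = np.random.default_rng(0).standard_normal((2 * n, 2 * n))
FB = B.T @ J @ B
print(np.allclose(FB, -FB.T))  # True
```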