<<

’s Laws

I. The equation of motion

We consider the motion of a point under the influence of a gravitational field created by point mass that is fixed at the origin. Newton’s laws give the basic equation of motion for such a system. We denote by q(t) the of the movable point mass at time t, by m the mass of the movable point mass, and by M the mass of the point mass fixed at the origin. By Newton’s law on gravition, the exerted by the fixed mass on the movable mass is in the direction of the vector q(t). It is proportional to the product of the two and the inverse of the square of − the distances between the two masses. The proportionality constant is the G. In formuli q force = G mM − q 3 k k Newton’s second law states that

force = mass = m q¨ × From these two equation one gets q m q¨ = G mM − q 3 k k Dividing by m, q q¨ = µ (1) − q 3 k k with µ = GM. This is the basic equation of motion. It is an ordinary differential equation, so the motion is uniquely determined by initial point and initial . In particular q(t) will always lie in the linear subspace of IR3 spanned by the initial position and initial velocity. This linear subspace will in general have dimension two, and in any case has dimension at most two.

II. Statement of Kepler’s Laws

Johannes Kepler (1571-1630) had stated three laws about planetary motion. We state these laws below. (1643-1727) showed that these laws are consequences(1)

(1) For a discussion of the amount of rigour in Newton’s Principia, see [Pourciau]. In fact, Newton seems to have been more interested in the “direct problem” of finding basic laws consistent with the Kepler laws; see [Brackenridge].

1 of the basic equation of motion (1). We first state Kepler’s laws in an informal way, then discuss a more rigid formulation, and then give various proofs of the fact that (1) implies these laws.

Kepler’s first law: Let q(t) be a maximal solution of the basic equation of motion . Its is either an which has one focal point at the the origin, a branch of a which has one focal point at the origin, a whose focal point is the origin, or an open ray emanating from the origin.

Kepler’s second law (Equal areas in equal times): The area swept out by the vector joining the origin to the point q(t) in a given time is proportional to the time.

Kepler’s third law: The squares of the periods of the planets are proportional to the cubes of their semimajor axes.

We now comment on the terms used in the formulation of Kepler’s laws. The Picard Lindel¨of theorem about existence and uniqueness of solutions applies to the differential equation (1). It implies that for any solution q(t) defined on an interval I, there

is a unique maximal open interval I′ containing I such that the solution can be extended

to I′. Such a solution we call maximal. I′ may be the whole real axis or have boundary

points. If t0 is a boundary point then limt I′, t t0 q(t) = 0 or limt I′, t t0 q(t) = . ∈ → ∈ → | | 2 | | ∞ The orbit of the solution is by definition the set q(t) t I′ in IR . ∈  One of the standard definitions of is the following: Let F and F ′ be two points in the plane. Fix a length, say 2a, which is bigger or equal to the distance between F and

F ′. Then the curve consisting of all points P for which the sum of the distances from P to

F and from P to F ′ is equal to 2a is called an ellipse with focal points F , F ′. a is called the the semimajor axis of the ellipse.

Similarly, fix a length 2a > 0, which is smaller than the distance between F and F ′. Then the curve consisting of all points P for which the absolute value of the difference of the distances from P to F and from P to F ′ is equal to 2a is called a hyperbola with focal points F , F ′.

Finally, fix a point F and a g that does not contain F . Then the curve consisting of all points P for which the distance from P to the point F is equal to the distance from P to the line g is called a parabola with focal point F .

2 Ellipses, and make up the conic sections. There are many other ways to describe conic sections. see the appendix below and the references cited therein.

For two vectors v = (v1, v2) and w = (w1, w2) in the plane, we denote, by abuse of notation, by v w = v w v w × 1 2 − 2 1 the third component of the cross product of v and w. Its absolute value is the double of the area of the triangle spanned by the points 0, v and v + w.

Now let q(t), t (a,b) be a differentiable curve in the plane. Using Riemann sums, ∈ one sees that the area swept out by the vector joining the origin 0 to the point q(t) in the time between between t and t , a

Keplers second law states that there is a proportionality constant L such that t2 q(t) t1 × q˙(t) dt = L(t t ). By the fundamental theorem of calculus, this equivalent tRo saying 2 − 1 that q(t) q˙(t) = L (2) × is constant. Up to a constant depending on the mass of the particle, the quantity q(t) q˙(t) × is the vector of the particle with respect to the origin. Thus, Kepler’s second law is a consequence of the principle of conservation of angular momentum. When the orbit of the solution is an ellipse, we talk of planetary motion. In this case it follows from Kepler’s second law that the motion is periodic. The period is the minimal T > 0 such that q(t + T ) = q(t) for all t IR. The precise form of the third law is, that ∈ 2 2 T 4π a3 = µ

where a is the major semiaxis of the ellipse.

III. Proofs of Kepler’s Laws

The proof of Kepler’s second law is straightforward. By (1)

d µ dt q(t) q˙(t)=q ˙(t) q˙(t) + q(t) q¨(t)=0 q(t) 3 q(t) q(t)=0 × × × − | | × so that q(t) q˙(t) is constant. As discussed in the previous section, this is the content of × Kepler’s second law.

3 Another conserved quantity is the total

1 2 µ E = 2 q˙ q (3) k k − k k Indeed,

d 1 2 µ d 1 µ µ µ q˙ = q˙ q˙ 1 2 =q ˙ q¨ + 3 2 q q˙ =q ˙ q¨ + 3 q =0 dt 2 k k − q dt 2 · − (q q) / · (q q) / · q k k  ·  · k k  by Kepler’s equation (1). The proof of the first law is not as obvious as that of the second law. We give several proofs, always using Kepler’s second law and conservation of energy. Let

L = q q˙ × be the constant of the second law (the angular momentum). We assume from now on that L = 0; otherwise one has motion on a ray emanating from the origin. 6 Polar coordinates

Kepler’s equation is rotation symmetric. Therefore it is a natural idea to use polar coordinates in the plane where the motion of the particle takes place. Without loss of generality we may assume that this is the (q , q ) plane. The polar coordinates r, ϕ of a 1 2 − point are defined by

2 2 q1 = r cos ϕ, r = q1 + q2 q q2 q2 = r sin ϕ, tan ϕ = q1 We first express the absolute value of the angular momentum L and the energy in polar coordinates. Observe that q q˙ = q q˙ q q˙ k × k | 1 2 − 2 1| = rr˙ cos ϕ sin ϕ + r2ϕ˙ cos2 ϕ rr˙ sin ϕ cos ϕ + r2ϕ˙ sin2 ϕ − 2 = r ϕ˙ | | Without loss of generality we assume from now on thatϕ ˙ > 0. So

2 L L = r ϕ,˙ or, equivalently, ϕ˙ = k 2k (4) k k r Also observe that

2 d 2 d 2 q˙ = dt r cos ϕ + dt r sin ϕ k k 2 2 = r˙ cos ϕ rϕ˙ sin ϕ + r˙sin ϕ + rϕ˙ cos ϕ − (5) =r ˙2 cos2 ϕ + r2ϕ˙2 sin2 ϕ + r ˙2 sin2 ϕ + r2ϕ˙2 cos2 ϕ =r ˙2 + r2ϕ˙ 2

4 By (5) and (4) the total energy (3) is

2 1 2 µ 1 2 2 2 µ 1 2 L µ E = q˙ = r˙ + r ϕ˙ = r˙ + k 2k 2 k k − r 2 − r 2 r − r   by (4) so that 2 2 2µ L r˙ =2E + k 2k (6) r − r (6) is a differential equation for r as a function of t. It is difficult to solve explicitly and we do not do this here. Instead we derive a differential equation for the angle ϕ as a function of the radius r. Observe that

dϕ dϕ dt ϕ˙ L dr = dt dr = r˙ = kr2 rk˙ by (4). Inserting (6) gives

dϕ L L 1 = k k = k k 2 = dr 2 2 2 2 2 2µ kLk 2 µ kLk µ 2 e 1 1 r 2E+ 2 r 2E+ 2 r 2 r − r kLk − r − kLk l − r − l q q  q  with 2 2 2E L L e = 1 + k2 k , l = k k (7) q µ µ Therefore

dr 1 l 1 d l 1 ϕ = = dr = − dr 2 2 2 2 2 dr er e Z 2 e 1 1 Z l 1 er Z l 1 − r 2 1 1   l − r − l − er − e − er − e  q  q  q  Thus, with an integration constant ϕ0

(ϕ ϕ ) = arccos 1 l 1 − − 0 e r −  or l r = 1+e cos(ϕ ϕ0) (8) − This is the equation of a with eccentricity e. See Appendix A below. If the energy E is negative, the eccentricity e is smaller than one and we have an ellipse. Similarly, we for E = 0 we have e = 1 and the conic section is a parabola. Finally, if E > 0, we have e > 1 and the conic section is a hyperbola. This ends the proof of Kepler’s first law using polar coordinates. Observe that we used only the equations (4) and (6) of conservation of angular momentum and of energy.

Using the formuli from Appendix A, we see that in the case of ellipses (E < 0) the major axis is l µ a = 1 e2 = 2 E (9) − | | 5 2 2 2 E L since 1 e = | |k2 k *. The minor axis is − µ

2 2 µ 2 E L L L b = a 1 e = | |k2 k = k k = √a k k (10) 2 E µ √2 E √µ p − | | · q | |

3/2 L k k Therefore the area of the ellipse is πab = πa √µ . Now let T be the period of the orbit. By Kepler’s second law the the area swept out after time T is T 1 1 2 q(t) q˙(t) dt = 2 L T Z0 k × k k k On the other hand this is equal to the area of the ellipse. Therefore we get

1 3/2 L L T = πa k k 2 k k √µ or 2π 3/2 T = √µ a This is Kepler’s third law.

The previous proof of Kepler’s first law is based on the conservation laws for energy and angular momentum and on the idea of writing the angle ϕ as a function of the radius 1 r. In a variant of this proof one writes σ = r as a function of ϕ. By the change of variables formula, (4) and (5)

2 2 2 2 2 2 r˙ 2 2 L dr 2 2 2 1 dr 2 1 q˙ =r ˙ + r ϕ˙ =ϕ ˙ ) + r = k 4k ( ) + r = L ( 2 ) + 2 k k ϕ˙ r dϕ k k r dϕ r 2 d 1 2 1    = L ( ) + 2 k k dϕ r r  Then by (3) 2 1 2 µ L dσ 2 2 E = q˙ = k k ( ) + σ µσ 2 k k − q 2 dϕ − k k  Differentiating with respect to ϕ and dividing by L 2 gives k k

2 dσ d σ µ 0 = 2 + σ 2 dϕ dϕ − L k k  dσ Since dϕ = 0 this implies 6 2 d σ µ dϕ2 + σ L 2 =0 − k k

* It is remarkable that a depends only on the energy. So all bounded with the same energy are ellipses with the same length of the major axis, but of course their position in space is ddifferent because the eccentricity then varies with kLk

6 The general solution of this differential equation is

µ σ = 2 1 + e cos(ϕ ϕ ) L − 0 k k  2 1 L with integration constants e, ϕ0. Remembering that r = σ and setting l = k µk as in (7) we get again the equation l r = 1+e cos(ϕ ϕ0) − for the orbit. In this approach e just comes as an integration constant, but now it can easily be identified with the quantity of (7).

The Laplace Lenz Runge vector

The basic equation of motion is a second order differential equation. Its solution is completely detemined by the position q and velocityq ˙ at any given time t. Therefore one expects that the quantities characterizing the orbits can be expressed in terms of q andq ˙. For the eccentricity of the conic section this is done in (7), since E andLare expressed in terms of q andq ˙ in (3) and (2). Similarly, in the case of ellipses, the length of the major and minor axis are described in terms of q andq ˙ by (9) and (10). If one considers the as a problem in three dimension, the plane of the orbit is determined as the plane through the origin perpendicular to the angular momentum vector L. These data determine the Kepler ellipse up to rotation around the origin (the integra-

tion constant ϕ0 of the previous subsection. To fix this ambiguity we would like to find a vector in the direction of the major axis that can be expressed purely in terms of q andq ˙. The standard choice is the Laplace Lenz Runge vector

A = q + 1 q˙ L = 1 q˙ 2 µ q 1 (q q˙)q ˙ = 1 2E + µ q 1 (q q˙)q ˙ (11) − q µ × µ k k − q − µ · µ q − µ · k k k k  k k  Here we used the identity x (y z) = (x y) z +(x z) y for vectors x,y,z IR3 to see × × − · · ∈ thatq ˙ L =q ˙ (q q˙) = (q q˙)˙q + q˙ 2q , and, in the second step, the definition (3) of × × × − · k k L the energy. Observe that, for a vector v in the “invariant plane” orthogonal to L, v L o 1 × k k is the vector in this plane obtained from v by rotating by 90 . So L q˙ L is the vector k k × obtained fromq ˙ by rotating by 90o. First we verify that A really is constant during a Kepler motion. Using the basic equation of motion (1) and the definition of L

dA 1 q q˙ 1 1 q q˙ 1 dt = q q˙ + q· 3 q + µ q¨ L = q q˙ + q· 3 q q 3 q (q q˙)=0 − k k k k × − k k k k − k k × × d 1 d 1/2 3/2 We also used the fact that dt q = dt (q q)− = (q q˙)(q q)− and again the identity k k · − · · x (y z) = (x y) z +(x z) y . × × − · · 7 Next we claim that the length of the Laplace Lenz Runge vector is equal to the eccentricity e. For this reason A is called the in [Cushman]. To prove the statement about the length of A observe that 2 1 2 A A =1 q µ q (˙q L) + µ2 q˙ L · − k k · × k × k Sinceq ˙ and L = q q˙ are perpendicular, q˙ L 2 = q˙ 2 L 2 . By the standard vector × k × k k k k k identity x (y z) = z (x y) · × · × q (˙q L) = L (q q˙) = L L = L 2 (12) · × · × · k k Therefore 2µ 2 1 2 2 A A =1 2 L + 2 q˙ L · − µ q k k µ k k k k k 2k L 2 2µ = 1 + k 2k q˙ 2 µ k k − µ q 2 k k L 2  =1+2 kµ2k E = e by (3).

Before using the Laplace Lenz Runge vector, we describe how one could get the idea to consider it, once one already knows Kepler’s laws*. Let us consider the case of negative energy, so that the orbits are ellipses. A natural invariant of the ellipse is the vector joining the two foci. The origin is one of the ellipse, call the other one f. It is known that for each point q of the ellipse, the lines joining q to the origin and q to the other focus f form opposite equal angles with the tangent line of the ellipse. See Appendix A. Therefore a vector from q in the direction of f is obtained by adding to q twice the orthogonal − projection of q to the tangent direction of the ellipse at q.

q˙ A unit tangent vector to the ellipse at the point q is q˙ . Therefore f q has the q˙ q˙ k k − direction q +2 q q˙ q˙ . This means that there is a scalar function α(t) such that − · k k k k  2q(t) q˙(t) f = q(t) + α(t) q(t) + q˙(t·) 2 q˙(t) (13)  − k k  for all t. Differentiating and using Kepler’s law (1) gives q q˙ (˙q q˙+q q¨) (q q˙)2(˙q q¨) 2q q˙ 0 = f˙ =q ˙ +α ˙ q +2 · 2 q˙ + α q˙ +2 · 2· · 4 · q˙ + · 2 q¨ − q˙ − q˙ − q˙ q˙ k k  h k k k k  k k i 1 2 2 2 q q q˙ q˙ q = αq˙ + q˙ 2 q˙ +2˙α(q q˙) + α q˙ +2 q˙ +2q ( µ q 3 ) 2 q˙· 2 ( µ q· 3 ) q˙ − k k nk k · −k k k k · − k k − k k − k k o q q˙ q   +2α q˙· 2 µ q 3 k k − k k q q˙  1 2 q q˙ 1 2 µ = α˙ +2αµ 2· 3 q + 2 q˙ + 2(q q˙) α˙ +2αµ 2· 3 +2α q˙ q˙ − q˙ q q˙ k k · q˙ q 2 k k − q  k k k k  k k n  k k k k   k k o * For other arguments, see [Heintz] and [Kaplan]. Remarks on the history of this vector can be found in [Goldstein].

8 Since q andq ˙ are linearly independent, this implies that the coefficients of q andq ˙ are both zero. From the coefficient of q we get

q q˙ α˙ +2αµ q˙ 2· q 3 =0 k k k k 1 2 µ Inserting this, and the fact that 2 q˙ q = E into the coefficient ofq ˙ gives k k − k k q˙ 2 +2αE =0 k k 2 q˙ So α = k k . Inserting this into (13) gives − 2E 2 q˙ 2q q˙ f = q k2Ek q + q˙ · 2 q˙ − − k k  2  1 q˙ = E + k k q (q q˙)q ˙ E 2 − ·    (14) 1 2 µ = E q˙ q q (q q˙)q ˙  k k − k k − ·  µ  = E A

µ The argument started with the observation that f should be a conserved quantity. As E is conserved, this suggests that A is a conserved quantity. Above, we have proven this µ directly. Observe from (9), that in the case of ellipses, E is twice the major semiaxis a of | | the ellipse. This is consistent with the fact that the distance between the two foci is 2ea.

Once one knows that the Laplace Lenz Runge vector and the angular momentum vector are constants of the Kepler motion, the proof of Kepler’s first law is relatively fast. As

q A = q q + 1 q˙ L = q + 1 q (˙q L) = q + 1 L (q q˙) = q + 1 L 2 · · − q µ × −k k µ · × −k k µ · × −k k µ k k k k  we have, setting again e = A k k q = e 1 L 2 q A k k eµ k k − · A k k  The expression in brackets is the distance from q to the line perpendicular to A through 2 L the point ke2kµ A. If we call this line the directrix, then the equation above states that the ratio of the distance between q and the origin and the distance between q and the directrix is equal to e. As pointed out in Appendix A, this is one of the characterizing properties of conic sections.

Exercise: In the case of negative energy, show that q f + q is constant! Here, f is k − k k k the vector of (14).

9 Taking the cross product of the Laplace Lenz Runge vector A with the angular mo- mentum vector L gives

1 1 1 1 2 L A = q L ( q) + µ L (˙q L) = q q L + µ L q˙ (15) × k k × − × × k k × k k using the vector identity x (y z) = (x y) z +(x z) y and the fact thatq ˙ and L are × × − · · perpendicular. Therefore

µ q µ µ q q˙ = 2 L A L = 2 L A + 2 L L × − q × L × L × q k k k k  k k k k k k This equation determines the velocity vectorq ˙ of the Kepler motion as a function of the µ q position q. Observe that L 2 L A is independent of t and that q is always a vector k k × µ q k k µ of length one perpendicular to L. So L 2 L q is always a vector of length L in k k × k k k k the plane perpendicular to L. Consequently, for a fixed , the velocity vectors µ µ all lie on the circle around the point L 2 L A of radius L . This circle is called the k k × k k momentum hodograph.

A momentum space argument

In this subsection, we prove Kepler’s first law, starting with the analysis of the hodo- graph* , that is the curve traced out by the momentum vector

p(t)=q ˙(t)

The equation of motion (1) is q p˙ = µ − q 3 k k so that µ 3µq q˙ p¨ = q˙ + ·5 q − q 3 q k k k k Consequently 2 2 µ µ p˙ p¨ = q 6 q q˙ = q 6 L × k k × k k The standard formula for the curvature of a shows that the curvature of the hodograph at the point q is

2 6 p˙ p¨ µ L q L κ = k p×˙ 3k = kq kk6µ3k = kµk k k k k

* Another derivation of the equation of the hodograph, attributed to Hamilton, can be found in [Han- kins], ch.24.

10 This proves that the curvature of the hodograph is constant. So it is a circle. Its radius 1 µ is κ = L . At each of its points p, the vector pointing to the center u of the circle is k k µ perpendicular to L and to the tangent vectorp ˙. Its length is the radius L . So k k µ p˙ L µ p u = L p˙ L = q L 2 q L (16) − k k k k × k k − k kk k × and our argument shows that

µ u = p + q L 2 q L (17) k kk k × ( ) is a conserved quantity ∗ . The hodograph has the equation

µ p u = L k − k k k µ µe So the hodograph is the circle around u of radius L . Observe that u = L where 2 k k k k k k 2E L e = 1 + k2 k as in (7), since q µ 2 2 2 µ 2 µ u = p + 2 4 q L +2 2 p (q L) k k k k q L k × k q L · × k k2 k k k kk k 2 µ µ = p + 2 2 2 L (p q) k k L − q L · × k 2k k kk k 2 µ µ = p + 2 2 k k L − q k2 k k k µ µe 2 =2E + L 2 = L k k k k  Here we used, as in (12), that p (q L) = L (q p) = L 2 R . · × − · × −k k We now describe the position q in terms of the momentum p. By (16) and the fact that q is perpendicular to q L, we have q (p u)=0. So q is perpendicular to (p-u) × · − and therefore it is a multiple of (p u) L. Write − × (p u) L r q = r (p−u)×L = µ (p u) L with r = q (18) k − × k − × ±k k As pointed out in (12), q (p L) = L 2 . If we insert the representation of q above, we · × k k get r (p u) L (p L) = L 2 µ − × · × k k L  o Cross product by L corresponds to rotation by 90 around the axis through the origin k k in direction of L. Therefore this implies µ = r (p u) p − · = r (p u) (p u)+(p u) u (19) 2− · 2 − − · µ µ e (p u) u  = r 2 + 2 − L L p u · u k k k k k − k k k  (∗) µ From (15) one sees that u = 2 L × A, where A is the Laplace Lenz Runge vector of the previous kLk A 1 u L u subsection. Conversely, = µ × . One can prove directly that is a conserved quantity and then derive the Laplace Lenz Runge vector using this formula.

11 µ µe Here we used that p u = L and that u = L . k − k k k k k k k Denote the angle between q and the (constant) vector u L by ϕ. By (18) this is the × angle between (p u) L and u L. Again, as cross product by a fixed vector does not − × × change angles, φ also is the angle between p u and u. Now (19) gives − l r = 1+e cos ϕ

2 L with l = k µk as in (7). So we obtain again the equation of the conic as in Appendix A.

A purely geometric proof of Kepler’s first law using the hodograph was given by Hamilton, Kelvin and Tait, Maxwell, Fano and Feynman independently; see Feynman’s lost lecture [Goodstein]. It first describes the hodograph in geometric terms and then deduces the Kepler orbit as the enveloppe of its tangent lines. This argument is in parts very close to Newton’s original argument. There is a debate whether Newton’s original argument meets the 21st century standards of rigour. For references see the introduction of [Derbes]. [Brackenridge] provides a guide and historical perspective on Newton’s treatment of the Kepler problem.

The

In the proofs of Kepler’s first law given above we derived the shape of the orbit, but did not actually write down a parametrization by the time t. Such a parametrization would correspond to a solution of the basic equation of motion (1). However, (8) is a parametrization of the orbit in terms of the angle ϕ ϕ (also called the , see − 0 Appendix A), and conservation of angular momentum (4) implicitly gives the dependence dϕ L of ϕ on t by the differential equation dt = kr2k . To simplify the discussion we put ϕ0 = 0. We consider the case of negative energy, that is, the case of bounded orbits. Write

2 E = ε − 2 In this case the orbit is described by the equation

2 2 (q1+ea) q2 a2 + b2 =1 where the principal axes a,b of the ellipse and the eccentricity e are determined by (9), (10) and (7) respectively. We use the parametrization

q = a cos s ea , q = b sin s 1 − 2 by the eccentric anomaly s (see Appendix A). To get the dependence of s on the time t, observe that by formula (30) of the appendix and (7)

ds √1 e2 √ 2E ε dϕ = −l r = −L r = L r k k k k 12 dϕ L As observed in (4), dt = kr2k . Therefore, by the chain rule

ds ε dt = r (20)

By (9), ε2 =2 E = √µ . In formula (31) of the appendix, we show that r = a(1 e cos s). | | a − Hence (20) gives ds √µ dt = a3/2(1 e cos s) − or, equivalently 3 2 dt a / (1 e cos s) − ds = √µ Integrating both sides with respect to s gives Kepler’s equation

3/2 t t = a (s e sin s) (21) − 0 √µ − Kepler’s equation is an implicit equation for s as a function of t. It is not a differential equation anymore, but just an equation that involves the inversion of the ”elementary function” s s e sin s . This inversion cannot be performed by elementary functions. 7→ − See, however, Appendix E. One of the advantages of the eccentric anomaly is, that it is well suited for a description of the Kepler motion in position space. Therefore we derive the equation of motion with respect to this parameter. We make the change of variables (20) in the basic equation of motion (1). Recall that r = q . Then k k dq dt q ε dq ds =q ˙ ds = kεk q˙ or, equivalentlyq ˙ = q ds (22) k k Therefore, using (1)

2 2 d q 1 d q q dq˙ 1 d q dq q 1 d q dq µ ds2 = ε kdsk q˙ + kεk ds = q kdsk ds + kε2k q¨ = q kdsk ds ε2 q q k k k k − k k (23) 1 dq dq µ = q 2 (q ds ) ds ε2 q q k k · − k k Once one knows that (20) is a good change of variables for the Kepler problem and that the Laplace Lenz Runge vector is a conserved quantity, one can give a quick proof of Kepler’s first law: By the definition of ε = √ 2E , (11) and (22), the Laplace Lenz Runge vector is − 2 1 2 µ 1 ε 1 dq dq µ A = ε + q (q q˙)q ˙ = 2 q 2 q + q µ − q − µ · − µ q · ds ds − ε q k k   k k  k k  since ε2 = 2E . Therefore (23) gives − 2 d q µ 2 + q = 2 A ds − ε 13 The general solution of this second order inhomogenuous linear differential equation with constant coefficients is q(s) = C cos s + C sin s aA (24) 1 2 − µ with constant vectors C1, C2 and a = ε2 . This is the parametrization of an ellipse. We set A = e. To identify C and C we use the fact that, by (22), k k 1 2 q L = ε q dq = ε C C + aA C sin s aA C cos s k k × ds 1 × 2 × 1 − × 2  Since the functions 1, sin s and cos s are linearly independent over IR, this implies that the vectors C C , A C and A C are all proportional to L. Consequently C , C and 1 × 2 × 1 × 2 1 2 A lie in one plane. We may assume (by replacing s by s s and modifiying C , C ) that − 0 1 2 q(0) points in the direction of A, in other words, that C1 and A are collinear. Then the equation above gives q L = ε (C C aA C cos s) (25) k k 1 × 2 − × 2 By (24) q 2 = C 2 cos2 s + C 2 sin2 s + e2a2 k k k 1k k 2k +2C1 C2 sin s cos s 2aA C1 cos s 2aA C2 sin s · − · − · (26) =( C 2 C 2) cos2 s +( C 2 + e2a2) k 1k −k 2k k 2k +2C C sin s cos s 2aA C cos s 2aA C sin s 1 · 2 − · 1 − · 2 The square of the norm of the right hand side of (25) is a linear combination of 1, cos s and cos2 s. As the functions cos2 s , sin s cos s, sin s, cos s and 1 are linearly independent over IR, the fact that this square of the norm of the right hand side of (25) is equal to q 2 L 2 implies that the coefficients of sin s cos s and sin s in (26) are zero. That is, C k k k k 2 is perpendicular to C1 and A. So

q 2 =( C 2 C 2) cos2 s +( C 2 + e2a2) 2ae C cos s k k k 1k −k 2k k 2k − k 1k Taking the square of the absolute values of both sides of (25) now gives

L 2 ( C 2 C 2) cos2 s +( C 2 + e2a2) 2ea C cos s k k k 1k −k 2k k 2k − k 1k =ε2 C 2 C 2 2ae C C 2 cos s + a2e2 C 2 cos2 s  k 1k k 2k − k 1kk 2k k 2k  L Equating the coefficients of cos s gives C = k k and our identity gives k 2k ε ( C 2 C 2) cos2 s +( C 2 + e2a2) = C 2 + a2e2 cos2 s k 1k −k 2k k 2k k 1k and hence C 2 = C 2 + e2a2 and q = C ea cos s. The equation for the energy, k 1k k 2k k k k 1k − combined with (22) gives 2 2 ε 1 ε dq 2 µ 2 = 2 q 2 ds q − k k k k − k k 14 or equivalently ε2 dq 2 + q 2 =2µ q k ds k k k k k  Inserting we get

ε2 C 2 sin2 s + C 2 cos2 s + C 2 2ea C cos s + e2a2 cos2 s =2µ( C ea cos s) k 1k k 2k k 1k − k 1k k 1k −  or 2 µ 2 C 2ea C cos s =2 2 ( C ea cos s) k 1k − k 1k ε k 1k − µ Therfore C = 2 = a . k 1k ε In the parametrization (24) all orbits have period 2π. This is even true for the orbits with zero angular momentum (in this case C1 and C2 are linearly dependent), in contrast to the true Kepler flow where the point mass crashes into the origin. One says that the eccentric anomaly ”regularizes the collisions”.

IV. The Hamiltonian point of view

Hamiltonian vectorfields

Definition. Let H(q, p) be a differentiable function on IRn IRn. The system of first order × differential equations

∂H ∂H q˙i = , p˙i = , for i =1, ,n ∂pi − ∂qi ··· is called the Hamiltonian system associated to the Hamiltonian function H.

Observe that the Hamiltonian vectorfield does not change when one adds a constant to the Hamiltonian.

Example 1 (Kepler Hamiltonian)

The Hamiltonian system of the Hamiltonian

1 2 µ K(q, p) = 2 p q k k − k k is qi q˙i = pi , p˙i = µ q 3 − k k Differentiating the first equations once and inserting the second gives

qi q¨i = µ q 3 − k k This is the basic equation (1) for the Kepler flow.

15 Example 2 (Harmonic Oscillator)

H(q, p) = 1 p 2 + q 2 2 k k k k  The corresponding Hamiltonian system

q˙ = p , p˙ = q i i i − i gives the second order system

q¨ = q for i =1 n i − i ··· This is the harmonic oscillator. More generally, whenever one has the motion of a particle on IRn under the influence of a potential V (q), the corresponding flow is described by the Hamiltonian

H(q, p) = 1 p 2 + V (q) 2 k k Indeed, the corresponding Hamiltonian system is

q˙ = p , p˙ = V (q) −∇ so that, as above,q ¨ = V (q) . In this description, q is the position variable and p is the −∇ momentum variable. The relation between the Hamiltonian formalism and the Lagrange formalism is given by the Legendre transform, see for example [Arnold] ch.15.

Example 3 (Regularized Kepler Hamiltonian)

Let ε = 0. The system of the Hamiltonian 6 K˜ (q, p) = 1 q ( p 2 + ε2) µ 2ε k k k k − ε is 1 q˙ = ε q p k k (27) 1 q 2 2 1 ˜ µ p˙ = 2ε q ( p + ε ) = q 2 K(q, p) + ε q − k k k k − k k ε  By the first equation, p = q q˙ and k k 1 d q 1 1 q q˙ ε 1 1 ˜ µ q¨ = ε kdtk p + ε q p˙ = ε q· q q˙ ε q q 2 K(q, p) + ε q k k k k k k − k k k k 1 1  = 2 (q q˙)q ˙ 2 εK˜ (q, p) + µ q q · − ε q k k k k  Its restriction to the level set (q, p) K˜ (q, p)=0 is the flow (23) of the Kepler problem  parametrized by the eccentric anomaly. 16 Example 4 (Geodesics)

As in Appendix C, let G(q) = gab(q) a,b=1, ,n be a Riemannian metric on an open subset U of IRn. We claim that the geodesics are··· described by the Hamiltonian

n 1 1 1 ab H(q, p) = 2 p⊤G− (q) p = 2 g (q) papb a,bX=1 Indeed, the associated Hamiltonian system is

1 q˙ = G− p 1 1 ∂G− p˙a = p⊤ p for a =1, ,n − 2 ∂qa ··· 1 1 1 ∂G 1 ∂G− ∂G− 1 ∂G 1 Since G G− = 1l , we have G− + G =0 so that = G− G− . We ∂qa ∂qa ∂qa − ∂qa insert this into the equation above and get the equations

G q˙ = p 1 ∂G 1 ∂G p˙a = (G p)⊤ (G p) = q˙⊤ q˙ for a =1, ,n 2 ∂qa 2 ∂qa ··· Differentiating the first equation gives G q¨ =p ˙ G˙ q˙ and therefore, as in ???? − n (G q¨) =p ˙ g˙ q˙ a a − ab b bP=1 n n 1 ∂gbc ∂gab = q˙bq˙c q˙bq˙c 2 ∂qa − ∂qc b,cP=1 b,cP=1 n 1 ∂gbc ∂gab ∂gca = q˙bq˙c 2 ∂qa − ∂qc − ∂qb  b,cP=1  and therefore the equation n a q¨a + Γbc q˙bq˙c =0 b,cX=1 for geodesics; see equation (37) below.

A vectorfield X on an open subset U of IRm is a map that associates to point x U a ∈ tangent vector X(x). The tangent vector lies in IRm, viewed as tangent space to U in the point x. An integral curve to the X is a differentiable curve t x(t) whose derivative 7→ at each point is given by the vectorfield, that isx ˙(t) = X x(t) for all t. In this sense, ordinary differential equations on U are the same as vectorfields  on U. Assume that the vectorfield has the components X , ,X . That is 1 ··· m X(x) = X (x), X (x) 1 ··· m  17 Then we also write ∂ ∂ X = X1 + + Xm ∂x1 ··· ∂xm The reason for this notation is the following. Let ϕ(x) be any function on U, and t x(t) 7→ an integral curve. Then, by the chain rule

d ∂ϕ ∂ϕ ϕ x(t) = X1 + + Xm dt ∂x1 ··· ∂xm  The resulting function is the directional derivative of ϕ in direction X and is denoted by

X(ϕ) or LX ϕ. If X is a vectorfield with continuous coefficients, then for each point x U there exists ∈ a unique integral curve t Xt(x) with X0(x) = x. The map (t,x) Xt(x) is called 7→ 7→ the flow of the vectorfield. Clearly, Xs+t(x) = Xs Xt(x) . So, if all integral curves are defined for all times, the flow defines an action of the additive group of real numbers on U. For a Hamiltonian function H(q, p), the vectorfield associated to its Hamiltonian sys- tem is the Hamiltonian vectorfield

n n ∂H ∂ ∂H ∂ XH = ∂pi ∂qi − ∂qi ∂pi iP=1 iP=1 The Hamiltonian system is the flow of this vectorfield.

The Poisson Bracket

Definition. The Poisson bracket of two differentiable functions F (q, p), G(q, p) on IRn × IRn is n F, G = ∂F ∂G ∂G ∂F { } ∂qi ∂pi − ∂qi ∂pi iP=1  Obviously In particular, F, G = G, F . We say that F and G are in involution if { } −{ } F, G =0. { }

By the remarks of the previous section, F, G = X (F ) = X (G). In particular, { } G − F F and G are in involution if and only if G is a conserved quantity for the Hamiltonian system with Hamiltonian F . Observe that, for any Hamiltonian H, the Poisson bracket H,H vanishes. So the Hamiltonian is always a conserved quantitiy for its flow. { }

1 2 µ Example 5 As above, let K(q, p) = 2 p q be the Kepler Hamiltonian. We know k k − k k that the the components of the angular momentum vector L(q, p) = q p are conserved × quantities for the associated flow. Therefore they must be in involution with the Hamilto- nian.

18 Example 6 The components of the angular momentum are not in involution among each other. In fact

L ,L = L , L ,L = L , L ,L = L (28) { 1 2} 3 { 2 3} 1 { 3 1} 2 We verify the first equation:

L ,L = q p q p , q p q p { 1 2} 2 3 − 3 2 3 1 − 1 3 = q p , q p + q p , q p q p , q p q p , q p { 2 3 3 1} { 3 2 1 3}−{ 3 2 3 1}−{ 2 3 1 3} = q p + p q = q p q p = L − 2 1 2 1 1 2 − 2 1 3

Example 7 Since the Laplace Lenz Runge vector

A(q, p) = q + 1 p L = 1 p 2 µ q 1 (q p) p − q µ × µ k k − q − µ · | | k k  is a conserved quantity for the Kepler flow, its components are in involution with the Hamiltonian K. On the other hand L ,A = L ,A = L ,A =0 { 1 1} { 2 2} { 3 3} L ,A = A ,L = A { 1 2} { 1 2} 3 L ,A = A ,L = A { 2 3} { 2 3} 1 L ,A = A ,L = A { 3 1} { 3 1} 2 and 2E 2E 2E A ,A = 2 L , A ,A = 2 L , A ,A = 2 L { 1 2} − µ 3 { 2 3} − µ 1 { 3 1} − µ 2

Proof: In the proof we shall use the general identity

F, G G = G F, G + G F, G { 1 2} 1{ 2} 2{ 1} and the preliminary calculations that

L , q = L , q = L , q =0 { 1 1} { 2 2} { 3 3} L , q = q ,L = q , L , q = q ,L = q , L , q = q ,L = q { 1 2} { 1 2} 3 { 2 3} { 2 3} 1 { 3 1} { 3 1} 2 L , p = L , p = L , p =0 { 1 1} { 2 2} { 3 3} L , p = p ,L = p , L , p = p ,L = p , L , p = p ,L = p { 1 2} { 1 2} 3 { 2 3} { 2 3} 1 { 3 1} { 3 1} 2 that for i =1, 2, 3

1 Li, q =0 , (q p), qi = qi , (q p), pi = pi { k k } { · } − { · } 19 and that 2 2 1 1 (q p), p =2 p , (q p), q = q { · k k } k k { · k k } k k Using these identities we get

q1 1 L1,A1 = L1, q + µ (p2L3 p3L2) { } k k − = 1 p L ,L + L L , p p L ,L L L , p µ 2{ 1 3} 3{ 1 2} − 3{ 1 2} − 2{ 1 3} = 1  p L + p L p L + p L =0  µ − 2 2 3 3 − 3 3 2 2  q2 1  L1,A2 = L1, q + µ (p3L1 p1L3) { } k k − q3 1 = q + µ p3 L1,L1 + L1 L1, p3 p1 L1,L3 L3 L1, p1 k k { } { } − { } − { } q3 1   = q + µ p2L1 + p1L2 = A3 k k − q3 1  L1,A3 = L1, q + µ (p1L2 p2L1) { } k k −  q2 1 = q + µ p1 L1,L2 + L2 L1, p1 p2 L1,L1 L1 L1, p2 − k k { } { } − { } − { } = q2 + 1 p L p L = A  − q µ 1 3 − 3 1 − 2 k k   The other Poisson brackets L ,A are obtained by cyclic permutation. Using the second { i j} representation of the Laplace Lenz Runge vector given above, one calculates

2 2 µ 2 µ µ A1,A2 = p q q1 (q p) p1 , p q q2 (q p) p2 { } k k − k k − · k k − k k − ·  2 µ  2 µ  = p q q1 , p q q2 + (q p) p1 , (q p) p2 k k − k k k k − k k · ·  p 2 µ q , (q p) p  (q p) p , p 2 µ q − k k − q 1 · 2 − · 1 k k − q 2  k k   k k  = p 2 µ q ( p 2 µ ) , q + q q , ( p 2 µ ) k k − q 1 k k − q 2 2 1 k k − q k k h  k k  k k i +(q p) p1 (q p) , p2 + p2 p1 , (q p) · h · · i 2 µ  2 µ q1p2 ( p q ) , (q p) ( p q )p2 q1 , (q p) − k k − k k · − k k − k k ·  2  µ (q p)q1 ( p q ), p2 − · k k − k k 2 µ 2 µ p1q2 (q p) , ( p q ) ( p q )p1 (q p) , q2 − · k k − k k − k k − k k ·   2 µ (q p)q2 p1 , ( p q ) − · k k − k k 2 µ 2 2 = p q q1 p , q2 + q2 q1 , p +(q p)[p1p2 p2p1] k k − k k k k k k · − 2  µ 2  µ  µ + q1p2 2 p q ( p q ) p2q1 +(q p)q1 q , p2 k k − k k − k k − k k · k k 2 µ  2 µ  µ p1q2 2 p q +( p q ) p1q2 +(q p)q2 p1 , q − k k − k k k k − k k · k k 2 µ   = p q 2q1p2 +2q2p1 p2q1 + q2p1 k k − k k − − 2 µ µq2 µq2 + 2 p q q1p2 p1q2 (q p)q1 q 3 +(q p)q1 q 3 k k − k k − − · k k · k k = p 2 2 µ [q p q p ] = 2EL − k k − q 1 2 − 2 1 − 3 k k  The other brackets A ,A and A ,A follow by cyclic permutation. { 2 3} { 3 1} 20 The Lie bracket of two vectorfields X = m X ∂ and Y = m Y ∂ is defined i=1 i ∂xi i=1 i ∂xi as P P m m ∂Xi ∂Yi ∂ [X,Y ] = Yj Xj ∂xj − ∂xj ∂xi Xi=1 jP=1  For any function ϕ

2 [X,Y ](ϕ)(x) = ∂ ϕ Xt(Y s(x)) ϕ Y s(Xt(x)) ∂s∂t − h  i s=t=0

See [Arnold], ch. 39. The Hamiltonian vectorfield associated to the Poisson bracket of two functions F and G is the Lie bracket of the Hamiltonian vectorfields associated to F and G respectively. That is

X F,G =[XF ,XG] { } Indeed,

n n 2 2 2 2 ∂G ∂ F ∂F ∂ G ∂G ∂ F ∂F ∂ G ∂ [XF ,XG] = + ∂pj ∂pi∂qj − ∂pj ∂pi∂qj − ∂qj ∂pi∂pj ∂qj ∂pi∂pj ∂qi Xi=1 jP=1  n n 2 2 2 2 + ∂G ∂ F ∂F ∂ G ∂G ∂ F + ∂F ∂ G ∂ ∂qj ∂qi∂pj − ∂qj ∂qi∂pj − ∂pj ∂qi∂qj ∂pj ∂qi∂pj ∂pi Xi=1 jP=1  n n n n = ∂ ∂F ∂G ∂G ∂F ∂ ∂ ∂F ∂G ∂G ∂F ∂ ∂pi ∂qj ∂pj − ∂qj ∂pj ∂qi − ∂qi ∂qj ∂pj − ∂qj ∂pj ∂pi iP=1 h jP=1 i iP=1 h jP=1 i = X F,G { } Finally, we remark that both the Poisson bracket and the Lie bracket fulfil the Jacobi identity, that is F, G, H + G H, G + H, F, G =0 { } { } { }  X, [Y,Z ] + Y, [Z,X ] + Z, [X,Y ] =0       For C2 functions F,G,H and C2 vector fields X,Y,Z

Infinitesimal Symmetries

Whenever q(t) is a solution of the basic equation (1) for the Kepler problem and R SO(3) is a rotation around an axis through the origin, then R q(t) is again a solution ∈ of (1). In other words, the Kepler problem has SO(3)–symmetry. This is reflected by the fact that the Kepler Hamiltonian is SO(3)–invariant:

K(Rq, Rp) = K(q, p) forall R SO(3) ∈ In particular, the Kepler Hamiltonian K is a conserved quantity for the flows

t, (q, p) R (t)q, R (t)p 7→ i i   21 th associated to the rotation Ri(t) around the i axis (i =1, 2, 3) with angle t. The associated vectorfields Xi are given by

d d q˙ = dt Ri(t) q t=0 = Ei q , p˙ = dt Ri(t) q t=0 = Ei p

dRi(t) where Ei = dt t=0 . See Appendix F. This construction is compatible with the Lie brackets defined on the Lie algebra sO(3) (see appendix F) and for vectorfields. For example, [X1,X2] is the vectorfield, which, at the point (q, p), takes the value

2 ∂ R (s)R (t) q, R (s)R (t) p R (t)R (s)q,R (t)R (s) p ∂s∂t 1 2 1 2 − 2 1 2 1 h  i s=t=0

= ∂ E R (t) q, E R (t) p E R (t)q ,E R (s) p ∂t 1 2 1 2 − 2 1 2 1 h   t=0 = (E E E E ) q, (E E E E ) p 1 2 − 2 1 1 2 − 2 1  = [E1,E2] q, [E1,E2] p  = X3(q, p)

Similarly, [X2,X3] = X1 and [X3,X1] = X2 . Since the Kepler Hamiltonian K is a conserved quantity for the flows of the vectorfields X1,X2,X3, we also have

[Xi,XK]=0 for i =1, 2, 3

In fact, the vectorfields Xi are already known to us. They are the vectorfields associ- ated to the components Li of the angular momentum:

Xi = XLi for i =1, 2, 3

This is trivial to verify. 2 In the case of negative energy E = ε , by Example 7 − 2 2 µ µ µ µ L A , L A = L ,L L ,A + A ,L + 2 A ,A 1 ± ε 1 2 ± ε 2 { 1 2} ± ε { 1 2} { 1 2} ε { 1 2} n o 2   µ 2E µ µ = 1 + 2 2 L 2 A =2 L A ε − µ 3 ± ε 3 3 ± ε 3    i Thus, for each fixed choice of , the Poisson brackets of the functions ± 1 µ B± = L A i √2 i ± ε i  i fulfil the same relation as the generators E1,E2,E3 of the Lie algebra sO(3). Precisely,

B±,B± = B± , B±,B± = B± , B±,B± = B± { 1 2 } 3 { 2 3 } 1 { 3 1 } 2 22 Also, they are in involution with the Kepler Hamiltonian K. Furthermore

+ B ,B− =0 for i, j =1, 2, 3 { i j } For the planar problem the relations

µ µ µ µ µ µ ε A1 , ε A2 = L3 , ε A2 , L3 = ε A1 , L3 , ε A1 = ε A2    suggest an SO(3) symmetry.

Explicit Symmetries of the two dimensional Kepler problem

Let K˜ be the Hamiltonian of the regularized Kepler problem of Example 3. We consider the case of dimension two and think of q and p as complex variables. We restrict ourselves to the case of negative energy. By scaling, we can assume that ε = 1. Then

K˜ (q, p) = 1 q ( p 2 + 1) µ 2 | | | | − a b For an invertible complex 2 2 matrix A = and (q, p) C C set ×  c d  ∈ ×

ap + b A (q, p) = (¯cp¯ + d¯)2 q , ·  · cp + d  whenever it is defined. Observe that the second component is the standard action of GL(2, C) on the complex plane by fractional linear transformations. If det A = 1, the factor (¯cp¯ + d¯)2 is the complex conjugate of the inverse of the derivative of the fractional linear transformation p ap+b . Therefore, for A ,A SL(2, C) 7→ cp+d 1 2 ∈ (A A ) (q, p) = A A (q, p) 1 2 · 1 · 2 ·  whenever defined.

Theorem 8 Let (q, p) C C. Then for all A SU(2) ∈ × ∈ (i) K˜ A (q, p) = K˜ (q, p) , whenever defined. · (ii) If (q(t), p(t) solves the Kepler equations (27) with respect to the eccentric anomaly, that is if

1 q 2 1 q˙ = q p , p˙ = ( p + 1) == 2 K˜ (q, p) + µ q | | − 2 q | | − q | | | |  then (P (t),Q(t)) = A (p(t), q(t)) also fulfils (27). · 23 a b Proof: Let A = SU(2) .  c d  ∈ (i)

2 K˜ A (q, p) + µ = 1 c¯p¯ + d¯ 2 q ap+b +1 = 1 q ap + b 2 + cp + d 2 · 2 | | | | cp+d 2 | | | | | |    = q ( p 2 + p 2 ) | | | 1| | 2| where p p 1 = A  p2  ·  1  As A SU(2), p 2 + p 2 = p 2 +1. ∈ | 1| | 2| | | (ii) By part (i)

˙ 1 ˜ 1 1 ˜ ¯ 2 P + Q 2 K(Q,P ) + µ Q = (cp+d)2 p˙ + c¯p¯+d¯ 4 q 2 K(q, p) + µ (¯cp¯ + d ) q | | | | | | |  1 1 ˜  = (cp+d)2 p˙ + q 2 (K(q, p) + µ q =0 | |   Q˙ Q P = 2¯c(¯cp¯ + d¯)p¯˙ q + (¯cp¯ + d¯)2q˙ c¯p¯ + d¯ 2 q ap+b −| | −| | | | cp+d = (¯cp¯ + d¯) 2¯c p¯˙ q +(¯cp¯ + d¯)˙q q (ap + b) −| | ¯ q¯ 2  = (¯cp¯ + d) b q ( p + 1) q +( bp¯ + a) q p q (ap + b) | | | | − | | −| | =0 

since c = ¯b and d =¯a. −

The SU(2) action described above preserves the symplectic form Re dp¯ dq . This ∧ can be used to deduce the second part of the theorem from its first part.

24 Appendix A: Conic sections

Conic sections are the nonsingular curves that are obtained by intersecting a quadratic cone with a plane. The relation with the focal description given after the statement of Kepler’s laws can be seen using the ”Dandelin spheres”, see [Kn¨orrer] 4.7.3. Also, conic sections are the zero sets of quadratic polynomials in two variables that do not contain a line. We discuss the most relevant properties of the conic sections

Ellipses

Consider the ellipse consisting of all points P for which the sum of the distances

P F and P F ′ is equal to 2a. The eccentricity e of the ellipse is defined by k − k k − k

F F ′ =2 ea k − k Clearly, 0 e < 1. Let M be the midpoint between the two foci (the center of the ellipse). ≤ The line through the two foci is called the major axis of the ellipse. The two points of the ellipse on the major axis have each distance a from the center. The foci both have distance ea from the center. The line through M perpendicular to the major axis is called the minor axis. It is the perpendicular bisector of the two foci. The points of the ellipse on the minor axis have distance a from both foci. It follows from Pythagoras’ theorem that the distance of these points from the center is

b = a 1 e2 p − If one introduces Cartesian coordinates x,y centered at M with the major axis as x–axis and the minor axis as y–axis then the equation of the ellipse is

2 2 x y a2 + b2 =1 (29)

For a proof see [Kn¨orrer], Satz 4.3. The area enclosed by the ellipsoid is equal to πab; see for example [Zorich], 6.4.3, Example 5. An important consequence of the focal description of the ellipse is the following: For each point q of the ellipse, the lines joining q to the origin and q to the other focus f form opposite equal angles with the tangent line of the ellipse. For a proof, see [Kn¨orrer], Satz 4.6. Another useful description of the ellipse is the following. The line perpendicular to 1 the major axis that has distance e a to the center and lies on the same side of the center as the focus F is called the directrix with respect to F . One can show that the ellipse is the set of points P for which the ratio of the distance from P to F to the distance from

25 P to the directrix is equal to the eccentricity e. To see this, note that a point (x,y) fulfils the directrix condition with focus (ea, 0) if and only if

2 2 2 2 2 a 2 x y (x ea) + y = e (x e ) a2 + (1 e2)a2 =1 − − ⇐⇒ − ( ) This description directly gives the equation of the ellipse in polar coordinates ∗ (r, ϕ) centered at the focus F . Choose the angular variable ϕ such that ϕ = 0 corresponds to ( ) the ray starting at F in the direction away from the center ∗∗ . If a point P has polar coordinates (r, ϕ) then its distance to F is r. Since the distance from F to the directrix is 1 e a, the distance from P to the directrix is 1 e a r cos ϕ . Thus the equation e − e − − of the ellipse is  r = e 1 e a r cos ϕ e − −    or, equivalently l r = 1+e cos ϕ (30) where l = (1 e2)a. The quantity l is called the parameter of the ellipse. The angle ϕ − is called the true anomaly of a point on the ellipse with respect to the focus F . Observe that ϕ = 0 corresponds to the point of the ellipse closest to F . In , this point is called the perihelion. Formula (30) is a parametrization of the ellipse, giving the distance r to the focus as a function of the true anomaly ϕ. Equation (29) suggests the parametrization

x = a cos s, y = b sin s

of the ellipse. The parameter s is called the eccentric anomaly. To get the relation between the eccentric amnomaly and the true anomaly with respect to the focus F = (ea, 0) , let (x,y) be a point of the ellipse with true anomaly ϕ and eccentric anomaly s. Then

x = a cos s y = b sin s = a 1 e2 sin s p −

and 2 l (1 e )a x = r cos ϕ + ea = 1+e cos ϕ cos ϕ + ea = 1+−e cos ϕ cos ϕ + ea 2 (1 e )a y = r sin ϕ = 1+−e cos ϕ sin ϕ Comparing these two representation, we obtain

2 1 e √1 e2 cos s = e + 1+e−cos ϕ cos ϕ , sin s = 1+e−cos ϕ sin ϕ

(∗) For another derivation, see [Kn¨orrer] 4.7.1 (∗∗) This is the direction from the focus F to the closest point on the ellipse.

26 We differentiate the first equation with respect to ϕ and insert the second to get

2 2 ds 1 e 1 e sin s = − sin ϕ + − 2 e sin ϕ cos ϕ − dϕ − 1+e cos ϕ (1+e cos ϕ) 2 2 e cos ϕ = 1 e sin s + 1 e 1+e cos ϕ sin s −p − p − 2 1 = 1 e sin s 1+e cos ϕ −p − Dividing by sin s and using (30) gives

ds √1 e2 dϕ = −l r (31)

We also need the expression of r in terms of s. Since

r2 =(x ea)2 + y2 = a2(cos s e)2 + (1 e2)a2 sin2 s − − − = a2 cos2 s 2e cos s + e2 + (1 e2) cos2 s + e2 cos2 s − − − = a2(1 e cos s)2  − we have r = a (1 e cos s) (32) − Hyperboli

Consider now the hyperbola consisting of all points P for which the difference of the

distances P F and P F ′ has absolute value equal to 2a. The eccentricity e of the k − k k − k hyperbola is defined by

F F ′ =2 ea k − k Clearly, e > 1. As before, let M be the midpoint between the two foci (called the center). The line through the two foci is called the major axis of the hyperbola. The two points of the hyperbola on the major axis have each distance a from the center. The foci both have distance ea from the center. If one introduces Cartesian coordinates x,y centered at M with the major axis as x–axis then the equation of the ellipse is

2 2 x y 2 2 =1 a − b

where b = a √e2 1. − The description using a directrix is almost identical to the one for ellipses. The line 1 perpendicular to the major axis that has distance e a to the center and lies on the same side of the center as the focus F is called the directrix with respect to F . The hyperbola is the set of points P for which the ratio of the distance from P to F to the distance from P

27 to the directrix is equal to the eccentricity e. In polar coordinates centered at F one gets as equation for the hyperbola 2 (e 1)a r = 1+e−cos ϕ (33) Here, the angular variable ϕ has been chosen such that ϕ = 0 corresponds to the ray ( ) starting at F in the direction to the center ∗ .

Paraboli

Finally consider the parabola consisting of all points that have equal distance form the focus F and the line g (which we call the directrix of the parabola Let l be the distance from F to g. Furthermore let ℓ be the line perpendicular to g through F and M the midpoint of its segment between F and g. It has distance l/2 both from F and g.

FIGURE

If one introduces Cartesian coordinates x,y centered at M with the x–axis being ℓ oriented in direction of M then the equation of the parabola is

y2 +2lx =0

Again we choose polar coordinates centered at F such that ϕ = 0 corresponds to the ray from F to M. As before one sees that the equation of the parabola is

l r = 1+cos ϕ (34)

Appendix B: The two body problem

The basic equation of motion (1) also governs the general two body problem. Here we consider two point masses with masses m1, m2 and time dependent positions r1(t),r2(t). In this situation, Newton’s laws give

r2(t) r1(t) − 3 m1 r¨1(t) = G m1m2 r2(t) r1(t) | − | (35) r1(t) r2(t) − 3 m2 r¨2(t) = G m1m2 r1(t) r2(t) | − | 1 Denote by R(t) = m1+m2 m1r1(t) + m2r2(t) the center of . Adding the two equations of (35) gives R¨(t) = 0. That is, the center of gravity moves with constant speed. Also, set q(t) = r (t) r (t). (35) gives 2 − 1 q(t) q(t) r¨1(t) = Gm2 q(t) 3 , r¨2(t) = Gm1 q(t) 3 | | − | |

(∗) Again, this is the direction from the focus F to the closest point on the ellipse.

28 so that q(t) q¨(t) = µ q(t) 3 − | |

with µ = G(m1 + m2). This is the basic equation (1).

Appendix C: The Lagrangian formalism

The Euler Lagrange equations associated to a function (t,q,p) is L d ∂ ∂ L L =0 (36) dt ∂pi p=˙q − ∂qi p=˙q 

Conventionally one views as a function of the variables t,q, q˙ and uses the shorthand L formulation d ∂ ∂ L L =0 dt ∂q˙i − ∂qi  The Lagrange function for the Kepler problem is the difference between the cinetic and the 1 2 µ K(q, p) = 2 p + q L k k k k It is independent of t. Then

∂ K d ∂ K ∂ K qi L = pi , L =q ¨i , L = µ 3 ∂pi dt ∂pi p=˙q ∂qi − q  k k so that the associated Euler Lagrange equation are

qi q¨i + µ q 3 =0 k k This is the basic equation of motion (1). The Euler Lagrange are related to the following variational problem. Fix points 3 q′, q′′ IR and times t′

29 Another instance of a variational problem is the construction of geodesics. Let U be an open subset of IRn. Assume that for each point q U one is given a positive definite ∈ symmetric bilinear form on IRn (a Riemannian metric). It is represented by a positive

symmetric n n matrix G(q) = gab(q) a,b=1, ,n . For simplicity we assume that the × ··· coefficients g (q) are C∞ functions of q. The length of a curve γ :[t′,t′′] U, t q(t) ab → 7→ with respect to the Riemannian metric is by definition

t′′ 1 2 length(γ) = q˙(t)⊤G q(t) q˙(t) dt Z ′ t    By definition geodesics are “locally shortest connections”. That is, a curve t q(t) is a 7→ geodesic if and only if, for each t, there is ε > 0 such that for all s with t

n 1 2 (q, p) = p G(q) p = g (q) p p L ⊤ ab a b q  a,bX=1 

Obviously the length of a curve does not change under reparametrization. In particular, the critical point for the variational problem is degenerate. To normalize the situation, we look for minimizing curves that are parametrized with constant speed (different from zero).

That is, curves which are parametrized in such a way thatq ˙(t)⊤G q(t) q˙(t) = const for all t. If one has a minimizer for the variational problem that is parametrized  by arclength, then it also fulfils the equation

2 2 d ∂ ∂ L L =0 (37) dt ∂pi p=˙q − ∂qi p=˙q 

Indeed, multiplying the Euler Lagrange equation (36) by const = (q, q˙), we get L

d ∂ ∂ const (q, q˙) L (q, q˙) L =0 dt ∂pi ∂qi p=˙q h L p=˙q −L i  which implies (37). Equation (37) is simpler than the Euler Lagrange equations associated to the original Lagrange function. We now derive the equation of motion associated to (37). To see that we really get geodesics, we then have to verify that the resulting curves are indeed parametrized with constant speed. Equation (37) gives

n n d ∂gbc 2 gab(q)q ˙b q˙bq˙c =0 for a =1, ,n dt − ∂qa ···  bP=1  b,cP=1 30 Since d g (q) = n ∂gab q˙ , (37) is equivalent to dt ab c=1 ∂qc c P n n n 1 ∂gbc ∂gab gab q¨b = q˙bq˙c q˙bq˙c 2 ∂qa − ∂qc bP=1 b,cP=1 b,cP=1 n 1 ∂gbc ∂gab ∂gca = q˙bq˙c 2 ∂qa − ∂qc − ∂qb  b,cP=1 

for a =1, ,n. Here we used that ∂gab = ∂gba and exchanged the summation indices b ··· ∂qc ∂qc and c. The equations above state that the a–component of the vector G q¨ is equal to the 1 n ∂gbc ∂gab ∂gca ab vector with entries q˙bq˙c . We denote by g the entries 2 b,c=1 ∂qa − ∂qc − ∂qb  1  of the inverse matrix G−P. Then the equations are equivalent to

n a q¨a + Γbc q˙bq˙c =0 (38) b,cX=1

a where Γbc are the Christoffel symbols

n n a 1 ad ∂gdb ∂gcd ∂gbc Γbc = g + 2 ∂qc ∂qb − ∂qd Xd=1 b,cP=1  To verify that (38) really describes geodesics, we still have to verify that solutions of (38) are curves that are parametrized by constant speed. So let t q(t) be a solution of 7→ (38). Reversing the calculation above, we see that shows 2 G q¨ is the vector with entries n ∂gbc ∂gab ∂gca q˙bq˙c . Therefore b,c=1 ∂qa − ∂qc − ∂qb P  d q˙ G(q)q ˙ =2˙q G(q)¨q +q ˙ G˙ q˙ dt n n ∂gbc ∂gab ∂gca ∂gbc = q˙aq˙bq˙c + q˙aq˙bq˙c =0 ∂qa − ∂qc − ∂qb ∂qa a,b,cP=1  a,b,cP=1 andq ˙ G(q)q ˙ is indeed constant.

Appendix E: The Kepler equation

The Kepler equation (21) τ = s e sin s (39) − √µ where τ = 3 2 (t t ) is the “” can be solved for s as a function of τ (and a / − 0 thus of the time t) using Fourier series and Bessel functions. Since e < 1, the right hand side is a strictly monotonically increasing function of s, and for s = nπ, n ZZ one has ∈ 31 τ = s. Therefore we can write s as a C∞ function s(τ) of τ, and sin s(τ) is an odd, 2π periodic function of τ. It has a Fourier series

∞ sin s(τ) = an sin nτ nP=1

with Fourier coefficients π 2 an = π (sin s(τ))(sin nτ) dτ Z0 Partial integration gives

π 2 τ=π 2 d an = (sin s(τ))(cos nτ) + sin s(τ) (cos nτ) dτ − nπ τ=0 nπ Z dτ 0 

The first term vanishes. For the second term, observe that by (39)

d sin s(τ) = 1 d s(τ) τ = 1 ds 1 dτ e dτ − e dτ −   π Since 0 (cos nτ) dτ = 0, we get by the substitution rule and (39) R π π π 2 ds 2 2 an = (cos nτ) dτ = (cos nτ(s)) ds = cos n(s e sin s) ds (40) neπ Z dτ neπ Z neπ Z − 0 0 0  The nth Bessel function is defined as π 1 Jn(x) = π cos(x sin t nt) dt Z0 − See [Watson]. So

π 2 2 an = cos (ne) sin s ns ds = Jn(ne) neπ Z − ne 0  Consequently we get from (39) the “Kapteyn” series

∞ 2 s = τ + n Jn(ne) sin nτ nP=1

Appendix F: The group SO(3) and its Lie algebra sO(3)

By definition, SO(3) is the group of all real 3 3 matrices R with determinant 1 for × 3 which R⊤R = 1l. One can show that it consists of all rotations around an axis in IR around an axis through the origin. It is generated by the one parameter subgroups

10 0 cos t 0 sin t cos t sin t 0 − R1(t)= 0 cos t sin t  ,R2(t)= 0 1 0  ,R3(t)= sin t cos t 0  0 sin t −cos t sin t 0 cos t 0 0 1    −    32 of rotations around the x1, x2 and x3 axis, respectively. d The Lie algebra sO(3) is the set of all derivatives dt R(t) of differentiable curves t=0 in SO(3) with R(0) = 1l. One can show that it consists of all real skew symmetric 3 3 × matrices. A basis for the vectorspace sO(3) consists of the matrices

00 0 d E1 = dt R1(t) =  0 0 1  t=0 − 01 0   0 01 d E2 = dt R2(t) =  0 00  t=0 1 0 0  −  0 1 0 d − E3 = dt R3(t) =  1 0 0  t=0 0 0 0   1 Whenever R SO(3) and Y sO(3) then RYR− lies again in sO(3). This defines an ∈ ∈ action of the group SO(3) on its Lie algebra sO(3), called the adjoint representation. If d X = dt R(t) is an element of the Lie algebra and Y another element of the Lie algebra t=0 then

d 1 [X,Y ] = dt R(t) YR(t)− = XY YX t=0 − lies again in the Lie algebra. [X,Y ] is called the Lie bracket in the Lie algebra sO(3). It fulfils [X,Y ] = [Y,X] and the Jacobi identity −

X, [Y,Z] + Y, [Z,X] + Z, [X,Y ] =0       One easily verifies that

[E1,E2] = E3 , [E2,E3] = E1 , [E2,E1] = E2

Appendix G: Reparametrization

In Example 3, we reparametrized the Kepler flow using the elliptic anomaly as a new independent parameter. This reparametrization depends on the energy E of the system. In general, let H(q, p) be any Hamiltonian and fix an energy E. Furthermore, let f(q, p) be any differentiable function. As observed above, the Hamiltonian vectorfields associated to the Hamiltonians H(q, p) and H(q, p) E agree. Set −

H˜ (q, p) = f(q, p) H(q, p) E −  33 On the level set of H to the energy E

1 H− (E) = (q, p) H(q, p) = E 

the Hamiltonian vectorfield for H˜ is

dqi = f(q, p) ∂H , dpi = f(q, p) ∂H , ds ∂pi ds − ∂qi 1 So, on H− (E), the Hamiltonian vectorfield for H˜ is obtained from the Hamiltonian vec- torfield for H by the reparametrization

dt ds 1 ds = f(q, p) , or, equivalently dt = f(q,p)

2 ε In the case of negative energy E = 2 , the regularized Kepler flow is obtained from − dt q the standard Kepler flow by the reparametrization ds = kεk . So it is described by the Hamiltonian 2 2 q ε q 1 2 µ ε k k K(q, p) + = k k p + = K˜ (q, p) ε 2 ε 2 k k − q 2  k k  as in Example 3.

Appendix H: Hamiltonian flows on the sphere

We denote by

1/2 Sn = x =(x ,x , ,x ) IRn+1 x = x2 + x2 + + + x2 =1 0 1 ··· n ∈ k k 0 1 ··· n    the n-dimensional sphere. The tangent hyperplane to Sn in x Sn is ∈ T Sn = y IRn+1 x y =0 x ∈ ·  and the total tangent space of the sphere is

T Sn = (,y) IRn+1 IRn+1 x =1, x y =0 ∈ × k k · 

Given a differentiable function H(x,y) on T Sn, the associated Hamiltonian system on T Sn is n n ∂H ∂H ∂H ∂H ∂H x˙ i = xj xi , y˙i = + xj yj xi ∂yi − ∂yj − ∂xi ∂xj − ∂yj  jP=1   jP=1  n ∂H n ∂H ∂H The terms j=1 xj ∂y xi and j=1 xj ∂x yj ∂y xi are chosen such that the flow  j   j − j  stays on the tangentP space of the sphere,P that is, that for an integral curve t x(t),y(t) 7→  d x(t) = 0 and d x(t) y(t)=0 dt k k dt ·

34 Example. The flow associated to the Hamiltonian H(x,y) = 1 y is 2 k k

1 x˙ i = y yi , y˙i = y xi k k −k k It describes the uniform flow of the point x on big circles, that is the geodesic flow on the sphere.

Hamiltonian systems can be defined on much more general spaces, for example on ”symplectic manifolds” or ”Poisson manifolds”. See ????

35