An analysis of block Verlet timestepping and lagged force evaluations for the gravitational N−body problem

by

Bimali Jayasinghe, PhD

A Dissertation

In

Mathematics and Statistics

Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY

Approved

Dr. Katharine Long Chair of Committee

Dr. Victoria Howle

Dr. Giorgia Bornia

Dr. Joshua Padgett

Dr. Sanjaya Senadheera

Dr. Mark Sheridan Dean of the Graduate School

August, 2018 Texas Tech University, Bimali Jayasinghe, August 2018

ACKNOWLEDGEMENTS

I would like to acknowledge and thank the following people who have supported me throughout my PhD program. First and foremost, I would like to thank my advisor, Dr. Katharine Long, for guiding me through the PhD project with her methodical work and enthusiasm. I also thank her for correcting the draft of this thesis with great patience and for making useful comments on the language and style. I would like to express my gratitude to my committee members, Dr. Victoria Howle, Dr. Giorgia Bornia, and Dr. Joshua Padgett, for their valuable comments and suggestions. I am extremely thankful to the Department of Mathematics and Statistics at Texas Tech University for giving me the opportunity to pursue my graduate studies. Last but not least, I would like to extend my sincere thanks to my family and my friends for their continuous love, support and care.


TABLE OF CONTENTS

ACKNOWLEDGEMENTS ...... ii
ABSTRACT ...... iv
LIST OF TABLES ...... v
LIST OF FIGURES ...... vi
1. INTRODUCTION ...... 1
1.1 Gravitational N-body problem ...... 1
1.2 Numerical Methods in N Body Simulation ...... 4
2. NUMERICAL ORBIT INTEGRATION ...... 7
2.1 Hamiltonian and N-body problem ...... 7
2.2 Symplectic Integrators ...... 9
2.2.1 Modified (or Symplectic) Euler Integrator ...... 9
2.2.2 The Stormer-Verlet Integrator ...... 11
2.2.3 Interpretation as splitting method ...... 12
2.3 Individual timesteps ...... 14
3. COMPUTATION OF POTENTIAL ...... 16
3.1 Motion Under Gravity ...... 16
3.2 Potentials of Spherical Systems ...... 17
3.3 Potential-density pairs for other systems ...... 22
3.4 Potential from Functional Expansions ...... 23
3.5 Poisson solvers for N-body code ...... 25
3.5.1 Direct Summation ...... 25
3.5.2 Tree Code ...... 26
3.5.3 Other Poisson Solvers ...... 27
4. DETAILED ANALYSIS OF BLOCK TIMESTEP Verlet WITH LAGGED EXPANSION COEFFICIENTS ...... 28
4.1 Careful description of Algorithms ...... 29
4.2 Error Analysis with Exact Forces ...... 33
4.3 Error Analysis with lagged Forces ...... 43
5. NUMERICAL EXPERIMENTS ...... 48
5.1 Generation of Samples from DF ...... 48
5.1.1 Position Distribution ...... 49
5.1.2 Velocity Distribution ...... 51
5.2 Experiments ...... 53
6. CONCLUSION ...... 64
6.1 Future Work ...... 65
BIBLIOGRAPHY ...... 66


ABSTRACT

N-body simulations are widely used in astrophysics to model the behavior of stellar systems. The Verlet integrator is one of the principal numerical methods used to solve the N-body problem. It is popular because of its good numerical stability, time reversibility, preservation of the symplectic form on phase space, and economy of memory. Since it is symplectic, it exactly solves an approximate Hamiltonian. The Verlet integrator preserves certain conserved quantities exactly, such as the total angular and linear momentum and the phase-space volume. Since the density in a stellar system varies with distance from its center, it is extremely inefficient to integrate all the stars with the same timestep, so an integrator should allow an individual timestep for each star. The block timestep scheme makes this possible. We have analyzed both the Verlet method and the block timestep Verlet method. We prove that the block timestep Verlet method has a global error (GE) of $O(h^2)$. The block timestep Verlet method has the same advantages as the Verlet method except for symplecticity. We find that the block timestep Verlet method is more efficient than the regular Verlet method, and we carry out experiments to check our analysis. When developing an N-body simulation, the force calculation is one of the most important factors: the best numerical integrator for a problem is one that evaluates the force cheaply. We introduce a method called lagged calculation, which uses the functional expansion method.


LIST OF TABLES

4.1 Algorithm of BTS Verlet method ...... 32
5.1 Algorithm of accept/reject method ...... 52
5.2 Algorithm of ...... 54
5.3 Number of force evaluations ...... 63


LIST OF FIGURES

4.1 The block timestep scheme, for a system with 4 classes of particles ...... 31
5.1 Plummer sphere for 1000 stars 2D and 3D plot ...... 51
5.2 Orbit of a single star ...... 55
5.3 Angular momentum of a single star ...... 56
5.4 Total Energy of a single star ...... 57
5.5 The order of the Global Error I ...... 58
5.6 The order of the Global Error II ...... 59
5.7 The Total Energy of many stars ...... 60
5.8 The Total Energy and Angular Momentum of many stars ...... 61
5.9 The GE of BTS Verlet method with several timestep ...... 62


CHAPTER 1 INTRODUCTION

The N-body problem is the problem of predicting the motion of a system of particles that interact with each other through a physical force such as gravity or electromagnetism. The equations of motion can be classical or quantum mechanical. Some of these N-body problems are exactly solvable for any value of $N$: for example, both the classical and quantum N-body problems with harmonic oscillator forces have exact solutions. When the "particles" are vortices interacting via fluid forces, the problem is exactly solvable when $N \le 3$. Usually, the N-body problem means the problem of $N$ classical particles interacting via gravity; with $N \le 2$ this problem has well-known solutions. With $N = 3$, there are several exact solutions for periodic orbits discovered by Euler (1767), Lagrange (1772) and Poincaré (1892-1899). An exact, convergent series solution to the 3-body problem for almost all initial conditions was given by Sundman in 1912. Sundman solved the problem for $N = 3$ with nonzero angular momentum. Sundman's method was generalized by Wang [40] to the case of $N = 3$ or $N > 3$ with zero angular momentum. However, the Sundman and Wang series converge so slowly that they are useless for practical purposes. Therefore, for studies of the N-body problem in dynamics we resort to numerical calculations.

A stellar system is a dynamical system of stars, usually under the influence of gravity. It contains many stars: open cluster ($10^2$ to $10^4$), globular cluster ($10^4$ to $10^6$), dwarf galaxy ($10^9$), galaxy ($10^8$ to $10^{12}$). In stellar dynamics all masses are comparable and motions are highly irregular. Investigating the motion of the stars in a stellar system is therefore an N-body problem, and N-body simulations are commonly used in astrophysics. They are used to study processes of non-linear structure formation, such as the formation of galaxy filaments and galaxy halos under the influence of dark matter, and to study the dynamical evolution of star clusters.

1.1 Gravitational N− body problem

Consider $N$ point masses $m_i$, $i = 1, 2, \ldots, N$, in an inertial reference frame in three-dimensional space $\mathbb{R}^3$, moving under the influence of their mutual gravitational attraction. By Newton's law, the gravitational force exerted on mass $m_i$ by a single mass $m_j$, $F_{ij}$, is

\[ F_{ij} = \frac{G m_i m_j}{\|q_i - q_j\|^3}\,(q_j - q_i) \tag{1.1} \]

where $G = 6.67300 \times 10^{-11}\ \mathrm{m^3\,kg^{-1}\,s^{-2}}$ [39] is the gravitational constant, $q_i$ is the position vector of mass $m_i$, and $\|q_j - q_i\|$ is the Euclidean distance between the masses $m_i$ and $m_j$. The equation of motion of each particle in the dynamical system is obtained by summing the forces $F_{ij}$ over all the other particles and applying Newton's second law of motion:

\[ m_i \frac{d^2 q_i}{dt^2} = \sum_{j=1,\, j \neq i}^{N} \frac{G m_i m_j}{\|q_i - q_j\|^3}\,(q_j - q_i) \tag{1.2} \]

Mathematically, the gravitational N-body problem involves the solution of the system of $3N$ second-order differential equations (1.2). The system given by (1.2) is difficult to integrate because of the following two factors [3]:

1. instability

2. the force on each mass depends on the positions of all the other stars, so the time needed to calculate the forces increases as the square of the number of particles being integrated

These difficulties limit the usefulness of some integration methods, such as the Runge-Kutta method, for the numerical integration of the gravitational problem. The Runge-Kutta method is very expensive for large systems, which has motivated the search for new methods. Early exploration of numerical methods was based on trial and error. A major advance was made by Aarseth [1], who introduced variable and individual timesteps for each star. The standard Aarseth scheme was developed using the predictor-corrector method: the Adams-Bashforth (AB) scheme is used for the predictor and the Adams-Moulton (AM) scheme is used for the corrector. In his method, all the particles are treated separately in the integration. Assigning a timestep to each star avoids calculating the force on every particle when only the force on one particular particle needs to be recalculated, and as a result the integrator can evaluate the force cheaply.
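The quadratic cost of the force calculation can be made concrete with a minimal sketch of direct summation for the right-hand side of (1.2). This is an illustration only, not the code used in this thesis: the function name `direct_accelerations` and the use of NumPy are assumptions, and the accelerations are obtained by dividing (1.2) by $m_i$.

```python
import numpy as np

G = 6.67300e-11  # gravitational constant in m^3 kg^-1 s^-2, the value quoted in the text


def direct_accelerations(q, m, G=G):
    """Acceleration of every particle by direct summation of (1.2).

    q : (N, 3) array of positions, m : (N,) array of masses.
    The (N, N, 3) work array makes the O(N^2) cost explicit.
    """
    q = np.asarray(q, dtype=float)
    m = np.asarray(m, dtype=float)
    # d[i, j] = q_j - q_i, the separation vector pointing from particle i to j
    d = q[None, :, :] - q[:, None, :]
    r = np.linalg.norm(d, axis=2)
    np.fill_diagonal(r, np.inf)  # removes the j = i term from the sum
    # a_i = sum_j G m_j (q_j - q_i) / |q_i - q_j|^3
    return G * np.sum(m[None, :, None] * d / r[:, :, None] ** 3, axis=1)
```

Because every pair $(i, j)$ is visited, doubling $N$ roughly quadruples both the work and the memory of this sketch, which is the difficulty described in item 2 above.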


It has been found in practice that not all particles in the system are important when recalculating the force on the particular particle being considered. Taking this into account, Aarseth's method was improved by Ahmad and Cohen (AC) [3] in 1972. In their method, the force on a particular star is split into two parts: the regular force, which is due to the distant stars, and the irregular force, which is due to the nearest neighbors of the star. The advantage of the AC method is that it reduces the effort of evaluating the force contribution from distant particles. It was later discovered that the timestep criterion of the standard Aarseth scheme was not suited to higher-order integrators, and therefore Makino [31] proposed a more reliable criterion. Some numerical methods, such as the Hermite scheme, allow a large range of timesteps. Such methods work better with Makino's individual timestep scheme.

The main difficulty of the individual timestep scheme is that it requires the force on particle $i$ at time $t_i + \Delta t$, and to obtain it, the positions of all other particles in the system at that time are needed. With integration schemes such as Runge-Kutta or extrapolation schemes, it is difficult to obtain the positions of the other particles at an arbitrary time, because those methods provide the numerical solution of the differential equation only at discrete points in time. Although the individual timestep scheme is not suitable for all numerical integration methods, it is a favourable choice because it reduces the cost of force evaluation in the N-body problem.

Efficiency and accuracy are the two important factors when developing a numerical integrator, and research has continued with both in mind. For the sake of efficiency and accuracy, block timesteps and time-symmetric integration schemes play an important role. Block timesteps, where all timestep sizes are smaller than a maximum timestep size by an integer power of two, help to maintain the efficiency of the method. The accuracy can be controlled by employing a time-symmetric algorithm [22]. Time symmetry is a property that guarantees approximate energy conservation. The first successful attempt to construct a time-symmetric integration scheme based on block timesteps was made by Makino, Hut, Kaplan and Saygin in 2006 [29].

The computational effort required to integrate a system of ordinary differential equations to a given accuracy can be reduced by using conservative integration algorithms [25]. Since symplectic methods conserve an approximation to the total energy, symplectic numerical integrators have been developed.

1.2 Numerical Methods in N Body Simulation

Based on these numerical methods, a number of N-body simulations have been developed. In N-body simulations, the computation of the gravitational forces between $N$ mutually interacting particles at every timestep dominates the computational effort. By Newton's law, the total force on particle $i$ is

\[ \vec{F}_i = \sum_{j=1,\, j \neq i}^{N} \frac{G m_i m_j}{\|q_i - q_j\|^3}\,(q_j - q_i) \tag{1.3} \]

and the basic method to evaluate the force exactly is direct summation, which requires $N(N-1)/2$ pairwise calculations. Instead of computing the forces exactly by direct summation, one may use approximate but much faster methods, which significantly reduce the cost of the force calculation and hence allow substantially larger $N$. One of the most commonly used approximate methods is the Barnes and Hut tree code [6], which requires $O(N \ln N)$ force evaluations.

Over the years, a number of methods have been introduced which allow simulations to be performed rapidly, with much better accuracy than the methods mentioned above. The fast multipole method (FMM) of Greengard and Rokhlin [12] is one of them. In this method, the cost of the force calculation can be reduced to $O(N)$. The oct-tree method, the particle mesh method and the functional expansion method are some of the favorable force calculation methods that are used in N-body simulations.

Stellar systems contain so many stars that they are hard to follow in a computer. There are two types of N-body calculation, which pose different problems: collisional N-body simulation and collisionless N-body simulation.

Collisional N-body codes are used to model systems in which relaxation is important, in the sense that the relaxation time is less than the duration of the numerical integration. In a stellar system, relaxation usually means the return of a perturbed system to equilibrium. The relaxation time is a measure of the time it takes for one object in the system (the "test star") to be significantly perturbed by other objects in the system (the "field stars"). The relaxation time may be defined using the crossing time, the time needed for a typical star to cross the galaxy once, as follows:

\[ t_{\mathrm{relax}} \simeq \frac{0.1\,N}{\ln N}\, t_{\mathrm{cross}} \tag{1.4} \]
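As a small worked illustration of (1.4), the estimate can be wrapped in a function. The name `relaxation_time` is hypothetical and the sketch simply evaluates the formula; the result is returned in whatever time unit is used for the crossing time.

```python
import math


def relaxation_time(n_stars, t_cross):
    """Estimate t_relax = 0.1 * N / ln(N) * t_cross, equation (1.4)."""
    if n_stars <= 1:
        raise ValueError("n_stars must exceed 1 so that ln(N) > 0")
    return 0.1 * n_stars / math.log(n_stars) * t_cross
```

For example, `relaxation_time(1e5, 1.0)` gives the relaxation time of a system of $10^5$ stars in units of its crossing time. Because $N / \ln N$ grows almost linearly with $N$, larger systems take many more crossing times to relax.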

If a stellar system contains $N_*$ stars, the motion of $N \ll N_*$ particles needs to be followed to simulate this type of system. Collisionless N-body simulations are used to model systems over times much shorter than the relaxation time. They need to integrate the motion of exactly $N = N_*$ particles numerically. Galaxies (ellipticals and disk galaxies) ($N \sim 10^6 - 10^{11}$), globular clusters ($N \sim 10^4 - 10^6$), galaxy clusters ($N \sim 10^2 - 10^3$) and cold dark matter haloes ($N \gg 10^{50}$) are some examples of collisionless systems. Codes of this type are easier to write but harder to understand. Recent work shows that over $10^6$ particles can be simulated in collisional N-body calculations, while collisionless calculations can reach more than $10^9$ particles [10]. The largest collisionless N-body simulation was done by Teyssier et al. in 2009 [10]. They performed an N-body simulation with 70 billion dark-matter particles [38] using the RAMSES code [37]. Harfst et al. simulated nearly $2 \times 10^6$ collisional particles using the ϕGRAPE code [13], and it was the largest collisional N-body simulation. Some of the N-body codes that are used for both collisional and collisionless systems are NBODY [2], MOCCA [11], N-MODY [26] and MYRIAD [24]. Various integration methods are employed in both types of N-body simulation. The most popular integration methods of each type are [10]:

• the second-order leapfrog integrator or Verlet integrator: heavily used in collisionless N-body applications

• the fourth-order Hermite scheme: the integrator of choice for collisional applications

The leapfrog integrator, or Verlet integrator, is a symplectic integrator and it solves an approximate Hamiltonian exactly. It conserves the energy approximately with a fixed timestep and with a time-symmetric variable timestep. This method is well suited to the collisionless N-body problem but is less useful in the collisional N-body problem because of its integration errors on short time scales. The Hermite scheme is a non-symplectic higher-order integrator. Because of its excellent accuracy, it is popular for collisional N-body problems.

This thesis is organised as follows. In chapter 2, symplectic numerical methods for N-body simulations are reviewed: the relationship between the Hamiltonian and the N-body problem (2.1), numerical integrators that preserve symplecticity (2.2), and a detailed explanation of individual timesteps (2.3). In chapter 3, the methods to compute the potential are explained: the foundation of the potential for motion under gravity, including the differential form of the potential, Poisson's equation (3.1); the solution of the Poisson equation under spherical symmetry, with potential-density pairs for spherical systems (3.2); potential-density pairs for systems other than spherical (3.3); the most useful method to find potential-density pairs, the functional expansion method (3.4); and commonly used Poisson solvers in N-body simulation (3.5). Chapter 4 describes the error analysis with the exact force and with the approximated force (lagged force) obtained from the functional expansion, and numerical results are displayed in chapter 5.


CHAPTER 2 NUMERICAL ORBIT INTEGRATION

Stellar systems contain between $10^3$ (open clusters) and $10^{11}$ (galaxies) stars, so orbits cannot be computed analytically. Thus, numerical orbit integration is widely used in stellar dynamics. Choosing an appropriate integrator is quite a challenging task when solving a stellar dynamics problem. The factors that determine the best integrator to use for a given problem are discussed in Galactic Dynamics by Binney and Tremaine [8]:

• The smoothness of potential.

• The number of evaluations of the gravitational field. The best integrator evaluates the force cheaply.

• The availability of desired memory

• The running time

2.1 Hamiltonian and N-body problem

The orbit-integration problems we have to address vary in complexity from following a single particle in a given smooth galactic potential, to tens of thousands of interacting stars in a globular cluster, to billions of dark-matter particles in a simulation of cosmological clustering. In each case, the system is a Hamiltonian system, and therefore Hamilton's equations of motion of the particles can be derived as follows. The potential energy of the above dynamical system is

\[ \Phi = -\sum_{1 \le i < j \le N} \frac{G m_i m_j}{\|q_i - q_j\|} \tag{2.1} \]

From equations (1.2) and (2.1), the following equation is obtained, where the gradient is taken with respect to $q_i$:

\[ m_i \frac{d^2 q_i}{dt^2} = \sum_{j=1,\, j \neq i}^{N} \frac{G m_i m_j}{\|q_i - q_j\|^3}\,(q_j - q_i) = -\nabla \Phi \tag{2.2} \]

Thus equations (2.2) are of the form

\[ m_i \frac{d^2 q_i}{dt^2} + \nabla \Phi = 0 \tag{2.3} \]

Define $p_i = m_i \dot{q}_i$, the momentum of the $i$th particle. The equations of motion of the $i$th particle become

\[ \dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}, \tag{2.4} \]

where $H$ is the Hamiltonian, defined as the sum of the kinetic energy, $T = \sum_i \frac{p_i \cdot p_i}{2 m_i}$, and the potential energy $\Phi$ given by (2.1):

\[ H = T + \Phi \tag{2.5} \]

Thus, the Hamiltonian of the N-body problem is

\[ H(q, p) = \frac{p \cdot p}{2m} + \Phi(q) \tag{2.6} \]

and Hamilton's equations of motion of the N-body problem are

\[ \dot{q} = \nabla_p H(q, p), \qquad \dot{p} = -\nabla_q H(q, p) \tag{2.7} \]

Hamilton's equations show that the N-body problem is a system of $6N$ first-order differential equations: with $N$ particles there are $3N$ coordinates that form the components of the position vector $q$ and $3N$ components of the corresponding momentum $p$. Given a phase-space position $w = (q, p)$ at time $t$ and a timestep $h$, an algorithm, or integrator, is required to generate a new position $\bar{w} = (\bar{q}, \bar{p})$ that approximates the true position at time $t' = t + h$. A major geometric property of a Hamiltonian system is that the mapping $\varphi : (p, q) \to (\bar{p}, \bar{q})$ is symplectic. That is, the derivative $\varphi' = \dfrac{\partial \varphi}{\partial (p, q)}$ satisfies

\[ (\varphi')^T J \varphi' = J \]

for all $(p, q)$, where $J$ is the matrix [15]

\[ J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} \]

Since symplecticity is a characteristic property of Hamiltonian systems we would like to use numerical methods that preserve the symplectic structure of the problem.
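The condition $(\varphi')^T J \varphi' = J$ can be checked numerically for any given Jacobian. The sketch below is an illustration only: the helper names `canonical_J` and `is_symplectic` are hypothetical, and the one-dimensional harmonic oscillator $H = (p^2 + q^2)/2$ mentioned in the usage note is an assumed test case, not a system studied in this thesis.

```python
import numpy as np


def canonical_J(d):
    """The 2d x 2d matrix J = [[0, I], [-I, 0]]."""
    I = np.eye(d)
    Z = np.zeros((d, d))
    return np.block([[Z, I], [-I, Z]])


def is_symplectic(jac, tol=1e-12):
    """Check (phi')^T J phi' = J for a Jacobian given in (p, q) ordering."""
    jac = np.asarray(jac, dtype=float)
    d = jac.shape[0] // 2
    J = canonical_J(d)
    return bool(np.allclose(jac.T @ J @ jac, J, atol=tol))
```

For the oscillator with timestep $h$, the Jacobian of the exact flow is the rotation `[[cos(h), -sin(h)], [sin(h), cos(h)]]`, which passes the check, while the Jacobian of the forward Euler map, `[[1, -h], [h, 1]]`, has determinant $1 + h^2$ and fails it.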

2.2 Symplectic Integrators

2.2.1 Modified (or Symplectic) Euler Integrator

For the partitioned system

\[ \dot{q} = \nabla_p H(q, p) \tag{2.8} \]

\[ \dot{p} = -\nabla_q H(q, p) \tag{2.9} \]

one variable can be treated by the backward (or implicit) Euler method (BE) and the other variable by the forward (or explicit) Euler method (FE). This method was discovered by de Vogelaere (1965) and is called the symplectic Euler method. First, apply FE to $p$ and BE to $q$ to obtain the following system, called SE1:

\[ \begin{aligned} p_{n+1} &= p_n - h \nabla_q H(q_{n+1}, p_n) \\ q_{n+1} &= q_n + h \nabla_p H(q_{n+1}, p_n) \end{aligned} \tag{2.10} \]

Alternatively, apply BE to $p$ and FE to $q$ to obtain SE2:

\[ \begin{aligned} p_{n+1} &= p_n - h \nabla_q H(q_n, p_{n+1}) \\ q_{n+1} &= q_n + h \nabla_p H(q_n, p_{n+1}) \end{aligned} \tag{2.11} \]

A numerical method is symplectic if the numerical mapping $\varphi_h : (p_n, q_n) \to (p_{n+1}, q_{n+1})$ satisfies the condition

\[ (\varphi_h')^T J \varphi_h' = J \]

for all $(p, q)$ and step size $h$.

Consider the method (2.11), SE2. Differentiating with respect to $(p_n, q_n)$, the product $(\varphi_h')^T J \varphi_h'$ yields

\[ (\varphi_h')^T J \varphi_h' = \begin{pmatrix} I - h\nabla_{qp}H & 0 \\ -h\nabla_{pp}H & I \end{pmatrix} \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} \begin{pmatrix} I - h\nabla_{qp}H & -h\nabla_{pp}H \\ 0 & I \end{pmatrix} = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} = J \]

where the matrices $\nabla_{qp}H$, $\nabla_{pp}H$, ... of partial derivatives are all evaluated at $(p_{n+1}, q_n)$. This implies that SE2 is symplectic. The same argument applies to SE1. Hence, both SE1 and SE2 are symplectic.

According to the symplectic Euler method (2.11), the vector $p$ after timestep $h$ is

\[ p_{n+1} = p_n - h \nabla_q H(q_n, p_{n+1}) \tag{2.12} \]

while the exact result may be written as a Taylor series

\[ p(t_{n+1}) = p_n + h \dot{p}_n + \frac{h^2}{2} \ddot{p}_n + O(h^3) \tag{2.13} \]

Substituting Hamilton's equation (2.9), equation (2.13) becomes

\[ \begin{aligned} p(t_{n+1}) &= \underbrace{p_n - h \nabla_q H(q_n, p_{n+1})}_{p_{n+1}} + \frac{h^2}{2} \ddot{p}_n + O(h^3) \\ &= p_{n+1} + \frac{h^2}{2} \ddot{p}_n + O(h^3) \\ p(t_{n+1}) - p_{n+1} &= \frac{h^2}{2} \ddot{p}_n + O(h^3) \end{aligned} \tag{2.14} \]

The local truncation error (LTE) of the $p$-step is seen to be $O(h^2)$. The vector $q$ after timestep $h$ is

\[ q_{n+1} = q_n + h \nabla_p H(q_n, p_{n+1}) \tag{2.15} \]

and the exact result can be written as a Taylor series

\[ q(t_{n+1}) = q_n + h \dot{q}_n + \frac{h^2}{2} \ddot{q}_n + O(h^3) \tag{2.16} \]

Substituting Hamilton's equation (2.8), equation (2.16) becomes

\[ \begin{aligned} q(t_{n+1}) &= \underbrace{q_n + h \nabla_p H(q_n, p_{n+1})}_{q_{n+1}} + \frac{h^2}{2} \ddot{q}_n + O(h^3) \\ &= q_{n+1} + \frac{h^2}{2} \ddot{q}_n + O(h^3) \\ q(t_{n+1}) - q_{n+1} &= \frac{h^2}{2} \ddot{q}_n + O(h^3) \end{aligned} \tag{2.17} \]

The local truncation error (LTE) of the $q$-step is $O(h^2)$. Therefore, the symplectic Euler method is a first-order integrator. The methods (2.10) and (2.11) are implicit for general Hamiltonian systems. For separable $H(p, q) = T(p) + U(q)$, however, both variants turn out to be explicit. There are more general situations where the symplectic Euler methods are explicit. If, for a suitable ordering of the components,

\[ \frac{\partial H}{\partial q_i}(q, p) \ \text{does not depend on } p_j \text{ for } j \ge i, \]

then SE2 is explicit, and the components of $p_{n+1}$ can be computed one after the other. If, for a possibly different ordering of the components,

\[ \frac{\partial H}{\partial p_i}(q, p) \ \text{does not depend on } q_j \text{ for } j \ge i, \]

then SE1 is explicit [15].
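For a separable Hamiltonian $H = p \cdot p / (2m) + U(q)$ both variants reduce to two explicit updates, which the following minimal sketch writes out. The function names and the `mass` parameter are hypothetical, and `grad_U` stands for any user-supplied gradient of the potential.

```python
import numpy as np


def symplectic_euler_se1(q, p, grad_U, h, mass=1.0):
    """One step of (2.10): q is advanced with the old p, then p with the new q."""
    q_new = q + h * p / mass
    p_new = p - h * grad_U(q_new)
    return q_new, p_new


def symplectic_euler_se2(q, p, grad_U, h, mass=1.0):
    """One step of (2.11): p is advanced with the old q, then q with the new p."""
    p_new = p - h * grad_U(q)
    q_new = q + h * p_new / mass
    return q_new, p_new


def energy_history(step, q, p, grad_U, U, h, n_steps, mass=1.0):
    """Record H = p.p/(2m) + U(q) along a trajectory produced by `step`."""
    energies = []
    for _ in range(n_steps):
        q, p = step(q, p, grad_U, h, mass)
        energies.append(0.5 * np.dot(p, p) / mass + U(q))
    return np.array(energies)
```

As an assumed test case, running `energy_history` on the harmonic oscillator $U(q) = q \cdot q / 2$ with $h = 0.1$ shows the energy oscillating within a few percent of its initial value without drifting, which is the behaviour expected of a first-order symplectic method.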

2.2.2 The Störmer-Verlet Integrator

This method is defined as the composition of the two symplectic Euler methods (2.10) and (2.11). The first version of the Verlet method is obtained by taking the composition of SE1 with timestep $h/2$ and SE2 with timestep $h/2$:

\[ \text{Verlet1}: \quad SE1_{h/2} \circ SE2_{h/2} \]

\[ \begin{aligned} p_{n+1/2} &= p_n - \frac{h}{2} \nabla_q H(q_n, p_{n+1/2}) \\ q_{n+1} &= q_n + \frac{h}{2} \Big( \nabla_p H(q_n, p_{n+1/2}) + \nabla_p H(q_{n+1}, p_{n+1/2}) \Big) \\ p_{n+1} &= p_{n+1/2} - \frac{h}{2} \nabla_q H(q_{n+1}, p_{n+1/2}) \end{aligned} \tag{2.18} \]

The composition of SE2 with timestep $h/2$ and SE1 with timestep $h/2$ is the second version of the Verlet method:

\[ \text{Verlet2}: \quad SE2_{h/2} \circ SE1_{h/2} \]

\[ \begin{aligned} q_{n+1/2} &= q_n + \frac{h}{2} \nabla_p H(q_{n+1/2}, p_n) \\ p_{n+1} &= p_n - \frac{h}{2} \Big( \nabla_q H(q_{n+1/2}, p_n) + \nabla_q H(q_{n+1/2}, p_{n+1}) \Big) \\ q_{n+1} &= q_{n+1/2} + \frac{h}{2} \nabla_p H(q_{n+1/2}, p_{n+1}) \end{aligned} \tag{2.19} \]

Another approach to obtain the Verlet integrator is given below.

2.2.3 Interpretation as splitting method

Split the Hamiltonian $H$ into sub-Hamiltonians $H_p$ and $H_q$ such that $H = H_p + H_q$, where $H_p = T(p) = \dfrac{p \cdot p}{2m}$ and $H_q = \Phi(q)$. Then, Hamilton's equations for $H_p$ are

\[ \dot{q} = p, \qquad \dot{p} = 0 \tag{2.20} \]

and Hamilton's equations for $H_q$ are

\[ \dot{q} = 0, \qquad \dot{p} = -\nabla \Phi(q) \tag{2.21} \]

Hamilton's equations for $H_p$ and $H_q$ are called the drift and kick equations, respectively. The exact flows $\varphi^{[D]} : (p, q) \to (\bar{p}, \bar{q})$ and $\varphi^{[K]} : (p, q) \to (\bar{p}, \bar{q})$ of the drift and kick equations, each of which has a constant time derivative, are

\[ \varphi^{[D]}_h : \quad \begin{cases} \bar{q} = q + h p \\ \bar{p} = p \end{cases} \tag{2.22} \]

and

\[ \varphi^{[K]}_h : \quad \begin{cases} \bar{q} = q \\ \bar{p} = p - h \nabla \Phi(q) \end{cases} \tag{2.23} \]

The flow $\varphi^{[D]}$ is called the drift step because the position changes but the momentum does not, and the flow $\varphi^{[K]}$ is called the kick step because the momentum changes but the position does not. The SE and Verlet integrators can be obtained by composing kick and drift steps, where the rightmost map in a composition is applied first. With timestep $h/2$, the method SE2 of (2.11), or "kick-drift" Euler integrator, is a kick followed by a drift:

\[ SE2_{h/2} = \varphi^{[D]}_{h/2} \circ \varphi^{[K]}_{h/2} \tag{2.24} \]

\[ \bar{p} = p - \frac{h}{2} \nabla \Phi(q), \qquad \bar{q} = q + \frac{h}{2} \bar{p} \tag{2.25} \]

The method SE1 of (2.10), or "drift-kick" Euler integrator, is a drift followed by a kick:

\[ SE1_{h/2} = \varphi^{[K]}_{h/2} \circ \varphi^{[D]}_{h/2} \tag{2.26} \]

\[ \bar{q} = q + \frac{h}{2} p, \qquad \bar{p} = p - \frac{h}{2} \nabla \Phi(\bar{q}) \tag{2.27} \]

These formulae are precisely those which build up the formulae SE1 and SE2 above. The two versions of the Verlet method are obtained by the formulae

\[ \text{Verlet1} = \varphi^{[K]}_{h/2} \circ \varphi^{[D]}_{h} \circ \varphi^{[K]}_{h/2} \tag{2.28} \]

\[ p_{1/2} = p - \frac{h}{2} \nabla \Phi(q); \qquad \bar{q} = q + h\, p_{1/2}; \qquad \bar{p} = p_{1/2} - \frac{h}{2} \nabla \Phi(\bar{q}) \]

\[ \text{Verlet2} = \varphi^{[D]}_{h/2} \circ \varphi^{[K]}_{h} \circ \varphi^{[D]}_{h/2} \tag{2.29} \]

\[ q_{1/2} = q + \frac{h}{2} p; \qquad \bar{p} = p - h \nabla \Phi(q_{1/2}); \qquad \bar{q} = q_{1/2} + \frac{h}{2} \bar{p} \]

These algorithms are sometimes called the "kick-drift-kick" and "drift-kick-drift" Verlet integrators, respectively.

Both versions of the Verlet integrator are symplectic, since the composition of symplectic mappings is symplectic. The LTE in the $q$-step of the Verlet integrator is $O(h^4)$ and the LTE in the $p$-step is $O(h^2)$ [30]. In dynamics simulations, the global error (GE) is typically far more important than the LTE. The GE of the $q$ and $p$ steps of the Verlet integrator is $O(h^2)$ [15]. Thus, the Verlet integrator is an explicit second-order integrator. The Verlet integrator is popular in stellar dynamics because of the following advantageous features:

(i) The Verlet integrator is second-order accurate

(ii) It is symmetric with respect to the direction of time. This leads to an important geometric property of the numerical integrator in phase space, called reversibility.

(iii) It is symplectic. Therefore, the GE in energy does not grow with time [41][7].

(iv) It needs no storage of previous timesteps

One of the major limitations of this method is that it works well only with fixed timesteps. It is no longer symplectic when the timestep is varied.
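The kick-drift-kick form (2.28) with a fixed timestep translates directly into a short loop. The sketch below is illustrative rather than the implementation used for the experiments in this thesis: the function name `verlet_kdk` is hypothetical, unit mass is assumed so that the drift is $\bar{q} = q + h p$ as in (2.22), and the force is recomputed at the start of every step for clarity rather than reused from the previous step.

```python
import numpy as np


def verlet_kdk(q, p, grad_phi, h, n_steps):
    """Advance (q, p) by n_steps kick-drift-kick Verlet steps, equation (2.28)."""
    q = np.array(q, dtype=float)
    p = np.array(p, dtype=float)
    for _ in range(n_steps):
        p_half = p - 0.5 * h * grad_phi(q)       # half kick with the old position
        q = q + h * p_half                       # full drift with the half-step momentum
        p = p_half - 0.5 * h * grad_phi(q)       # half kick with the new position
    return q, p


def global_error_oscillator(h, t_end=10.0):
    """Global error at t_end for Phi(q) = q^2/2, whose exact solution is (cos t, -sin t)."""
    n_steps = int(round(t_end / h))
    q, p = verlet_kdk([1.0], [0.0], lambda x: x, h, n_steps)
    return float(np.hypot(q[0] - np.cos(t_end), p[0] + np.sin(t_end)))
```

Comparing `global_error_oscillator(0.1)` with `global_error_oscillator(0.05)` on this assumed test problem gives a ratio close to 4, consistent with a global error of $O(h^2)$.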

2.3 Individual timesteps

The crossing time, the time needed for a typical star to cross the galaxy once, is much smaller for orbits near the center than for orbits in the outer envelope, because the density in a stellar system varies. For example, in a typical globular cluster the crossing time at the center is $\le 1$ Myr, while the crossing time near the tidal radius, the outer limit of the cluster where the density drops to zero, is $\approx 100$ Myr. In order to exploit this feature, and economize on the expensive force calculation, each particle is assigned its own timestep, the individual timestep, which is related to its orbital time scale. Thus, the aim is to ensure convergence of the force with the minimum number of force evaluations.

The individual timestep scheme is easiest to explain if the integrator is based on an interpolating polynomial. To advance a particle, the most recent interpolating polynomials of all the other particles are used to predict their locations, and then the forces between the given particle and the other particles are evaluated. The direct summation Poisson solver makes this possible. The timestep chosen for the given particle, the $i$th particle, is

\[ h_i = \sqrt{\eta\, \frac{|F_i|\,|F_i^{(2)}| + |F_i^{(1)}|^2}{|F_i^{(1)}|\,|F_i^{(3)}| + |F_i^{(2)}|^2}} \tag{2.30} \]

where $F_i$ is the force on particle $i$ (the total force acting on it), and $F_i^{(1)}$, $F_i^{(2)}$ and $F_i^{(3)}$ are the first, second and third time derivatives of the force [1]. The parameter $\eta$ controls the accuracy of the integration, and the commonly used value is $\eta = 0.02$ [1]. Depending on the density profile, the individual timestep (ITS) scheme reduces the computational complexity from $O(N^2)$ to $O(N^{4/3})$, and a larger gain can be achieved with centrally concentrated systems [28].

In practice, in order to group the particles according to their timesteps, the block timestep scheme is often employed, permitting particles in the same timestep group to be advanced at the same time [16],[17]. The timestep defined in the block timestep scheme is

\[ h_n = \frac{h_{\max}}{2^{\,n-1}} \tag{2.31} \]

where $n$ is the level of integration and $h_{\max}$ is a predefined maximum timestep. In general, $n$ can be any number of levels. However, it is rare for more than about 12 levels to be populated in a realistic simulation with $N \le 1000$, increasing by a few levels for $N \simeq 10^4$ [1].
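The two formulas above can be combined into a small sketch: evaluate the criterion (2.30), then place the particle on the coarsest block level (2.31) whose timestep does not exceed it. The function names are hypothetical, and the rule of rounding the timestep down to the next power-of-two level is an assumption made for illustration; the criterion used to assign levels in the block timestep algorithm of chapter 4 may differ.

```python
import math

import numpy as np


def aarseth_timestep(F, F1, F2, F3, eta=0.02):
    """Timestep criterion (2.30) from the force and its first three time derivatives."""
    f, f1, f2, f3 = (np.linalg.norm(v) for v in (F, F1, F2, F3))
    return math.sqrt(eta * (f * f2 + f1 ** 2) / (f1 * f3 + f2 ** 2))


def block_level(h_desired, h_max):
    """Smallest level n >= 1 with h_max / 2**(n - 1) <= h_desired."""
    if h_desired <= 0 or h_max <= 0:
        raise ValueError("timesteps must be positive")
    return max(1, 1 + math.ceil(math.log2(h_max / h_desired)))


def block_timestep(n, h_max):
    """Block timestep h_n = h_max / 2**(n - 1), equation (2.31)."""
    return h_max / 2 ** (n - 1)
```

For instance, with `h_max = 1.0` a desired timestep of 0.3 is assigned to level 3, giving a block timestep of 0.25. A desired timestep larger than `h_max` is assigned to level 1, so no particle is advanced with a step longer than the maximum.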


CHAPTER 3 COMPUTATION OF POTENTIAL

A galaxy is a bound system consisting of stars, gas, dust, and dark matter. Depending upon their visual appearance, galaxies are categorized mainly into three types:

1. Elliptical Galaxy

2. Spiral Galaxy

3. Irregular Galaxy

The pioneering work of classifying galaxies was done by Edwin Hubble (1926). Using several photographic images, he arranged the different groups of galaxies in what became known as the Hubble sequence. The number of stars in a galaxy varies from $10^7$ to $10^{12}$, and they travel around the galaxy under the force of gravity. Stars are much denser than the interstellar gas, and therefore their mass distribution is used to find the gravitational force, from which the orbits of stars can be calculated.

3.1 Motion Under Gravity

Newton's law of universal gravitation states that a particle attracts every other particle in the universe with a force which is directly proportional to the product of their masses and inversely proportional to the square of the distance between their centers. Thus, the force between two point masses $m_1$ and $m_2$ with position vectors $x_1$ and $x_2$ takes the form

\[ \vec{F}(r) = -\frac{G m_1 m_2}{r^2}\, \hat{r} \tag{3.1} \]

where $\mathbf{r} = x_1 - x_2$, $r = |\mathbf{r}|$ and $\hat{r} = \mathbf{r}/r$. Summing the contributions of all mass elements, the force on a point mass $m$ at position $x$ caused by a density distribution $\rho(x')$ is

\[ F(x) = -G m \int \frac{x - x'}{|x - x'|^3}\, \rho(x')\, d^3x' \tag{3.2} \]

The gravitational potential at position $x$ is defined by the integral over the density $\rho(x')$:

\[ \Phi(x) \equiv -G \int \frac{\rho(x')}{|x - x'|}\, d^3x', \tag{3.3} \]

and therefore the force per unit mass becomes

\[ F = -\nabla \Phi \tag{3.4} \]

Since, for $x \neq x'$,

\[ \nabla \left( \frac{1}{|x - x'|} \right) = -\frac{x - x'}{|x - x'|^3} \quad \text{and} \quad \nabla^2 \left( \frac{1}{|x - x'|} \right) = 0, \tag{3.5} \]

the differential form of the potential, Poisson's equation, is obtained by applying $\nabla^2$ to both sides of equation (3.3):

\[ \nabla^2 \Phi = 4 \pi G \rho \tag{3.6} \]

In general, it is extremely complicated to solve the Poisson equation for $\Phi$ given $\rho$. However, under certain conditions, solutions to the Poisson equation are fairly straightforward. In particular, under spherical symmetry, Poisson's equation can be solved using Newton's theorems.

3.2 Potentials of Spherical Systems

Newton proved two useful theorems about the gravitational field of a spherical galaxy:

(i) The gravitational force inside a spherical shell of uniform density is zero

(ii) The gravitational force outside any spherically symmetric object is the same as if all its mass had been concentrated at the center

From Newton's two theorems, it follows that the gravitational force toward the center at radius $r$ in a spherical system with density distribution $\rho(r')$ is the sum of the inward forces from all the mass inside that radius.

17 Texas Tech University, Bimali Jayasinghe, August 2018

Thus, the gravitational force on a unit mass at radius r is,

GM(r) F (r) = − rˆ (3.7) r2 where Z r M(r) = 4π y2ρ(y)dy (3.8) 0 The total gravitational potential at r is obtained by adding the contributions to the potential produced by shells with inside of r and with outside of r:

G Z r Z ∞ dM(y) Φ(r) = − dM(y) − G r y 0 r (3.9) h1 Z r Z ∞ i = −4πG y2ρ(y) dy + y ρ(y)dy r 0 r

Note that Poisson's equation (3.6) is recovered by taking $\nabla^{2}$ of both sides of equation (3.9), and therefore equation (3.9) gives the solution of the Poisson equation for a spherically symmetric density. The centripetal acceleration $v_c^{2}(r)/r$ of a star moving with the circular speed $v_c(r)$, the speed of a star in a circular orbit, is provided by the gravitational force. Therefore, the circular speed $v_c(r)$ satisfies

\[ v_c^{2}(r) = \frac{G M(r)}{r} \tag{3.10} \]

The circular speed measures the mass interior to $r$. Another important quantity is the escape speed $v_e$, the speed with which a star starting at $r$ reaches $r \to \infty$ with speed 0, defined by

\[ v_e(r) \equiv \sqrt{2|\Phi(r)|} \tag{3.11} \]

The escape speed at $r$ depends on the mass both inside and outside $r$. The Poisson equation (3.6) expresses the relationship between the potential $\Phi(x)$ and the density distribution $\rho(x)$. Thus, for a given potential $\Phi$, the corresponding density distribution can be calculated, and then equation (3.8) can be used to find the corresponding mass distribution in a spherically symmetric system. Some commonly used potentials are given below.

(a) Point Mass. The potential at distance $r$ from a point mass $M$ is

\[ \Phi(r) = -\frac{GM}{r} \tag{3.12} \]

This is called the Keplerian potential.

(b) The Plummer Model. A Plummer model, or Plummer sphere, is a simple model for spherical galaxies which was first used by H. C. Plummer to fit observations of globular clusters. The gravitational potential is

\[ \Phi(r) = -\frac{GM}{\sqrt{a^{2}+r^{2}}} \tag{3.13} \]

where $a$ is the Plummer scale length. The Laplacian of the above function in spherical coordinates is

\[ \nabla^{2}\Phi = \frac{1}{r^{2}}\frac{d}{dr}\left(r^{2}\frac{d\Phi}{dr}\right) = \frac{3GMa^{2}}{(r^{2}+a^{2})^{5/2}} \tag{3.14} \]

Thus, from Poisson's equation, $\nabla^{2}\Phi = 4\pi G\rho$, the density corresponding to the potential (3.13) is

\[ \rho(r) = \frac{3Ma^{2}}{4\pi}\,\frac{1}{(r^{2}+a^{2})^{5/2}} \tag{3.15} \]
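The Plummer pair (3.13) and (3.15) can be checked numerically against Poisson's equation. The following is a minimal sketch, not part of the original analysis; the units ($G = M = a = 1$), the function names and the finite-difference step are illustrative assumptions.

```python
import math

G = 1.0  # gravitational constant in illustrative units (assumption)


def plummer_potential(r, M=1.0, a=1.0):
    """Plummer potential, equation (3.13)."""
    return -G * M / math.sqrt(a * a + r * r)


def plummer_density(r, M=1.0, a=1.0):
    """Plummer density, equation (3.15)."""
    return 3.0 * M * a * a / (4.0 * math.pi * (r * r + a * a) ** 2.5)


def radial_laplacian(phi, r, dr=1e-3):
    """Central-difference estimate of (1/r^2) d/dr (r^2 dphi/dr)."""
    def flux(s):
        # r^2 * dphi/dr evaluated at radius s
        return s * s * (phi(s + 0.5 * dr) - phi(s - 0.5 * dr)) / dr
    return (flux(r + 0.5 * dr) - flux(r - 0.5 * dr)) / (dr * r * r)


def poisson_residual(r):
    """Relative mismatch between the Laplacian of Phi and 4*pi*G*rho at radius r."""
    lhs = radial_laplacian(plummer_potential, r)
    rhs = 4.0 * math.pi * G * plummer_density(r)
    return abs(lhs - rhs) / rhs
```

A small residual at several radii indicates that the two formulas are consistent; the size of the residual is set by the finite-difference step rather than by the model.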

(c) Homogeneous Sphere. For a constant density $\rho$, the mass distribution in a homogeneous sphere is

\[ M(r) = \frac{4}{3}\pi r^{3}\rho \tag{3.16} \]

From equation (3.9), the gravitational potential of a homogeneous sphere of radius $a$ is

\[ \Phi(r) = \begin{cases} -2\pi G\rho\left(a^{2}-\tfrac{1}{3}r^{2}\right), & r < a \\[4pt] -\dfrac{4\pi G\rho a^{3}}{3r}, & r > a \end{cases} \tag{3.17} \]

The circular velocity, $v_c$, is

\[ v_c = \sqrt{\frac{4\pi G\rho}{3}}\; r \tag{3.18} \]

and it increases linearly with radius. The orbital period is then determined by the density $\rho$ alone,

\[ T = \frac{2\pi r}{v_c} = \sqrt{\frac{3\pi}{G\rho}} \tag{3.19} \]

and it is independent of the radius of the orbit. The inverse of the angular frequency of a circular orbit is

\[ \frac{r}{v_c} = \sqrt{\frac{3}{4\pi G\rho}} = 0.4886\,(G\rho)^{-1/2} \tag{3.20} \]

If a mass is released from rest in the gravitational field of a homogeneous sphere, then its equation of motion is given by the gravitational acceleration

\[ \frac{d^{2}r}{dt^{2}} = -\frac{GM(r)}{r^{2}} = -\frac{4\pi G\rho}{3}\,r \tag{3.21} \]

which is the equation of motion of a harmonic oscillator ($\ddot{x} = -\omega^{2}x$) with oscillation period $2\pi/\omega = \sqrt{3\pi/(G\rho)}$. This is the same time as is required for a full circular orbit in equation (3.19). Then, no matter what the initial value of $r$ is, the time required for any particle to fall to the center ($r = 0$) is

\[ \frac{T}{4} = 0.767\,(G\rho)^{-1/2} \tag{3.22} \]

The times in equations (3.20) and (3.22) are similar, and this shows that the time taken for a particle to complete a significant fraction of an orbit is $\sim (G\rho)^{-1/2}$. This result also holds approximately for inhomogeneous systems if $\rho$ is replaced by the mean density $\bar{\rho}$ inside the location of the particle. Thus, the estimate of the crossing time (sometimes called the dynamical time) is

\[ t_{\mathrm{cross}} \simeq t_{\mathrm{dyn}} \simeq (G\bar{\rho})^{-1/2} \tag{3.23} \]

This relation is used for the characterization of systems like open clusters, globular clusters, bulges of galaxies, or clusters of galaxies.
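The numerical factors in (3.20) and (3.22) can be reproduced directly, and the claim that the fall time does not depend on the starting radius can be checked by integrating (3.21). This is a minimal sketch under illustrative assumptions: the function names, the leapfrog integrator and the step size are not taken from the text.

```python
import math


def circular_period(G_rho):
    """Orbital period of equation (3.19) for a given value of G*rho."""
    return math.sqrt(3.0 * math.pi / G_rho)


def inverse_angular_frequency(G_rho):
    """r / v_c of equation (3.20)."""
    return math.sqrt(3.0 / (4.0 * math.pi * G_rho))


def free_fall_time(G_rho):
    """T/4 of equation (3.22)."""
    return circular_period(G_rho) / 4.0


def fall_time_numeric(G_rho, r0, dt=1e-4):
    """Integrate equation (3.21) from rest at r0 until r first reaches zero."""
    omega2 = 4.0 * math.pi * G_rho / 3.0
    r, v, t = r0, 0.0, 0.0
    while True:
        v_half = v - 0.5 * dt * omega2 * r
        r_new = r + dt * v_half
        v_new = v_half - 0.5 * dt * omega2 * r_new
        if r_new <= 0.0:
            # linear interpolation of the zero crossing inside the last step
            return t + dt * r / (r - r_new)
        r, v, t = r_new, v_new, t + dt
```

For example, `fall_time_numeric(1.0, 1.0)` and `fall_time_numeric(1.0, 5.0)` should agree with each other and with `free_fall_time(1.0)` up to the integration error.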

(d) Two-power density model. The total amount of energy emitted per unit time by a star, galaxy, or other astronomical object is called its luminosity. Many galaxies have luminosity profiles which can be fitted with power-law profiles. Several investigations show that the density distributions of spherical systems can be described by a power law of the form [8]

\[ \rho(r) = \rho_0\left(\frac{r_0}{r}\right)^{\alpha} \tag{3.24} \]

By equation (3.8), the mass inside $r$ is

\[ M(r) = \frac{4\pi\rho_0 r_0^{\alpha}}{3-\alpha}\,r^{3-\alpha} \tag{3.25} \]

The integral in (3.8) diverges at small radii if $\alpha \geq 3$, and therefore we assume that $\alpha < 3$; in that case $M(r)$ grows without bound as $r \to \infty$. The above density model can be improved to the following model, called the two-power density model, so that the total mass remains finite for large $r$. The density distribution of the two-power density model is

\[ \rho(r) = \frac{\rho_0}{(r/a)^{\alpha}(1+r/a)^{\beta-\alpha}} \tag{3.26} \]

where $a$ is a scaling radius. Note that the mass of the above model is finite at small radii when $\alpha < 3$ and at large radii when $\beta > 3$ (for $\beta = 3$ it diverges only logarithmically). The popular cases of the two-power density model are:

• $\beta = 4$: Dehnen models
• $\beta = 4$ and $\alpha = 1$: Hernquist model
• $\beta = 4$ and $\alpha = 2$: Jaffe model
• $\beta = 3$ and $\alpha = 1$: NFW model

Dehnen models with $\alpha$ in the range (0.6, 2) provide reasonable models of the centers of elliptical galaxies, while the NFW model is used to model dark halos [8]. The Hernquist model is a very useful tool for investigating the dynamical structure of galaxies. Since the distribution function of the Hernquist model is known analytically for both isotropic and anisotropic velocity dispersions, it can be used to construct equilibrium N-body realizations and to perform linear stability analyses [19].
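As an illustration of how (3.8) is applied to the two-power profile (3.26), the sketch below integrates the density numerically and compares the result with the closed form for the Hernquist case ($\alpha = 1$, $\beta = 4$), $M(r) = 2\pi\rho_0 a^{3} r^{2}/(r+a)^{2}$, which follows from evaluating the integral by hand. The function names, the midpoint rule and the number of quadrature points are assumptions made for this example only.

```python
import math


def two_power_density(r, rho0, a, alpha, beta):
    """Two-power density profile, equation (3.26)."""
    x = r / a
    return rho0 / (x ** alpha * (1.0 + x) ** (beta - alpha))


def enclosed_mass(r, rho0, a, alpha, beta, n=20000):
    """Midpoint-rule approximation of M(r) in equation (3.8)."""
    h = r / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        total += y * y * two_power_density(y, rho0, a, alpha, beta)
    return 4.0 * math.pi * h * total


def hernquist_mass(r, rho0, a):
    """Closed form of M(r) for alpha = 1, beta = 4."""
    return 2.0 * math.pi * rho0 * a ** 3 * r * r / (r + a) ** 2
```

The closed form tends to the finite total mass $2\pi\rho_0 a^{3}$ as $r \to \infty$, in line with the requirement $\beta > 3$.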


3.3 Potential-density pairs for other systems

Several other potential-density pairs have been constructed to model other types of galaxies. Disk galaxies are almost certainly close to axisymmetric (though highly flattened), and flattened potentials are used to model them. For axisymmetric systems, the suitable coordinates are the cylindrical coordinates $(R, \theta, z)$, and $\Phi = \Phi(R, z)$. Potential-density pairs for a few flattened systems are listed here:

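MODEL: Miyamoto-Nagai
  POTENTIAL $\Phi(R,z)$: $-\dfrac{GM}{\sqrt{R^{2}+\left(a+\sqrt{z^{2}+b^{2}}\right)^{2}}}$
  DENSITY $\rho(R,z)$: $\left(\dfrac{b^{2}M}{4\pi}\right)\dfrac{aR^{2}+\left(a+3\sqrt{z^{2}+b^{2}}\right)\left(a+\sqrt{z^{2}+b^{2}}\right)^{2}}{\left[R^{2}+\left(a+\sqrt{z^{2}+b^{2}}\right)^{2}\right]^{5/2}(z^{2}+b^{2})^{3/2}}$

MODEL: Logarithmic
  POTENTIAL $\Phi(R,z)$: $\dfrac{1}{2}v_0^{2}\ln\left(R_c^{2}+R^{2}+\dfrac{z^{2}}{q_\phi^{2}}\right)$
  DENSITY $\rho(R,z)$: $\dfrac{v_0^{2}}{4\pi G q_\phi^{2}}\,\dfrac{(2q_\phi^{2}+1)R_c^{2}+R^{2}+(2-q_\phi^{-2})z^{2}}{\left(R_c^{2}+R^{2}+z^{2}q_\phi^{-2}\right)^{2}}$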

Note that in the logarithmic model $R_c$ and $v_0$ are constants and $q_\phi$ is the axis ratio of the equipotentials. A spiral galaxy is also flat, and can be represented by a flattened potential. The disk of a spiral galaxy is generally very flat, and we will describe it as infinitely thin. Thin disks are assumed to be axisymmetric and are described by a surface mass density $\Sigma(R)$. A simple axisymmetric potential model is the Kuzmin model (1956),

\[ \Phi(R,z) = -\frac{GM}{\sqrt{R^{2}+(a+|z|)^{2}}} \tag{3.27} \]

for which the corresponding surface density is

\[ \Sigma(R) = \frac{aM}{2\pi(R^{2}+a^{2})^{3/2}} \tag{3.28} \]

A galaxy is not always a simple structure to model. However, for many purposes it is enough to represent a galaxy by a simple model that has the same gross structure as the galaxy.


3.4 Potential from Functional Expansions

The common theme of many models is to describe the galaxy using potential-density pairs. A richer and more flexible way of representing the potential and density of galaxies is to use basis functions. The advantage of this approach is that the potential and density can be approximated to arbitrarily high accuracy. We seek series solutions to the Poisson equation $\nabla^{2}\Phi = 4\pi G\rho$. We find pairs of basis functions $\Phi_\beta(x)$ and $\rho_\beta(x)$, for $\beta = 1, 2, \ldots$, that satisfy

\[ \nabla^{2}\Phi_\beta = 4\pi G\rho_\beta \tag{3.29} \]

and determine coefficients $a_\beta$ by writing the density and potential as the sums

\[ \rho(x) = \sum_\beta a_\beta\,\rho_\beta(x) \tag{3.30} \]

\[ \Phi(x) = \sum_\beta a_\beta\,\Phi_\beta(x). \tag{3.31} \]

To find the coefficients $a_\beta$, multiply equation (3.30) by $-\Phi_\alpha^{*}$ and integrate over all space to obtain

\[ s_\alpha = \sum_\beta M_{\alpha\beta}\,a_\beta, \tag{3.32} \]

where

\[ s_\alpha = -\int d^{3}x\,\Phi_\alpha^{*}(x)\,\rho(x), \qquad M_{\alpha\beta} = -\int d^{3}x\,\Phi_\alpha^{*}(x)\,\rho_\beta(x) \tag{3.33} \]

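Consider $M_{\alpha\beta}$ in equation (3.33). Using Poisson's equation and applying the divergence theorem, it becomes

\[ M_{\alpha\beta} = -\frac{1}{4\pi G}\int d^{3}x\,\Phi_\alpha^{*}(x)\,\nabla^{2}\Phi_\beta(x) = -\frac{1}{4\pi G}\oint \Phi_\alpha^{*}(x)\,\nabla\Phi_\beta(x)\cdot d^{2}S + \frac{1}{4\pi G}\int d^{3}x\,\nabla\Phi_\alpha^{*}(x)\cdot\nabla\Phi_\beta(x) \tag{3.34} \]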


This shows that $M_{\alpha\beta}^{*} = M_{\beta\alpha}$, since the first term (the surface term) vanishes when the integral is taken over all space. So, if $M$ is the matrix with entries $M_{\alpha\beta}$, then it is Hermitian. An important advantage of $M$ is that it does not depend on the galactic mass distribution, so it can be computed once the basis potentials $\Phi_\alpha$ have been chosen.

The coefficients $a_\beta$ can now be found by solving the linear system (3.32). Two kinds of basis functions simplify this process:

(i) Bi-orthonormal basis function

(ii) Designer basis function

Bi-orthonormal basis functions are chosen such that $M$ is the unit matrix. Then the second equation of (3.33) gives

\[ -\int d^{3}x\,\Phi_\alpha^{*}(x)\,\rho_\beta(x) = \delta_{\alpha\beta} \tag{3.35} \]

which guarantees the bi-orthonormality of the basis functions. When $M$ is the unit matrix, equation (3.32) has the trivial solution $a_\alpha = s_\alpha$. Bi-orthonormal basis functions can easily be obtained by taking eigenfunctions of the Hermitian operator $\nabla^{2}$ as the $\Phi_\alpha$. In this case $\rho_\alpha \propto \Phi_\alpha$, and the orthogonality of $\Phi_\alpha$ and $\rho_\beta$ is assured by the well-known spectral theorem: eigenfunctions of a Hermitian operator belonging to distinct eigenvalues are mutually orthogonal.

In general, it is hard to find $a$ and $s$ since they are vectors of infinite dimension, so it is necessary to work with finite-dimensional vectors and matrices. The idea of designer basis functions is to choose the basis functions such that the galaxy can be accurately represented by the smallest possible number of them. One approach to finding this type of basis function is described by an example. Spherical harmonics are eigenfunctions of the angular part of $\nabla^{2}$. Therefore, one obvious form of designer basis function is

\[ \Phi_{nlm}(x) = Y_{lm}(\theta,\phi)\,F_{nm}(r) \tag{3.36} \]

Since the spherical harmonics $Y_{lm}(\theta,\phi)$ are orthogonal, $M$ is block diagonal and equation (3.32) can be solved. The basis sets given by Clutton-Brock (1973), Polyachenko & Shukhman (1981) and Hernquist & Ostriker (1992) are of the type (3.36), with the $F_{nm}$ being eigenfunctions of the radial part of $\nabla^{2}$.


3.5 Poisson solvers for N-body code

A stellar system contains many particles, and therefore N-body codes, computer programs that follow the motion of a large number of masses under their mutual gravitational attraction, are widely used. Producing a code that efficiently calculates the gravitational forces on a large number of bodies is a major challenge. In a collisional or collisionless N-body simulation, the gravitational force on each particle is derived from the current positions of the particles and is then used to advance the position and momentum of each particle for a short time, after which new forces are found. Poisson solvers are used to calculate the force. The most important types of Poisson solvers are described below.

3.5.1 Direct Summation

The force on particle $i$ can be evaluated by summing the contributions from all the other particles as follows:

\[ \vec{F}_i = \sum_{j\neq i}\frac{G m_j}{\|r_j - r_i\|^{3}}\,(r_j - r_i) \tag{3.37} \]

Each force requires the $N-1$ distances $|r_i - r_j|$, and each distance can be used twice: for the force of particle $i$ on particle $j$ and of particle $j$ on particle $i$. Therefore, a minimum of $\tfrac{1}{2}N(N-1)$ distance computations is needed to evaluate the forces on all the particles. Thus, the work per timestep increases with $N$ as $N^{2}$. The force is singular when particles $i$ and $j$ approach each other closely; that is, if the distance between two particles is very small, the force becomes very large. The force can be softened to remove the singularity at small distances:

\[ \vec{F}_i = \sum_{j\neq i} G m_j\,S_F(|r_j - r_i|)\,\frac{r_j - r_i}{|r_j - r_i|} \tag{3.38} \]

where $S_F$ is called the force softening kernel. The commonly used form of $S$ is

\[ S(r) = -\frac{1}{\sqrt{r^{2}+\epsilon^{2}}} \tag{3.39} \]

where $\epsilon$ is the softening length.

The largest direct summation simulations are still limited to $N \leq 10^{6}$. The largest sustained N-body model of a globular star cluster is Heggie's simulation. It took about 2 years and 8 months to simulate N = 484,710 stars, using the code NBODY6 [18].
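The pairwise structure of the direct sum can be sketched as follows. This is an illustrative example rather than the method of any particular code: it assumes that the force softening kernel is the radial derivative of (3.39), $S_F(r) = r/(r^{2}+\epsilon^{2})^{3/2}$, and the function name, argument layout and unit choice $G = 1$ are likewise assumptions.

```python
import math


def direct_accelerations(pos, mass, G=1.0, eps=0.0):
    """Softened accelerations by direct summation over all pairs.

    pos  : list of [x, y, z] positions
    mass : list of masses
    eps  : softening length (eps = 0 recovers equation (3.37))
    Each pair distance is computed once and used for both particles,
    so the loop visits N(N-1)/2 pairs.
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3  # pull of j on i
                acc[j][k] -= G * mass[i] * d[k] * inv_r3  # pull of i on j
    return acc
```

Because every pair contributes equal and opposite momentum changes, the mass-weighted sum of the returned accelerations vanishes up to round-off, which is a convenient sanity check.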

3.5.2 Tree Code

Barnes & Hut introduced a structure to organize particles: place an imaginary cube (called the root node) around the simulation volume and divide it into eight sub-cubes called nodes (or cells). If more than one particle is assigned to any cell, it is split into eight daughter cells, and this procedure is continued recursively until every particle has been allocated its own cell. A typical cell arrangement is shown in the figure below for eight particles distributed in 2D.

We locate the center of mass of the particles in each cube and evaluate the sums

\[ M_0 = \sum_\alpha m_\alpha; \qquad M_{ij} = \sum_\alpha m_\alpha x_i^{\alpha}x_j^{\alpha}; \qquad M_{ijk} = \sum_\alpha m_\alpha x_i^{\alpha}x_j^{\alpha}x_k^{\alpha} \]

and so forth, where the sums run over the particles in the cube and $x^{\alpha}$ is the position vector of particle $\alpha$ relative to the cube's center of mass. This hierarchy of sums is related to the multipole expansion, and the sums are called Cartesian multipole moments.

The tree method is more efficient than direct summation, since the work involved in determining the forces on the particles is $O(N\ln N)$. There are several hierarchical tree-based algorithms; the Barnes-Hut algorithm, the Fast Multipole Method (FMM), and the Parallel Multipole Tree Algorithm are a few of them. Among these three algorithms, the FMM is the fastest, since its computational complexity is of order $O(N)$ [36].
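The lowest-order sums for a single cell can be written down in a few lines. The following is a minimal sketch with hypothetical names; it computes only $M_0$, the center of mass and the second-order moments $M_{ij}$ of the particles assigned to one cell, and does not build the tree itself.

```python
def cell_moments(pos, mass):
    """Monopole, center of mass and second-order Cartesian moments of a cell.

    pos  : list of [x, y, z] positions of the particles in the cell
    mass : list of their masses
    Returns (M0, com, M2) where M2[i][j] = sum_a m_a x_i^a x_j^a and the
    positions x^a are taken relative to the center of mass.
    """
    M0 = sum(mass)
    com = [sum(m * p[k] for m, p in zip(mass, pos)) / M0 for k in range(3)]
    rel = [[p[k] - com[k] for k in range(3)] for p in pos]
    M2 = [[sum(m * x[i] * x[j] for m, x in zip(mass, rel)) for j in range(3)]
          for i in range(3)]
    return M0, com, M2
```

Taking positions relative to the center of mass makes the first-order (dipole) sum vanish, which is why the list of moments above jumps from $M_0$ to $M_{ij}$.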


3.5.3 Other Poisson Solvers

There are more N-body algorithms. The Particle-Mesh (PM) code, the Particle-Particle/Particle-Mesh (P$^3$M) method, the Nested Grid Particle-Mesh (NGPM) method and the Self-Consistent Field (SCF) method are a few of them.

The Particle-Mesh method treats the force as a field quantity by approximating it on a mesh. The Fourier method is used to solve the Poisson equation, and the potential and force are computed on a fixed grid. The force is then interpolated to the particle positions in order to move the particles. By using a suitable interpolating function, the density field, which is the source of the gravitational potential, can be computed on the same mesh from the particle positions. This method is easy to implement and faster than the tree method.

By adding a correction to the mesh force, the PM method can be improved to a much more accurate method called the Particle-Particle/Particle-Mesh (P$^3$M) method. In this method, the force from nearby particles is calculated by direct summation and the PM method is used to find the force from distant particles. The number of operations required for P$^3$M is proportional to $N\,n(R)$, where $n(R)$ is the average number of neighboring particles within a distance $R$ [5]. The P$^3$M method is widely used in cosmological simulations.

Over the years, nested grid codes have been developed for use in cosmology [4]. They are also used to study galaxy collisions [9]. There are several advantages to implementing a nested-grid algorithm. One of them is very high force resolution. Several other advantages are discussed by Splinter (1996) [34].

The Self-Consistent Field (SCF) method is used to study collisionless stellar systems. It is similar to tree codes in time integration and round-off errors, and the cost of the force calculation for this type of algorithm is $O(N)$. Hernquist & Ostriker (1992) improved this method by expanding the density and potential in Poisson's equation in a set of basis functions that fully represents the angular and radial dependence. Hernquist & Ostriker's set of basis functions, called "ultraspherical polynomials", resembles the system being studied, for example spherical galaxies [20].


CHAPTER 4 DETAILED ANALYSIS OF BLOCK TIMESTEP VERLET WITH LAGGED EXPANSION COEFFICIENTS

The Hamiltonian of a particle in the stellar system is given by

\[ H(x,v) = \frac{1}{2}v^{2} + \Phi(x) \tag{4.1} \]

where $x$ is the position vector of the particle, $v$ is the velocity vector of the particle, and $\Phi$ is the potential.

As we described in chapter 2, by alternating kick and drift steps, we can construct a symplectic Verlet integrator. The simplest and most widely used integrators are the "kick-drift-kick" Verlet and the "drift-kick-drift" Verlet. The "kick-drift-kick" Verlet is defined by kicking for $\tfrac{1}{2}h$, drifting for $h$ and then kicking for $\tfrac{1}{2}h$:

\[ \begin{aligned} v_{1/2} &= v - \tfrac{1}{2}h\,a(x) \\ x' &= x + h\,v_{1/2} \\ v' &= v_{1/2} - \tfrac{1}{2}h\,a(x') \end{aligned} \tag{4.2} \]

where $a(x) = \nabla_x H(x,v) = \nabla\Phi(x)$ is, up to sign, the force per unit mass (acceleration) of the particle. An equally good form is the "drift-kick-drift" Verlet:

\[ \begin{aligned} x_{1/2} &= x + \tfrac{1}{2}h\,v \\ v' &= v - h\,a(x_{1/2}) \\ x' &= x_{1/2} + \tfrac{1}{2}h\,v' \end{aligned} \tag{4.3} \]
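The two update rules (4.2) and (4.3) translate almost line by line into code. The sketch below is illustrative only: it follows the sign convention of (4.2), where the function passed in is the gradient of the potential, and the helper names and the harmonic test potential $\Phi = \tfrac{1}{2}x^{2}$ used in the usage note are assumptions.

```python
def kdk_step(x, v, h, grad_phi):
    """One kick-drift-kick Verlet step, equation (4.2)."""
    v_half = v - 0.5 * h * grad_phi(x)
    x_new = x + h * v_half
    v_new = v_half - 0.5 * h * grad_phi(x_new)
    return x_new, v_new


def dkd_step(x, v, h, grad_phi):
    """One drift-kick-drift Verlet step, equation (4.3)."""
    x_half = x + 0.5 * h * v
    v_new = v - h * grad_phi(x_half)
    x_new = x_half + 0.5 * h * v_new
    return x_new, v_new


def integrate(step, x, v, h, n, grad_phi):
    """Apply n steps of the chosen scheme and return the final (x, v)."""
    for _ in range(n):
        x, v = step(x, v, h, grad_phi)
    return x, v
```

For the harmonic potential, `integrate(kdk_step, 1.0, 0.0, 0.01, 100, lambda x: x)` approximates $(\cos 1, -\sin 1)$, and halving the step reduces the error by roughly a factor of four, as expected for a second-order scheme.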

Most codes for simulating collisionless stellar systems use the Verlet integrator because of all the advantages mentioned in previous chapters. A stellar system contains billions of particles. It is characterized by a range in density that gives rise to different time-scales for significant changes of the orbital parameters. It is extremely inefficient to integrate all of the particles with the shortest timestep needed for any particle, so integrators must allow individual timesteps for each particle. We can make this possible using the block timestep scheme. We now describe how one version of this scheme works with the Verlet integrator.

4.1 Careful description of Algorithms

For each particle $i$, we calculate an estimate of its timestep, $h_{\mathrm{est},i}$, from the following equation:

\[ h_{\mathrm{est},i} = 10^{-3}\left(\frac{\|x_i\|_2}{\|a_i\|_2}\right)^{1/2} \tag{4.4} \]

We assign each particle to one of $K+1$ classes, where the class $k$ of a particle is defined as

\[ k = \left\lceil \log_2\!\left(\frac{h}{h_{\mathrm{est},i}}\right) \right\rceil \tag{4.5} \]

with timestep $h_k \equiv 2^{k}h$ for $k = 0, 1, 2, \ldots, K$. Thus $h$ is the shortest timestep (class 0) and $2^{K}h$ is the longest (class $K$). The force at the initial time $t_0$, $a(x_0)$, is evaluated using the Plummer potential (3.13), and each particle is kicked by the impulse $\tfrac{1}{2}h_k\,a(x_0)$, corresponding to the first part of a kick-drift-kick Verlet step:

\[ v^{k}_{1,\mathrm{start}} = v^{k}_{0} + \frac{1}{2}h_k\,a(x^{k}_{0}) \tag{4.6} \]
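The classification step can be sketched in code, with one caveat stated up front: read literally, (4.5) yields non-positive values whenever $h_{\mathrm{est},i} \geq h$, so the sketch below assumes that the intended rule is to pick the largest class $k$ in $0, \ldots, K$ whose timestep $2^{k}h$ does not exceed the estimate. That interpretation, the clamping to $[0, K]$ and all names are assumptions of this example, not statements from the text.

```python
import math


def norm2(u):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in u))


def estimated_timestep(x, a, eta=1e-3):
    """Timestep estimate of equation (4.4) for one particle."""
    return eta * math.sqrt(norm2(x) / norm2(a))


def timestep_class(h_est, h, K):
    """Assumed rule: largest k in [0, K] with 2**k * h <= h_est.

    Particles whose estimate is below h are placed in class 0.
    """
    k = math.floor(math.log2(h_est / h))
    return max(0, min(K, k))
```

With `h = 1.0` and `K = 3`, estimates of 1.0, 2.0, 3.9 and 100.0 fall into classes 0, 1, 1 and 3 respectively under this rule.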

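Then every particle is drifted through time $h$,

\[ \begin{aligned} x^{k}_{1} &= x^{k}_{0} + h\,v^{k}_{1,\mathrm{start}} \\ x^{k}_{1} &= x^{k}_{0} + h\,v^{k}_{0} + \tfrac{1}{2}h\,h_k\,a(x^{k}_{0}) \end{aligned} \tag{4.7} \]

and we find the forces, $a(x^{0}_{1})$, on the particles in class 0, so these particles can be kicked by $\tfrac{1}{2}h\,a(x^{0}_{1})$:

\[ v^{0}_{1} = v^{0}_{1,\mathrm{start}} + \frac{1}{2}h\,a(x^{0}_{1}) \tag{4.8} \]

\[ v^{0}_{1} = v^{0}_{0} + \frac{1}{2}h\left(a(x^{0}_{0}) + a(x^{0}_{1})\right) \tag{4.9} \]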


This is the end of the first Verlet step. We kick the particles in class 0 again by $\tfrac{1}{2}h\,a(x^{0}_{1})$ to start the second Verlet step:

\[ v^{0}_{2,\mathrm{start}} = v^{0}_{1} + \frac{1}{2}h\,a(x^{0}_{1}) = v^{0}_{1,\mathrm{start}} + h\,a(x^{0}_{1}) \tag{4.10} \]

So the particles in class 0 are kicked by $h\,a(x^{0}_{1})$, which is the sum of the kicks at the end of their first Verlet step and the start of their second.

Next we drift all particles through $h$ a second time with the most recent evaluation of $x$, $v$, $a$. Thus we drift the particles in class 0 by $h$ with the most recent evaluation of the velocity, $v^{0}_{2,\mathrm{start}}$,

\[ \begin{aligned} x^{0}_{2} &= x^{0}_{1} + h\,v^{0}_{2,\mathrm{start}} \\ x^{0}_{2} &= x^{0}_{1} + h\,v^{0}_{1} + \tfrac{1}{2}h^{2}\,a(x^{0}_{1}) \end{aligned} \tag{4.11} \]

and the rest of the particles, in classes $k = 1, 2, 3, \ldots, K$, with the velocity $v^{k}_{1,\mathrm{start}}$, which was calculated in the first step:

\[ \begin{aligned} x^{k}_{2} &= x^{k}_{1} + h\,v^{k}_{1,\mathrm{start}} \\ x^{k}_{2} &= x^{k}_{1} + h\,v^{k}_{0} + \tfrac{1}{2}h\,h_k\,a(x^{k}_{0}) \end{aligned} \tag{4.12} \]

Then we find the forces $a(x^{0}_{2})$ and $a(x^{1}_{2})$ on the particles in class 0 and class 1, respectively, at the most recently updated positions. The particles of class 0 are kicked by $\tfrac{1}{2}h\,a(x^{0}_{2})$,

\[ v^{0}_{2} = v^{0}_{2,\mathrm{start}} + \frac{1}{2}h\,a(x^{0}_{2}) \tag{4.13} \]

\[ v^{0}_{2} = v^{0}_{1} + \frac{1}{2}h\left(a(x^{0}_{1}) + a(x^{0}_{2})\right) \tag{4.14} \]


and the particles of class 1 are kicked by $\tfrac{1}{2}h_1\,a(x^{1}_{2})$,

\[ v^{1}_{2} = v^{1}_{1,\mathrm{start}} + \frac{1}{2}h_1\,a(x^{1}_{2}) \tag{4.15} \]

\[ v^{1}_{2} = v^{1}_{0} + \frac{1}{2}h_1\left(a(x^{1}_{0}) + a(x^{1}_{2})\right) \tag{4.16} \]

which completes the second Verlet step. We begin the third Verlet step by kicking the particles in both class 0 and class 1 again, with $\tfrac{1}{2}h\,a(x^{0}_{2})$ and $\tfrac{1}{2}h_1\,a(x^{1}_{2})$ respectively. So we obtain the following equations:

\[ v^{0}_{3,\mathrm{start}} = v^{0}_{2} + \frac{1}{2}h\,a(x^{0}_{2}) = v^{0}_{2,\mathrm{start}} + h\,a(x^{0}_{2}) \tag{4.17} \]

\[ v^{1}_{3,\mathrm{start}} = v^{1}_{2} + \frac{1}{2}h_1\,a(x^{1}_{2}) = v^{1}_{1,\mathrm{start}} + h_1\,a(x^{1}_{2}) \tag{4.18} \]

Then we drift all the particles by $h$ for the third time and find the forces on the particles in class 0. After an interval $3h$ the particles in class 0 are kicked.

[Figure: kick schedule for the classes k = 3, 2, 1, 0 (top to bottom) along the time axis t/h = 0, 1, ..., 8]

Figure 4.1: The block timestep scheme, for a system with 4 classes of particles


Figure 4.1 shows the graphical representation of the block timestep scheme for a system with 4 classes of particles. At the fourth timestep, the particles in classes 0, 1, 2 are kicked. This process continues until all the particles are due for a kick, after a time $h_K \equiv 2^{K}h$. The final kick for particles in class $k$ is $\tfrac{1}{2}h_k\,a(x^{k}_{k+1})$, which completes $2^{K-k}$ Verlet steps for each particle. The complete algorithm of the block timestep Verlet method (BTS Verlet) is described in Table 4.1.

Table 4.1: Algorithm of BTS Verlet method

1. Find a for all particles

2. Classify all particles

3. for n = 0 : 2^{K+1}
       if n == 0
           kick every particle by (h_k / 2) a
       else
           find i, highest class to be updated
           for k = 0 : i
               compute a for all particles in class k
               if n > 0 and n < 2^{K+1}
                   kick all particles in class k by h_k a
               else
                   kick all particles in class k by (h_k / 2) a
               end
           end
       for n = 0 : 2^{K+1}
           drift all particles by h
       end
   end
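The scheme described in this section can be sketched for a simplified setting. The code below is an illustration under stated assumptions, not the implementation used in this work: particles are one-dimensional and move in a fixed external field `acc(x)` (so forces do not depend on the other particles), kicks use the plus-sign convention of (4.6), the cycle covers exactly $2^{K}$ drifts of length $h$, and all names are hypothetical.

```python
def bts_verlet(x, v, classes, h, K, acc):
    """Advance all particles by 2**K * h with block-timestep kick-drift-kick.

    x, v    : lists of positions and velocities (one entry per particle)
    classes : list of class indices k in 0..K, particle p has timestep 2**k * h
    acc     : function returning the acceleration at a position (assumed external)
    """
    x = list(x)
    v = list(v)
    n_steps = 2 ** K
    hk = [h * 2 ** k for k in classes]

    # opening half kick for every particle, as in equation (4.6)
    for p in range(len(x)):
        v[p] += 0.5 * hk[p] * acc(x[p])

    for n in range(1, n_steps + 1):
        # every particle is drifted by the shortest timestep h
        for p in range(len(x)):
            x[p] += h * v[p]
        # classes whose own timestep divides the elapsed time are kicked
        for p in range(len(x)):
            if n % (2 ** classes[p]) == 0:
                # full kick = closing half kick + next opening half kick;
                # at the end of the cycle only the closing half kick is applied
                weight = 1.0 if n < n_steps else 0.5
                v[p] += weight * hk[p] * acc(x[p])
    return x, v
```

In this decoupled setting a particle in class $k$ must end up exactly where $2^{K-k}$ ordinary kick-drift-kick steps of length $2^{k}h$ would put it, which gives a simple consistency check. With mutually interacting particles the forces on a kicked class would instead be evaluated from the drifted positions of all classes, which is the situation analyzed in the following sections.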


4.2 Error Analysis with Exact Forces

It can be shown that the error in the BTS Verlet method is of the same order as in the Störmer-Verlet method.

Given a sequence $t_0$, $t_1 = t_0 + h$, $t_2 = t_0 + 2h$, ..., where $h > 0$ is the timestep, let the exact solution and the numerical estimate of the position of particles in class 0 be $x^{0}(t_n)$ and $x^{0}_{n}$, and let the exact solution and the numerical estimate of the velocity of particles in class 0 be $v^{0}(t_n)$ and $v^{0}_{n}$, for $n = 0, 1, \ldots$. The numerical estimates of the position of particles in class 0 in the first and second steps are given in equations (4.7) and (4.11):

\[ \begin{aligned} x^{0}_{1} &= x^{0}_{0} + h\,v^{0}_{0} + \tfrac{1}{2}h^{2}\,a(x^{0}_{0}) \\ x^{0}_{2} &= x^{0}_{1} + h\,v^{0}_{1} + \tfrac{1}{2}h^{2}\,a(x^{0}_{1}) \end{aligned} \]

This procedure can be continued to produce approximants at $t_3$, $t_4$, $t_5$ and so on. In general, we obtain the recursive scheme

\[ x^{0}_{n+1} = x^{0}_{n} + h\,v^{0}_{n} + \frac{1}{2}h^{2}\,a(x^{0}_{n}) \tag{4.19} \]

for $n = 0, 1, 2, 3, \ldots$ The exact force on a particle, $a(t)$, depends not only on the time $t$ but also on the positions of the particles in all the classes. Therefore,

\[ a(t) = a(t, x^{0}, x^{1}, \ldots, x^{K}) \tag{4.20} \]

and the first and second derivatives are given by

\[ \frac{d}{dt}a(t) = \frac{\partial a}{\partial t} + \sum_{i=0}^{K}\frac{\partial a}{\partial x^{i}}\frac{dx^{i}}{dt} \tag{4.21} \]

\[ \frac{d^{2}}{dt^{2}}a(t) = \frac{\partial^{2}a}{\partial t^{2}} + \sum_{j=0}^{K}\sum_{i=0}^{K}\frac{\partial^{2}a}{\partial x^{i}\partial x^{j}}\frac{dx^{j}}{dt}\frac{dx^{i}}{dt} + \sum_{i=0}^{K}\frac{\partial a}{\partial x^{i}}\frac{d^{2}x^{i}}{dt^{2}} \tag{4.22} \]


The first few terms of the Taylor expansions of the exact solution about $t_n = t_0 + nh$, evaluated at $t_{n+1}$ and $t_{n-1}$, are

\[ x^{0}(t_{n+1}) = x^{0}(t_n) + h\,v^{0}(t_n) + \frac{h^{2}}{2}a^{0}(t_n) + \frac{h^{3}}{3!}\frac{d}{dt}\left(a^{0}(t_n)\right) + \frac{h^{4}}{4!}\frac{d^{2}}{dt^{2}}\left(a^{0}(t_n)\right) + \ldots \tag{4.23} \]

and

\[ x^{0}(t_{n-1}) = x^{0}(t_n) - h\,v^{0}(t_n) + \frac{h^{2}}{2}a^{0}(t_n) - \frac{h^{3}}{3!}\frac{d}{dt}\left(a^{0}(t_n)\right) + \frac{h^{4}}{4!}\frac{d^{2}}{dt^{2}}\left(a^{0}(t_n)\right) + \ldots \tag{4.24} \]

Replacing the terms $\frac{d}{dt}(a^{0}(t_n))$ and $\frac{d^{2}}{dt^{2}}(a^{0}(t_n))$ by equations (4.21) and (4.22), and adding the above two equations, gives

\[ x^{0}(t_{n+1}) + x^{0}(t_{n-1}) = 2x^{0}(t_n) + h^{2}a^{0}(t_n) + O(h^{4}) \tag{4.25} \]

Thus,

\[ x^{0}(t_{n+1}) = -x^{0}(t_{n-1}) + 2x^{0}(t_n) + h^{2}a^{0}(t_n) + O(h^{4}) \tag{4.26} \]

Rearranging the above equation, we get

\[ x^{0}(t_{n+1}) = x^{0}(t_n) - \frac{1}{2}\left[x^{0}(t_{n+1}) - 2x^{0}(t_n) + x^{0}(t_{n-1})\right] + h^{2}a^{0}(t_n) + \frac{1}{2}\left[x^{0}(t_{n+1}) - x^{0}(t_{n-1})\right] + O(h^{4}) \tag{4.27} \]

The central difference approximations to the second and first derivatives give

\[ a^{0}(t_n) = \frac{x^{0}_{n+1} - 2x^{0}_{n} + x^{0}_{n-1}}{h^{2}} + O(h^{2}) \tag{4.28} \]

and

\[ v^{0}(t_n) = \frac{x^{0}_{n+1} - x^{0}_{n-1}}{2h} + O(h^{2}) \tag{4.29} \]


Therefore,

\[ \begin{aligned} x^{0}(t_{n+1}) &= x^{0}(t_n) - \frac{1}{2}\underbrace{\left[x^{0}(t_{n+1}) - 2x^{0}(t_n) + x^{0}(t_{n-1})\right]}_{h^{2}a^{0}(t_n) - O(h^{4})} + h^{2}a^{0}(t_n) + \frac{1}{2}\underbrace{\left[x^{0}(t_{n+1}) - x^{0}(t_{n-1})\right]}_{2h\,v^{0}(t_n) - O(h^{3})} + O(h^{4}) \\ &= x^{0}(t_n) + h\,v^{0}(t_n) + \frac{h^{2}}{2}a^{0}(t_n) - O(h^{3}) + O(h^{4}) \end{aligned} \tag{4.30} \]

Hence, the local error in position of class 0 of the BTS Verlet method is $O(h^{3})$. The local error in position of class 1 of the BTS Verlet method can be obtained by following the same steps. First, we need to find the position of class 1 in the $n$th step. Following equations (4.7) and (4.12), the numerical estimates of the position of particles in class 1 in the first and second steps are:

\[ x^{1}_{1} = x^{1}_{0} + h\,v^{1}_{0} + \frac{1}{2}(2)(h)^{2}\,a(x^{1}_{0}) \tag{4.31} \]

\[ x^{1}_{2} = x^{1}_{1} + h\,v^{1}_{0} + \frac{1}{2}(2)(h)^{2}\,a(x^{1}_{0}) \tag{4.32} \]

Replacing $x^{1}_{1}$ by equation (4.31), we have

\[ \begin{aligned} x^{1}_{2} &= x^{1}_{0} + h\,v^{1}_{0} + \frac{1}{2}(2)(h)^{2}\,a(x^{1}_{0}) + h\,v^{1}_{0} + \frac{1}{2}(2)(h)^{2}\,a(x^{1}_{0}) \\ &= x^{1}_{0} + (2h)\,v^{1}_{0} + \frac{1}{2}(2h)^{2}\,a(x^{1}_{0}) \end{aligned} \tag{4.33} \]

The velocity of class 1 is updated at the time $2h$, and therefore the position in the third step is given by

\[ \begin{aligned} x^{1}_{3} &= x^{1}_{2} + h\,v^{1}_{3,\mathrm{start}} \\ x^{1}_{3} &= x^{1}_{2} + h\,v^{1}_{2} + \frac{1}{2}h\,h_1\,a(x^{1}_{2}) \end{aligned} \tag{4.34} \]


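The position in the fourth step is

\[ \begin{aligned} x^{1}_{4} &= x^{1}_{3} + h\,v^{1}_{2} + \frac{1}{2}h\,h_1\,a(x^{1}_{2}) \\ &= x^{1}_{2} + h\,v^{1}_{2} + \frac{1}{2}h\,h_1\,a(x^{1}_{2}) + h\,v^{1}_{2} + \frac{1}{2}h\,h_1\,a(x^{1}_{2}) \\ &= x^{1}_{2} + (2h)\,v^{1}_{2} + \frac{1}{2}(2h)^{2}\,a(x^{1}_{2}) \end{aligned} \tag{4.35} \]

Since the timestep for particles in class 1 is $h_1 = 2h$, we consider the position at the second step, fourth step, sixth step and so on to get the following recursive formula:

\[ \begin{aligned} x^{1}_{2} &= x^{1}_{0} + (2h)\,v^{1}_{0} + \frac{1}{2}(2h)^{2}\,a(x^{1}_{0}) \\ x^{1}_{4} &= x^{1}_{2} + (2h)\,v^{1}_{2} + \frac{1}{2}(2h)^{2}\,a(x^{1}_{2}) \\ &\;\;\vdots \\ x^{1}_{2(n+1)} &= x^{1}_{2n} + (2h)\,v^{1}_{2n} + \frac{1}{2}(2h)^{2}\,a(x^{1}_{2n}) \end{aligned} \tag{4.36} \]

By the Taylor expansion,

\[ x^{1}(t_{2(n+1)}) = x^{1}(t_{2n}) + (2h)\,v^{1}(t_{2n}) + \frac{(2h)^{2}}{2}a^{1}(t_{2n}) + \frac{(2h)^{3}}{3!}\frac{d}{dt}\left(a^{1}(t_{2n})\right) + \frac{(2h)^{4}}{4!}\frac{d^{2}}{dt^{2}}\left(a^{1}(t_{2n})\right) + \ldots \tag{4.37} \]

and

\[ x^{1}(t_{2(n-1)}) = x^{1}(t_{2n}) - (2h)\,v^{1}(t_{2n}) + \frac{(2h)^{2}}{2}a^{1}(t_{2n}) - \frac{(2h)^{3}}{3!}\frac{d}{dt}\left(a^{1}(t_{2n})\right) + \frac{(2h)^{4}}{4!}\frac{d^{2}}{dt^{2}}\left(a^{1}(t_{2n})\right) + \ldots \tag{4.38} \]

As before, replacing the terms $\frac{d}{dt}(a^{1}(t_{2n}))$ and $\frac{d^{2}}{dt^{2}}(a^{1}(t_{2n}))$ by equations (4.21) and (4.22), and adding the above two equations, gives

\[ x^{1}(t_{2(n+1)}) + x^{1}(t_{2(n-1)}) = 2x^{1}(t_{2n}) + (2h)^{2}a^{1}(t_{2n}) + O((2h)^{4}) \tag{4.39} \]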


By the central difference approximation, we obtain the equations

\[ a^{1}(t_{2n}) = \frac{x^{1}_{2(n+1)} - 2x^{1}_{2n} + x^{1}_{2(n-1)}}{(2h)^{2}} + O((2h)^{2}) \tag{4.40} \]

\[ v^{1}(t_{2n}) = \frac{x^{1}_{2(n+1)} - x^{1}_{2(n-1)}}{2(2h)} + O((2h)^{2}) \tag{4.41} \]

Rearranging equation (4.39) and applying equations (4.40) and (4.41), we get

\[ \begin{aligned} x^{1}(t_{2(n+1)}) &= x^{1}(t_{2n}) - \frac{1}{2}\underbrace{\left[x^{1}(t_{2(n+1)}) - 2x^{1}(t_{2n}) + x^{1}(t_{2(n-1)})\right]}_{(2h)^{2}a^{1}(t_{2n}) - O((2h)^{4})} + (2h)^{2}a^{1}(t_{2n}) + \frac{1}{2}\underbrace{\left[x^{1}(t_{2(n+1)}) - x^{1}(t_{2(n-1)})\right]}_{2(2h)\,v^{1}(t_{2n}) - O((2h)^{3})} + O((2h)^{4}) \\ &= x^{1}(t_{2n}) + (2h)\,v^{1}(t_{2n}) + \frac{(2h)^{2}}{2}a^{1}(t_{2n}) - O((2h)^{3}) + O((2h)^{4}) \end{aligned} \tag{4.42} \]

Thus, the local error in position of class 1 of the BTS Verlet method is $O((2h)^{3})$, which means $O(h^{3})$. The recursive formula for the position in any class $k$ is

\[ x^{k}_{2^{k}(n+1)} = x^{k}_{2^{k}n} + (2^{k}h)\,v^{k}_{2^{k}n} + \frac{1}{2}(2^{k}h)^{2}\,a(x^{k}_{2^{k}n}) \tag{4.43} \]

By the Taylor expansion, the exact positions can be written as

\[ x^{k}(t_{2^{k}(n+1)}) = x^{k}(t_{2^{k}n}) + (2^{k}h)\,v^{k}(t_{2^{k}n}) + \frac{(2^{k}h)^{2}}{2}a^{k}(t_{2^{k}n}) + \frac{(2^{k}h)^{3}}{3!}\frac{d}{dt}\left(a^{k}(t_{2^{k}n})\right) + \frac{(2^{k}h)^{4}}{4!}\frac{d^{2}}{dt^{2}}\left(a^{k}(t_{2^{k}n})\right) + \ldots \tag{4.44} \]

\[ x^{k}(t_{2^{k}(n-1)}) = x^{k}(t_{2^{k}n}) - (2^{k}h)\,v^{k}(t_{2^{k}n}) + \frac{(2^{k}h)^{2}}{2}a^{k}(t_{2^{k}n}) - \frac{(2^{k}h)^{3}}{3!}\frac{d}{dt}\left(a^{k}(t_{2^{k}n})\right) + \frac{(2^{k}h)^{4}}{4!}\frac{d^{2}}{dt^{2}}\left(a^{k}(t_{2^{k}n})\right) + \ldots \tag{4.45} \]


Now we follow the same steps as before. First, we substitute for the terms $\frac{d}{dt}(a^{k}(t_{2^{k}n}))$ and $\frac{d^{2}}{dt^{2}}(a^{k}(t_{2^{k}n}))$ using equations (4.21) and (4.22); then we add the above two equations to get

\[ x^{k}(t_{2^{k}(n+1)}) + x^{k}(t_{2^{k}(n-1)}) = 2x^{k}(t_{2^{k}n}) + (2^{k}h)^{2}a^{k}(t_{2^{k}n}) + O((2^{k}h)^{4}) \tag{4.46} \]

So,

\[ x^{k}(t_{2^{k}(n+1)}) = -x^{k}(t_{2^{k}(n-1)}) + 2x^{k}(t_{2^{k}n}) + (2^{k}h)^{2}a^{k}(t_{2^{k}n}) + O((2^{k}h)^{4}) \tag{4.47} \]

\[ \begin{aligned} x^{k}(t_{2^{k}(n+1)}) &= x^{k}(t_{2^{k}n}) - \frac{1}{2}\underbrace{\left[x^{k}(t_{2^{k}(n+1)}) - 2x^{k}(t_{2^{k}n}) + x^{k}(t_{2^{k}(n-1)})\right]}_{(2^{k}h)^{2}a^{k}(t_{2^{k}n}) - O((2^{k}h)^{4})} + (2^{k}h)^{2}a^{k}(t_{2^{k}n}) + \frac{1}{2}\underbrace{\left[x^{k}(t_{2^{k}(n+1)}) - x^{k}(t_{2^{k}(n-1)})\right]}_{2(2^{k}h)\,v^{k}(t_{2^{k}n}) - O((2^{k}h)^{3})} + O((2^{k}h)^{4}) \\ &= x^{k}(t_{2^{k}n}) + (2^{k}h)\,v^{k}(t_{2^{k}n}) + \frac{(2^{k}h)^{2}}{2}a^{k}(t_{2^{k}n}) - O((2^{k}h)^{3}) + O((2^{k}h)^{4}) \end{aligned} \tag{4.48} \]

since

\[ a^{k}(t_{2^{k}n}) = \frac{x^{k}_{2^{k}(n+1)} - 2x^{k}_{2^{k}n} + x^{k}_{2^{k}(n-1)}}{(2^{k}h)^{2}} + O((2^{k}h)^{2}) \tag{4.49} \]

\[ v^{k}(t_{2^{k}n}) = \frac{x^{k}_{2^{k}(n+1)} - x^{k}_{2^{k}(n-1)}}{2(2^{k}h)} + O((2^{k}h)^{2}) \tag{4.50} \]

We see that the local error in position of any class $k$ of the BTS Verlet method is $O((2^{k}h)^{3})$, that is, $O(h^{3})$.
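The third-order local error can be observed numerically on a problem with a known solution. The following sketch assumes the test problem $a(x) = -x$ (exact solution $x(t) = x_0\cos t + v_0\sin t$), the plus-sign kick convention of (4.19), and illustrative names and initial data; it is a check on one example, not a proof.

```python
import math


def verlet_step(x, v, h, acc):
    """One step of the recursion (4.19) with the matching velocity update."""
    x_new = x + h * v + 0.5 * h * h * acc(x)
    v_new = v + 0.5 * h * (acc(x) + acc(x_new))
    return x_new, v_new


def local_errors(h, x0=1.0, v0=0.5):
    """Position and velocity errors after a single step of length h."""
    x1, v1 = verlet_step(x0, v0, h, lambda x: -x)
    x_exact = x0 * math.cos(h) + v0 * math.sin(h)
    v_exact = -x0 * math.sin(h) + v0 * math.cos(h)
    return abs(x1 - x_exact), abs(v1 - v_exact)


def observed_order(h):
    """Estimated exponents p in error ~ h**p, from the steps h and h/2."""
    ex1, ev1 = local_errors(h)
    ex2, ev2 = local_errors(h / 2.0)
    return math.log2(ex1 / ex2), math.log2(ev1 / ev2)
```

Both exponents returned by `observed_order(0.1)` should be close to 3. Calling `local_errors` with the step $2h$ instead of $h$ mimics a class 1 particle: the error grows by a factor of roughly 8 but the exponent is unchanged, which is the sense in which $O((2h)^{3})$ is still $O(h^{3})$.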


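The velocity approximations in the first two steps in class 0 are given in equations (4.9) and (4.14). They are

\[ v^{0}_{1} = v^{0}_{0} + \frac{1}{2}h\left(a(x^{0}_{0}) + a(x^{0}_{1})\right) \tag{4.51} \]

\[ v^{0}_{2} = v^{0}_{1} + \frac{1}{2}h\left(a(x^{0}_{1}) + a(x^{0}_{2})\right) \tag{4.52} \]

In general, we can write

\[ v^{0}_{n+1} = v^{0}_{n} + \frac{1}{2}h\left(a(x^{0}_{n}) + a(x^{0}_{n+1})\right) \tag{4.53} \]

where $a(x^{0}_{n+1})$ is the force or the acceleration at the current step. By the Taylor expansion,

\[ a(x^{0}_{n+1}) = a(x^{0}(t_{n+1})) + \left(x^{0}_{n+1} - x^{0}(t_{n+1})\right)\frac{\partial}{\partial x}a(x^{0}(t_{n+1})) + O\!\left((x^{0}_{n+1} - x^{0}(t_{n+1}))^{2}\right) \tag{4.54} \]

We have proved previously that the error of the position in class 0 is $O(h^{3})$, and therefore $x^{0}_{n+1} - x^{0}(t_{n+1}) = O(h^{3})$. Thus equation (4.54) gives

\[ a(x^{0}_{n+1}) = a(x^{0}(t_{n+1})) + O(h^{3}) \tag{4.55} \]

Replacing $a(x^{0}_{n+1})$ in equation (4.53) by the above equation, the velocity approximation in class 0 becomes

\[ \begin{aligned} v^{0}_{n+1} &= v^{0}_{n} + \frac{h}{2}a(x^{0}_{n}) + \frac{h}{2}\left[a(x^{0}(t_{n+1})) + O(h^{3})\right] \\ &= v^{0}_{n} + \frac{h}{2}\left[a(x^{0}_{n}) + a(x^{0}(t_{n+1}))\right] + O(h^{4}) \end{aligned} \tag{4.56} \]

The Taylor expansion of the exact value of the velocity about $t_n = t_0 + nh$ is

\[ \begin{aligned} v^{0}(t_{n+1}) &= v^{0}(t_n) + h\frac{d}{dt}\left(v^{0}(t_n)\right) + \frac{h^{2}}{2}\frac{d^{2}}{dt^{2}}\left(v^{0}(t_n)\right) + \frac{h^{3}}{3!}\frac{d^{3}}{dt^{3}}\left(v^{0}(t_n)\right) + \ldots \\ &= v^{0}(t_n) + h\,a^{0}(t_n) + \frac{h^{2}}{2}\frac{d}{dt}\left(a^{0}(t_n)\right) + \frac{h^{3}}{3!}\frac{d^{2}}{dt^{2}}\left(a^{0}(t_n)\right) + \ldots \end{aligned} \tag{4.57} \]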


The forward difference formula gives

\[ \frac{d}{dt}\left(a^{0}(t_n)\right) = \frac{a^{0}(t_{n+1}) - a^{0}(t_n)}{h} + O(h) \tag{4.58} \]

Replacing $\frac{d}{dt}(a^{0}(t_n))$ by the previous equation, we have

\[ \begin{aligned} v^{0}(t_{n+1}) &= v^{0}(t_n) + h\,a^{0}(t_n) + \frac{h^{2}}{2}\left[\frac{a^{0}(t_{n+1}) - a^{0}(t_n)}{h} + O(h)\right] + O(h^{3}) \\ &= v^{0}(t_n) + \frac{h}{2}\left[a^{0}(t_n) + a^{0}(t_{n+1})\right] + O(h^{3}) \end{aligned} \tag{4.59} \]

Thus, the local error of the velocity in class 0 is $O(h^{3})$. The velocity of particles in class 1 is updated at times $2h$, $4h$, $6h$, and so on. The first step of the velocity approximation in class 1 is given in equation (4.16):

\[ v^{1}_{2} = v^{1}_{0} + \frac{1}{2}(2h)\left(a(x^{1}_{0}) + a(x^{1}_{2})\right) \tag{4.60} \]

The approximate velocity at the time $4h$ is

\[ \begin{aligned} v^{1}_{4} &= v^{1}_{3,\mathrm{start}} + \frac{1}{2}h_1\,a(x^{1}_{4}) \\ &= v^{1}_{2} + \frac{1}{2}h_1\,a(x^{1}_{2}) + \frac{1}{2}h_1\,a(x^{1}_{4}) \\ &= v^{1}_{2} + \frac{1}{2}(2h)\left(a(x^{1}_{2}) + a(x^{1}_{4})\right) \end{aligned} \tag{4.61} \]

where $v^{1}_{3,\mathrm{start}}$ is given in equation (4.18). Therefore, the recursive formula for the velocity in class 1 is

\[ v^{1}_{2(n+1)} = v^{1}_{2n} + \frac{1}{2}(2h)\left(a(x^{1}_{2n}) + a(x^{1}_{2(n+1)})\right) \tag{4.62} \]


The first few terms of the Taylor expansion of $a(x^{1}_{2(n+1)})$ are given by

\[ a(x^{1}_{2(n+1)}) = a(x^{1}(t_{2(n+1)})) + \left(x^{1}_{2(n+1)} - x^{1}(t_{2(n+1)})\right)\frac{\partial}{\partial x}a(x^{1}(t_{2(n+1)})) + O\!\left((x^{1}_{2(n+1)} - x^{1}(t_{2(n+1)}))^{2}\right) \tag{4.63} \]

Since the error of the position in class 1, $x^{1}_{2(n+1)} - x^{1}(t_{2(n+1)})$, is $O((2h)^{3})$, equation (4.63) becomes

\[ a(x^{1}_{2(n+1)}) = a(x^{1}(t_{2(n+1)})) + O((2h)^{3}) \tag{4.64} \]

Replacing $a(x^{1}_{2(n+1)})$ in equation (4.62) and simplifying, we obtain

\[ v^{1}_{2(n+1)} = v^{1}_{2n} + \frac{1}{2}(2h)\left(a(x^{1}_{2n}) + a(x^{1}(t_{2(n+1)}))\right) + O((2h)^{4}) \tag{4.65} \]

The Taylor expansion of the exact value of the velocity of class 1 is

\[ v^{1}(t_{2(n+1)}) = v^{1}(t_{2n}) + (2h)\,a^{1}(t_{2n}) + \frac{(2h)^{2}}{2}\frac{d}{dt}\left(a^{1}(t_{2n})\right) + \frac{(2h)^{3}}{3!}\frac{d^{2}}{dt^{2}}\left(a^{1}(t_{2n})\right) + \ldots \tag{4.66} \]

where $\frac{d}{dt}(a^{1}(t_{2n}))$ can be evaluated by the forward difference formula:

\[ \frac{d}{dt}\left(a^{1}(t_{2n})\right) = \frac{a^{1}(t_{2(n+1)}) - a^{1}(t_{2n})}{2h} + O(2h) \tag{4.67} \]

Thus, substituting for $\frac{d}{dt}(a^{1}(t_{2n}))$ in equation (4.66) as before, we get

\[ v^{1}(t_{2(n+1)}) = v^{1}(t_{2n}) + \frac{1}{2}(2h)\left[a^{1}(t_{2n}) + a^{1}(t_{2(n+1)})\right] + O((2h)^{3}) \tag{4.68} \]

So the local error of the velocity in class 1 is $O((2h)^{3})$, or $O(h^{3})$. As in equations (4.53) and (4.62), the numerical velocity of particles in class $k$ is given by

\[ v^{k}_{2^{k}(n+1)} = v^{k}_{2^{k}n} + \frac{1}{2}(2^{k}h)\left(a(x^{k}_{2^{k}n}) + a(x^{k}_{2^{k}(n+1)})\right) \tag{4.69} \]

By the Taylor expansion, the force on particles in class $k$ at the current step, $a(x^{k}_{2^{k}(n+1)})$, is

\[ \begin{aligned} a(x^{k}_{2^{k}(n+1)}) &= a(x^{k}(t_{2^{k}(n+1)})) + \left(x^{k}_{2^{k}(n+1)} - x^{k}(t_{2^{k}(n+1)})\right)\frac{\partial}{\partial x}a(x^{k}(t_{2^{k}(n+1)})) + O\!\left((x^{k}_{2^{k}(n+1)} - x^{k}(t_{2^{k}(n+1)}))^{2}\right) \\ &= a(x^{k}(t_{2^{k}(n+1)})) + O((2^{k}h)^{3}) \end{aligned} \tag{4.70} \]

Thus the equation (4.69) changes to the equation

k k 1 k  k k  k 4 v k = v k + (2 h) a(x k ) + a(x (t k )) + O((2 h) ) (4.71) 2 (n+1) 2 n 2 2 n 2 (n+1)

d k By expanding the exact value of the velocity of class k, and replacing (a (t k )) dt 2 n by the central difference formula, we obtain the equation

k 2 k k k k (2 h) h k k i k 3 v (t k ) = v (t k ) + (2 h)a (t k ) + a (t k ) + a (t k ) + O((2 h) ) 2 (n+1) 2 n 2 n 2 2 n 2 (n+1) (4.72)

Hence, the local error of the velocity in class k is O((2kh)3) which means O(h3). The local truncation error is the error between the approximated value and previous value, assuming the previous value is exact. The accumulation of the local error over all of the iterations, assuming perfect knowledge of the true solution at the initial timestep is called global error. The global error is hard to estimate but important to to know. For multistep methods, if the local error is O(hp+1) then the global error is O(hp)[35]. Also, the numerical method is said to be pth−order method. We have estimated the local error in position and velocity of BTS Verlet method and it is O(h3). Thus, the global error in position and velocity is O(h2). The total energy E is the sum of kinetic energy and potential energy:

\[ E = \frac{1}{2}v^2 + \Phi(x) \tag{4.73} \]

The approximate energy is given by the following equation

\[ E_{\mathrm{approx}} = \frac{1}{2}v_{\mathrm{approx}}^2 + \Phi(x_{\mathrm{approx}}) \tag{4.74} \]


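Since the global errors give $v_{\mathrm{approx}} = v_{\mathrm{exact}} + O(h^2)$ and $x_{\mathrm{approx}} = x_{\mathrm{exact}} + O(h^2)$, the above equation becomes

\[ E_{\mathrm{approx}} = \frac{1}{2}\left[v_{\mathrm{exact}} + O(h^2)\right]^2 + \Phi\!\left(x_{\mathrm{exact}} + O(h^2)\right) \tag{4.75} \]

Expanding the square and using the Taylor series of $\Phi$ about $x_{\mathrm{exact}}$ gives

\[ \begin{aligned} E_{\mathrm{approx}} &= \frac{1}{2}\left[v_{\mathrm{exact}}^2 + 2v_{\mathrm{exact}}\,O(h^2) + O(h^4)\right] + \Phi(x_{\mathrm{exact}}) + \frac{\partial\Phi}{\partial x}\,O(h^2) + O(h^4) \\ &= \frac{1}{2}v_{\mathrm{exact}}^2 + \Phi(x_{\mathrm{exact}}) + O(h^2) \end{aligned} \tag{4.76} \]

Thus the error in energy, $\Delta E = E_{\mathrm{approx}} - E_{\mathrm{exact}}$, is

\[ \Delta E = O(h^2) \tag{4.77} \]

Therefore, the BTS Verlet method is a second-order method.

In practice it is hard to compute the exact force; therefore approximation methods (Poisson solvers) are used in N−body simulations. Several methods to find the force are discussed in the previous chapter, and one of the most accurate and flexible methods is functional expansion. The error analysis of the BTS Verlet method when the force is approximated by functional expansion, called the lagged force, is described in the next section.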

4.3 Error Analysis with lagged Forces

Assume that the potential and density have a bi-orthogonal basis set, so that they can be expanded in the series

\[ \Phi(r) = \sum_n A_n \Phi_n(r) \tag{4.78} \]

\[ \rho(r) = \sum_n A_n \rho_n(r) \tag{4.79} \]

where $A_n$ is called the expansion coefficient and $(\Phi_m, \rho_n) = \delta_{mn}$.


The density is also given by

\[ \rho(r) = \sum_{i=1}^{N_s} \delta(r - r_i) \tag{4.80} \]

where $N_s$ is the number of stars.

We can evaluate the inner product of the potential basis function $\Phi_m$ with $\rho$ in two different ways:

\[ (\Phi_m, \rho) = \sum_{i=1}^{N_s} (\Phi_m, \delta(r - r_i)) = \sum_{i=1}^{N_s} \Phi_m(r_i) \tag{4.81} \]

or

\[ (\Phi_m, \rho) = \sum_n A_n (\Phi_m, \rho_n) = A_m \tag{4.82} \]

Thus, the expansion coefficient $A_m$ is given by

\[ A_m = \sum_{i=1}^{N_s} \Phi_m(r_i) \tag{4.83} \]

The cost of computing the expansion coefficient can be reduced by partitioning it into $K+1$ levels. Let $A_n^k$ be the $n$th expansion coefficient of the $k$th level, where $k = 0, 1, 2, \ldots, K$. Thus, we may calculate $A_n$ by adding the coefficients of each level. That is,

\[ A_n = A_n^0 + A_n^1 + A_n^2 + \ldots + A_n^K \tag{4.84} \]

We only update $A_n^k$ at level $k$ and use the most recently computed values for the other levels. In that way we save computational cost.

Now, we can find the potential numerically as follows. If there are $L$ expansion coefficients, then

\[ \Phi_{\mathrm{lag}} = \sum_{k=0}^{K}\sum_{n=1}^{L} A_n^k \Phi_n \tag{4.85} \]

The exact potential is

\[ \Phi_{\mathrm{exact}} = \sum_{n=1}^{L} A_n \Phi_n \tag{4.86} \]

Thus,

\[ \Phi_{\mathrm{exact}} - \Phi_{\mathrm{lag}} = \sum_{n=1}^{L}\left(A_n - \sum_{k=0}^{K} A_n^k\right)\Phi_n \tag{4.87} \]

The first few terms of the Taylor expansion of $A_n^k(t_{\mathrm{curr}})$ are

\[ A_n^k(t_{\mathrm{curr}}) = A_n^k(t_{\mathrm{mrk}}) + (t_{\mathrm{curr}} - t_{\mathrm{mrk}})\frac{d}{dt}A_n^k(t_{\mathrm{mrk}}) + \frac{1}{2}(t_{\mathrm{curr}} - t_{\mathrm{mrk}})^2\frac{d^2}{dt^2}A_n^k(t_{\mathrm{mrk}}) + \ldots \tag{4.88} \]

Therefore,

\[ \begin{aligned} A_n - \sum_{k=0}^{K} A_n^k &= A_n - \sum_{k=0}^{K}\left[A_n^k(t_{\mathrm{mrk}}) + (t_{\mathrm{curr}} - t_{\mathrm{mrk}})\frac{d}{dt}A_n^k(t_{\mathrm{mrk}}) + \frac{1}{2}(t_{\mathrm{curr}} - t_{\mathrm{mrk}})^2\frac{d^2}{dt^2}A_n^k(t_{\mathrm{mrk}}) + \ldots\right] \\ &= A_n - \sum_{k=0}^{K} A_n^k(t_{\mathrm{mrk}}) - \sum_{k=0}^{K}(t_{\mathrm{curr}} - t_{\mathrm{mrk}})\frac{d}{dt}A_n^k(t_{\mathrm{mrk}}) - \frac{1}{2}\sum_{k=0}^{K}(t_{\mathrm{curr}} - t_{\mathrm{mrk}})^2\frac{d^2}{dt^2}A_n^k(t_{\mathrm{mrk}}) + \ldots \end{aligned} \tag{4.89} \]

But from equation (4.83),

\[ A_n = \sum_{i=1}^{N_s} \Phi_n(r_i(\text{current timestep})) \tag{4.90} \]

\[ A_n^k = \sum_{i=1}^{N_k} \Phi_n(r_i(\text{most recent kick step for } k)) \tag{4.91} \]


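Therefore,

\[ A_n - \sum_{k=0}^{K} A_n^k(t_{\mathrm{mrk}}) = \sum_{i=1}^{N_s}\Phi_n(r_i(\mathrm{curr})) - \sum_{k=0}^{K}\sum_{i=1}^{N_k}\Phi_n(r_i(\mathrm{mrk})) \tag{4.92} \]

and by the Taylor series

\[ A_n - \sum_{k=0}^{K} A_n^k(t_{\mathrm{mrk}}) = \sum_{i=1}^{N_s}\frac{\partial\Phi_n}{\partial r_i}\left(r_i(\mathrm{curr}) - r_i(\mathrm{mrk})\right) + (\delta r)^T\frac{\partial^2\Phi_n}{\partial r_i\,\partial r_i}(\delta r) + \ldots \tag{4.93} \]

where "mrk" means most recent kick and "curr" stands for the current timestep. We have proved earlier that

\[ r_i(\mathrm{curr}) - r_i(\mathrm{mrk}) = O((t_{\mathrm{curr}} - t_{\mathrm{mrk}})^3) \tag{4.94} \]

By the chain rule,

\[ \frac{d}{dt}A_n^k = \sum_{i=1}^{N_k} v_i\cdot\nabla\Phi_n(r_i) \tag{4.95} \]

\[ \frac{d^2}{dt^2}A_n^k = \sum_{i=1}^{N_k}\left[a_i\cdot\nabla\Phi_n(r_i) + v_i^T\left(\nabla\nabla\Phi_n(r_i)\right)v_i\right] \tag{4.96} \]

Substituting the above three equations into equation (4.89), we obtain

\[ A_n - \sum_{k=0}^{K} A_n^k = O((t_{\mathrm{curr}} - t_{\mathrm{mrk}})^3) - \sum_{k=0}^{K}(t_{\mathrm{curr}} - t_{\mathrm{mrk}})\sum_{i=1}^{N_k} v_i\cdot\nabla\Phi_n(r_i) - \frac{1}{2}\sum_{k=0}^{K}(t_{\mathrm{curr}} - t_{\mathrm{mrk}})^2\sum_{i=1}^{N_k}\left[a_i\cdot\nabla\Phi_n(r_i) + v_i^T\left(\nabla\nabla\Phi_n(r_i)\right)v_i\right] + \ldots \]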


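\[ \begin{aligned} A_n - \sum_{k=0}^{K} A_n^k &= -\sum_{k=0}^{K}(t_{\mathrm{curr}} - t_{\mathrm{mrk}})\sum_{i=1}^{N_k} v_i\cdot\nabla\Phi_n(r_i) - \frac{1}{2}\sum_{k=0}^{K}(t_{\mathrm{curr}} - t_{\mathrm{mrk}})^2\sum_{i=1}^{N_k}\left[a_i\cdot\nabla\Phi_n(r_i) + v_i^T\left(\nabla\nabla\Phi_n(r_i)\right)v_i\right] + O((t_{\mathrm{curr}} - t_{\mathrm{mrk}})^3) \\ &= O(t_{\mathrm{curr}} - t_{\mathrm{mrk}}) \end{aligned} \tag{4.97} \]

Hence,

\[ \Phi_{\mathrm{exact}} - \Phi_{\mathrm{lag}} = O(t_{\mathrm{curr}} - t_{\mathrm{mrk}}) \tag{4.98} \]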
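The level-by-level bookkeeping of equations (4.84), (4.85) and (4.91) can be illustrated with a minimal Python sketch. It is not the implementation used in this thesis: the class name `LaggedCoefficients`, the helper `level_coefficients`, the two radial basis functions and the particle radii are all hypothetical and chosen only for illustration (in particular, the basis shown is not bi-orthogonal).

```python
import math

def level_coefficients(positions, basis):
    """Partial sums A_n^k = sum_i Phi_n(r_i) over the particles of one level."""
    return [sum(phi(r) for r in positions) for phi in basis]

class LaggedCoefficients:
    """Keeps one list of partial coefficients per level; A_n is their sum."""

    def __init__(self, levels, basis):
        self.basis = basis
        self.partial = [level_coefficients(pos, basis) for pos in levels]

    def update_level(self, k, positions):
        # Only level k is recomputed; the other levels keep their lagged values.
        self.partial[k] = level_coefficients(positions, self.basis)

    def total(self):
        # A_n = A_n^0 + A_n^1 + ... + A_n^K
        n_basis = len(self.basis)
        return [sum(level[n] for level in self.partial) for n in range(n_basis)]

# Hypothetical radial basis functions, for illustration only.
basis = [lambda r: 1.0 / math.sqrt(1.0 + r * r),
         lambda r: r / (1.0 + r * r)]

levels = [[0.1, 0.2], [1.0, 1.5, 2.0]]   # radii of the particles in levels 0 and 1
coeffs = LaggedCoefficients(levels, basis)
coeffs.update_level(0, [0.11, 0.19])     # level 0 has moved; level 1 stays lagged
lagged_total = coeffs.total()
```

Updating level 0 costs a sum over two particles rather than five, which is the saving the partition is meant to provide; the price is that the level 1 contribution refers to positions at its most recent kick.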


CHAPTER 5 NUMERICAL EXPERIMENTS

5.1 Generation of Samples from DF

We use the inverse transform sampling technique and the acceptance/rejection method to generate the sample particles from the DF.

Inverse transform sampling method

Inverse transform sampling is a basic method for pseudo-random number sampling, i.e. for generating sample numbers at random from any probability distribution given its cumulative distribution function (CDF). It takes uniform samples of a number $u$ between 0 and 1, interpreted as a probability, and then returns the largest number $x$ from the domain of the distribution $P(X)$ such that $P(-\infty < X < x) \le u$.

Proposition 1. If $Y$ has a uniform distribution on $[0, 1]$, $U(0, 1)$, and if $X$ has a cumulative distribution function $F_X$, then the random variable $F_X^{-1}(Y)$ has the same distribution as $X$.

The inverse transform sampling method works as follows:

1. Generate a random number $u_i$ from the standard uniform distribution in the interval $(0, 1)$, $U(0, 1)$.

2. Compute the value $x_i$ such that $F_X(x_i) = u_i$. Thus, $x_i = F_X^{-1}(u_i)$.

3. Take $x_i$ to be the random number drawn from the distribution described by $F_X$.
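The three steps above can be sketched in Python for a distribution whose CDF inverts in closed form. The exponential distribution used here, with $F_X(x) = 1 - e^{-\lambda x}$, is an assumption made purely for illustration and is not one of the distributions sampled in this thesis; the function name `sample_exponential` is likewise hypothetical.

```python
import math
import random

def sample_exponential(rate, rng):
    """Draw one sample by inverting F(x) = 1 - exp(-rate * x)."""
    u = rng.random()                    # step 1: u drawn from U(0, 1)
    return -math.log(1.0 - u) / rate    # step 2: x = F^{-1}(u)

rng = random.Random(0)                  # fixed seed so the run is repeatable
samples = [sample_exponential(2.0, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)   # should lie close to 1 / rate
```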

Acceptance/Rejection sampling method

The acceptance/rejection sampling method generates sampling values from a target distribution $X$ with arbitrary probability density function (PDF) $f(x)$ by using a proposal distribution $Y$ with probability density $g(x)$. The idea is that we can generate a sample value from $X$ by instead sampling from $Y$ and accepting the sample from $Y$ with probability $\frac{f(x)}{M g(x)}$, repeating the draws from $Y$ until a value is accepted. Here, $M$ is a finite constant that bounds the likelihood ratio $\frac{f(x)}{g(x)}$.


5.1.1 Position Distribution

We generate the sample of stars using the Plummer potential. As described in Chapter 3, the Plummer model describes a spherical stellar system, and the Plummer potential and corresponding density distribution are

\[ \Phi(r) = -\frac{GM}{\sqrt{a^2 + r^2}} \tag{5.1} \]

\[ \rho(r) = \frac{3Ma^2}{4\pi}\,\frac{1}{(r^2 + a^2)^{5/2}} \tag{5.2} \]

The position of the particle, $\mathbf{x} = (x, y, z)$, in spherical coordinates $(r, \theta, \phi)$ is

\[ \begin{aligned} x &= r\sin(\phi)\cos(\theta) \\ y &= r\sin(\phi)\sin(\theta) \\ z &= r\cos(\phi) \end{aligned} \tag{5.3} \]

where $r$ is distributed by the mass distribution function $M(r)$ on $[0, \infty)$, $\theta$ is distributed by

\[ f_\theta(\theta) = \frac{1}{2\pi} \quad \text{on } [0, 2\pi] \tag{5.4} \]

and $\phi$ is distributed by

\[ f_\phi(\phi) = \frac{1}{2}\sin(\phi) \quad \text{on } [0, \pi] \tag{5.5} \]

We find the mass distribution using Newton's theorem. According to Newton's theorem,

\[ M(r) = 4\pi\int_0^r dr'\, r'^2 \rho(r') \tag{5.6} \]

Therefore the mass distribution, or cumulative distribution function (CDF), of the Plummer potential can be found by evaluating the following integral:

\[ M(r) = 4\pi\int_0^r dr'\, \frac{3Ma^2}{4\pi}\,\frac{r'^2}{(r'^2 + a^2)^{5/2}} = \frac{Mr^3}{(r^2 + a^2)^{3/2}} \tag{5.7} \]


For simplicity, let $M = 1$. Thus, the mass distribution function of the Plummer potential is

\[ M(r) = \frac{r^3}{(r^2 + a^2)^{3/2}} \tag{5.8} \]

We need to generate $r$, $\theta$ and $\phi$ to calculate the position, and we use the inverse transform method to do so. The inverse of $M(r)$ is obtained by the following algebra. Let $\mu = M(r) = \frac{r^3}{(r^2 + a^2)^{3/2}}$. Squaring both sides, we obtain

\[ \begin{aligned} \mu^2 &= \frac{r^6}{(r^2 + a^2)^3} \\ (r^2 + a^2)^3\mu^2 &= r^6 \\ (r^6 + 3r^4a^2 + 3r^2a^4 + a^6)\mu^2 &= r^6 \\ (1 - \mu^2)r^6 - 3r^4a^2\mu^2 - 3r^2a^4\mu^2 - a^6\mu^2 &= 0 \\ (1 - \mu^2)\frac{r^6}{a^6} - 3\mu^2\frac{r^4}{a^4} - 3\mu^2\frac{r^2}{a^2} - \mu^2 &= 0 \end{aligned} \tag{5.9} \]

Let $t = \frac{r}{a}$. Then equation (5.9) becomes

\[ (1 - \mu^2)t^6 - 3\mu^2t^4 - 3\mu^2t^2 - \mu^2 = 0 \tag{5.10} \]

If $\lambda = t^2$,

\[ (1 - \mu^2)\lambda^3 - 3\mu^2\lambda^2 - 3\mu^2\lambda - \mu^2 = 0 \tag{5.11} \]

We can find $\lambda(\mu)$ by solving equation (5.11). Therefore, $r = at = a\sqrt{\lambda(\mu)}$, which is the inverse of $M(r)$.

We find the inverse distribution functions of $\theta$ and $\phi$ from equations (5.4) and (5.5). As described above, we may generate $r$ using the inverse DF:

\[ r_i = M^{-1}(u_i) \tag{5.12} \]

where $r_i$ is the radius of the $i$th particle and $u_i$ is a number chosen randomly from the uniform distribution $U(0, 1)$.

Similarly, the angles $\theta_i$ and $\phi_i$ can be obtained as

\[ \theta_i = f_\theta^{-1}(u_i) \tag{5.13} \]

\[ \phi_i = f_\phi^{-1}(u_i) \tag{5.14} \]

where $f_\theta^{-1}$ and $f_\phi^{-1}$ denote the inverses of the cumulative distribution functions belonging to (5.4) and (5.5), and a fresh uniform number is drawn for each coordinate. Thus, we are able to generate the position $\mathbf{x}$ of the desired sample.
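The position sampling can be sketched in Python as follows. The sketch takes a shortcut that is not part of the derivation above: since (5.8) gives $\mu^{2/3} = r^2/(r^2 + a^2)$, the positive root of the cubic (5.11) is $\lambda = 1/(\mu^{-2/3} - 1)$, so $r = a/\sqrt{\mu^{-2/3} - 1}$ without a numerical root finder. The angle formulas $\theta = 2\pi u$ and $\phi = \arccos(1 - 2u)$ come from integrating (5.4) and (5.5) and inverting. The function names are hypothetical.

```python
import math
import random

def plummer_radius(mu, a=1.0):
    """Invert M(r) = r^3 / (r^2 + a^2)^(3/2) for 0 < mu < 1."""
    return a / math.sqrt(mu ** (-2.0 / 3.0) - 1.0)

def sample_position(rng, a=1.0):
    """Return (x, y, z) for one star drawn from the Plummer mass distribution."""
    mu = rng.random()
    while mu == 0.0:                      # mu = 0 would divide by zero above
        mu = rng.random()
    r = plummer_radius(mu, a)
    theta = 2.0 * math.pi * rng.random()         # inverse CDF of f_theta
    phi = math.acos(1.0 - 2.0 * rng.random())    # inverse CDF of f_phi
    return (r * math.sin(phi) * math.cos(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(phi))

rng = random.Random(1)
stars = [sample_position(rng) for _ in range(1000)]
```

A separate uniform number is drawn for each of $r$, $\theta$ and $\phi$ so that the three coordinates are independent.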

Figure 5.1: Plummer sphere for 1000 stars 2D and 3D plot

5.1.2 Velocity Distribution

The velocity $\mathbf{v} = (v_x, v_y, v_z)$ in spherical coordinates is given as follows.

\[ \begin{aligned} v_x &= v\sin(\phi)\cos(\theta) \\ v_y &= v\sin(\phi)\sin(\theta) \\ v_z &= v\cos(\phi) \end{aligned} \tag{5.15} \]


where $v = \|\mathbf{v}\|_2$. The speed $v$ of a particle has the distribution with PDF

\[ f(r, v) = Cv^2\left(v_e^2(r) - v^2\right)^{7/2} \tag{5.16} \]

where $v_e(r)$ is the escape velocity and $C$ is a constant. This DF is obtained using the polytropes of the Plummer model [8].

Writing $v = v_e(r)\,w$, the variable $w$ is distributed by

\[ g(w) = w^2(1 - w^2)^{7/2} \tag{5.17} \]

The value of $w$ is chosen by the accept/reject method. We choose the target distribution to be the probability density function $g(w)$ and the proposal distribution to be $U(0, 1)$, the uniform distribution over the interval $(0, 1)$. Let $p_{\max}$ be the maximum of the function $g(w)$. The algorithm to obtain a sample from the distribution with density $g(w)$ using samples from $U(0, 1)$ is as follows:

Table 5.1: Algorithm of accept/reject method

w = random(0, 1)
u = random(0, 1)
Y = g(w) / pmax
if u ≤ Y
    accept w
else
    reject w
    repeat
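A minimal Python sketch of Table 5.1 is given below. The maximum of $g(w) = w^2(1 - w^2)^{7/2}$ is evaluated analytically, which is an addition to the text: setting $g'(w) = 0$ gives $w^2 = 2/9$. The names `g`, `P_MAX` and `sample_w` are hypothetical.

```python
import random

def g(w):
    """Unnormalised density of the scaled speed w = v / v_e."""
    return w * w * (1.0 - w * w) ** 3.5

# g'(w) = 0 gives w^2 = 2/9, so the maximum is g(sqrt(2/9)).
P_MAX = g((2.0 / 9.0) ** 0.5)

def sample_w(rng):
    """Repeat the accept/reject test until a value of w is accepted."""
    while True:
        w = rng.random()          # proposal drawn from U(0, 1)
        u = rng.random()          # uniform number for the acceptance test
        if u <= g(w) / P_MAX:
            return w

rng = random.Random(2)
scaled_speeds = [sample_w(rng) for _ in range(5000)]
```

Multiplying each accepted `w` by the escape velocity at the star's radius then gives the speed $v$.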

We can find $v$ using the equation

\[ v = v_e(r)\,w \tag{5.18} \]

where

\[ v_e(r) = \sqrt{\frac{2G}{\sqrt{a^2 + r^2}}} \tag{5.19} \]

The quantities $r$, $\theta$ and $\phi$ are generated using the inverse transform sampling method as described in Section 5.1.1. Thus, the velocity $\mathbf{v}$ can be found from equation (5.15).

5.2 Experiments

We are able to generate the sample from the DF as described above. We first consider motion in a static, spherically symmetric gravitational field. Such fields are appropriate for globular clusters, which are usually nearly spherical. The motion of a star in a centrally directed gravitational field is greatly simplified by the law of conservation of angular momentum. Thus if

\[ \mathbf{r} = r\,\hat{e}_r \tag{5.20} \]

denotes the position vector of the star with respect to the center, and the radial acceleration is

\[ \mathbf{g} = g(r)\,\hat{e}_r, \tag{5.21} \]

the equation of motion of the star is

\[ \frac{d^2\mathbf{r}}{dt^2} = g(r)\,\hat{e}_r. \tag{5.22} \]

The algorithm to generate the orbit of a single star using Verlet integration is given below.


Table 5.2: Algorithm of Verlet integration

1. Generate a star

2. Find a

3. for i = 1 : n
       kick by (h/2) a
       drift by h v
       compute a at the new position
       kick by (h/2) a
   end
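The kick-drift-kick loop of Table 5.2 can be sketched in Python for a single star in the Plummer potential (5.1), whose acceleration is $\mathbf{a}(\mathbf{x}) = -GM\mathbf{x}/(r^2 + a^2)^{3/2}$. This is an illustrative sketch rather than the code used for the experiments; the function names, the units $G = M = a = 1$, the initial condition and the step size are assumptions.

```python
import math

def accel(x, G=1.0, M=1.0, a=1.0):
    """Acceleration in the Plummer potential: -G M x / (r^2 + a^2)^(3/2)."""
    r2 = sum(c * c for c in x)
    f = -G * M / (r2 + a * a) ** 1.5
    return [f * c for c in x]

def verlet_orbit(x, v, h, n_steps):
    """Advance one star by n_steps kick-drift-kick steps of size h."""
    x, v = list(x), list(v)
    acc = accel(x)
    for _ in range(n_steps):
        v = [vi + 0.5 * h * ai for vi, ai in zip(v, acc)]   # half kick
        x = [xi + h * vi for xi, vi in zip(x, v)]           # drift
        acc = accel(x)                                      # force at new position
        v = [vi + 0.5 * h * ai for vi, ai in zip(v, acc)]   # half kick
    return x, v

def energy(x, v, G=1.0, M=1.0, a=1.0):
    """Kinetic plus potential energy per unit mass."""
    r2 = sum(c * c for c in x)
    return 0.5 * sum(c * c for c in v) - G * M / math.sqrt(r2 + a * a)

def angular_momentum_z(x, v):
    """z-component of x cross v."""
    return x[0] * v[1] - x[1] * v[0]

x0, v0 = [1.0, 0.0, 0.0], [0.0, 0.5, 0.0]    # assumed bound initial condition
x1, v1 = verlet_orbit(x0, v0, h=0.01, n_steps=2000)
energy_drift = energy(x1, v1) - energy(x0, v0)
```

Only one force evaluation is needed per step, because the acceleration computed after the drift is reused for the first half kick of the next step.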

We know that the cross product of any vector with itself is zero. Therefore, we have

\[ \frac{d}{dt}\left(\mathbf{r}\times\frac{d\mathbf{r}}{dt}\right) = \frac{d\mathbf{r}}{dt}\times\frac{d\mathbf{r}}{dt} + \mathbf{r}\times\frac{d^2\mathbf{r}}{dt^2} = r\,\hat{e}_r\times g(r)\,\hat{e}_r = 0 \tag{5.23} \]

This implies that $\mathbf{r}\times\dot{\mathbf{r}}$ is some constant vector, say $\mathbf{J}$:

\[ \mathbf{r}\times\frac{d\mathbf{r}}{dt} = \mathbf{J} \tag{5.24} \]

Of course, $\mathbf{J}$ is the angular momentum per unit mass, a vector perpendicular to the plane defined by the star's instantaneous position and velocity vectors. Since this vector is constant, we conclude that the star moves in a plane, the orbital plane. The sum of the kinetic and potential energy is conserved along the orbit:

\[ \text{Kinetic Energy} + \text{Potential Energy} = \text{constant} = E \tag{5.25} \]

The constant $E$ is simply the numerical value of the Hamiltonian, which we refer to as the energy of the orbit. In the phase space $(\mathbf{x}, \mathbf{v})$, a constant of motion in a given force field is any function $C(\mathbf{x}, \mathbf{v}; t)$ of the phase-space coordinates and time that is constant along a stellar orbit.


Figure 5.2: Orbit of a single star

Also, an integral of motion $I(\mathbf{x}, \mathbf{v})$ is any function of the phase-space coordinates alone that is constant along an orbit. Any orbit in any force field always has six independent constants of motion, since the phase space has six dimensions. When five independent integrals are taken into account, the orbit in phase space reduces to a one-dimensional curve [8]. Figure 5.2 can be regarded as a projection of this curve.

The total angular momentum $J = \|\mathbf{x}\times\mathbf{v}\|_2$ is almost constant along the orbit, as shown in Figure 5.3.


Figure 5.3: Angular momentum of a single star


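In any static potential $\Phi(\mathbf{x})$, the Hamiltonian $H(\mathbf{x}, \mathbf{v}) = \frac{1}{2}\|\mathbf{v}\|_2^2 + \Phi$, whose numerical value is the energy $E$, is constant along the exact orbit. Since the numerical integrator is symplectic, the error in the numerical energy remains bounded.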

Figure 5.4: Total Energy of a single star

A stellar system may consist of billions of stars. Our next interest is whether the Verlet integrator works for many stars. To check this, we want to see the global error decreasing at the expected rate. The global error (GE) of the Verlet integrator is $O(h^2)$. Thus, if the method works properly, the global error must decrease as $O(h^2)$. That is,

\[ GE = O(h^2) \tag{5.26} \]

We can test this by calculating the GE with several constant timesteps $\{h_1, h_2, h_3, \ldots, h_n\}$. The GE changes when the timestep changes. The hypothesis is

\[ GE_i = Ch_i^2 \tag{5.27} \]

where $C$ is a constant. Taking the logarithm of both sides, we obtain

\[ \log(GE_i) = \log(C) + 2\log(h_i) \tag{5.28} \]

This is the equation of a line, $y = mx + b$. Therefore, when we graph $y = \log(GE_i)$ vs $x = \log(h_i)$, the slope of the line, $m$, gives the order of the GE. The graph of $\log(GE_i)$ vs $\log(h_i)$ in Verlet integration for 100 stars is shown below.
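The slope test of equation (5.28) can be sketched in Python on a problem whose exact solution is known. The harmonic oscillator $\ddot{x} = -x$ used here is an assumption made so that the global error can be computed exactly; it is not the 100-star Plummer experiment reported in this chapter. The function names and the step counts are hypothetical.

```python
import math

def verlet_oscillator(h, n_steps):
    """Verlet for x'' = -x with x(0) = 1, v(0) = 0; the exact solution is cos(t)."""
    x, v = 1.0, 0.0
    a = -x
    for _ in range(n_steps):
        v += 0.5 * h * a      # half kick
        x += h * v            # drift
        a = -x                # force at the new position
        v += 0.5 * h * a      # half kick
    return x, v

def fitted_slope(xs, ys):
    """Least-squares slope m of the line y = m x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

T = 1.0
log_h, log_ge = [], []
for n_steps in [50, 100, 200, 400, 800]:
    h = T / n_steps
    x, v = verlet_oscillator(h, n_steps)
    ge = math.hypot(x - math.cos(T), v + math.sin(T))   # global error at time T
    log_h.append(math.log(h))
    log_ge.append(math.log(ge))

order = fitted_slope(log_h, log_ge)   # estimated order of the global error
```

For a second-order method the fitted slope should be close to 2; a markedly different value would point to an error in the integrator or in the error measurement.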

Figure 5.5: The order of the Global Error I


Since $h = \dfrac{\text{time}\,(T)}{\text{no. of steps}\,(nSteps)}$, equation (5.28) gives

\[ \log(GE_i) = \log(C) + 2\log\left(\frac{T}{nSteps}\right) \]

\[ \log(GE_i) = \underbrace{\log(C) + 2\log(T)}_{\text{constant}} - 2\log(nSteps) \tag{5.29} \]

Equation (5.29) is also the equation of a line, whose slope is the negative of the order of the GE, namely $-2$. The graph of $\log(GE_i)$ vs $\log(nSteps)$ in Verlet integration for 100 stars is shown in Figure 5.6.

Figure 5.6: The order of the Global Error II

Thus, Figures 5.5 and 5.6 imply that the Verlet integration is working properly for many stars. The graphs of total energy and total angular momentum for 100 stars with the timestep 0.1 are given in the following figures. As for the single star, the total energy and total angular momentum are conserved.

Figure 5.7: The Total Energy of many stars


Figure 5.8: The Total Energy and Angular Momentum of many stars


A star cluster contains many stars bound by gravitation, and it is characterized by a wide range in density. Therefore, the crossing time of orbits near the center is much smaller than the crossing time in the outer envelope. Thus, it is not good to use the same timestep for all stars, and we use an individual timestep (ITS) method. The BTS Verlet method described in Chapter 4 has GE $O(h^2)$. To check the accuracy of the ITS method we can use the same technique that we used before.

We calculate the GE with several timesteps $\{h_1, h_2, h_3, \ldots, h_n\}$, and the GE at each timestep is $O(h^2)$. Therefore,

\[ GE_i = Ch_i^2 \]

\[ \log(GE_i) = \log(C) + 2\log(h_i) \tag{5.30} \]

Thus, the plot of $\log(GE_i)$ vs $\log(h_i)$ should have slope 2.

Figure 5.9: The GE of BTS Verlet method with several timestep


Figure 5.9 shows the plot of $\log(GE)$ vs $\log(h)$ for 256 stars with 4 classes in the BTS Verlet method. The slope is 2.05, and therefore the BTS Verlet method is working properly. The running time of the BTS Verlet method is less than that of the Verlet method. The number of force evaluations of each method is shown in the table below.

Table 5.3: Number of force evaluations

METHOD        NO. OF FORCE EVALUATIONS
Verlet        1300736
BTS Verlet    828657

Thus, the BTS Verlet method is more efficient than the Verlet method: it needs about 64% of the force evaluations of the Verlet method.
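The source of the saving can be illustrated with a small counting sketch in Python. It assumes that a particle in class $k$ needs one force evaluation every $2^k$ base steps plus one initial evaluation; this counting model, the function name and the class populations are assumptions for illustration and are not taken from the experiment. The choice of 5080 base steps is also an assumption, made because 256 particles with 5081 evaluations each reproduce the tabulated Verlet count.

```python
def force_evaluations(class_sizes, n_base_steps):
    """Count force evaluations when class k takes steps of size 2^k * h.

    class_sizes[k] is the number of particles in class k. Each particle is
    charged one initial evaluation plus one evaluation per step of its class.
    """
    total = 0
    for k, n_k in enumerate(class_sizes):
        total += n_k * (n_base_steps // 2 ** k + 1)
    return total

# All 256 particles in class 0 corresponds to the ordinary Verlet method.
verlet_count = force_evaluations([256], 5080)

# Hypothetical equal split of the 256 particles over 4 classes.
bts_count = force_evaluations([64, 64, 64, 64], 5080)
```

The actual count for the BTS Verlet method depends on how many stars fall into each class, so the equal split above should not be expected to reproduce the tabulated BTS figure.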


CHAPTER 6 CONCLUSION

N−body simulations are used to simulate stellar systems such as star clusters, galaxies, and galaxy clusters. One of the most widely used timestepping algorithms for N−body simulations is the Verlet method. Because of the wide range of orbital timescales found in a stellar system, the block timestep (BTS) modification to the Verlet algorithm is often used. While the behavior of the standard Verlet method has been thoroughly studied both analytically and experimentally, we are unaware of any systematic theoretical analysis of the BTS Verlet algorithm. In this thesis, we have done a formal error analysis of the BTS Verlet algorithm and, additionally, studied the effect of approximate "lagged" force evaluations on this algorithm.

The numerical orbit integrator used to solve the N−body problem is described in Chapter 2. The Verlet integrator is widely used as a numerical integrator because of its appealing features, such as good numerical stability, time reversibility and preservation of the symplectic form on phase space. It is also economical of memory, since it needs no storage of previous timesteps. The GE of the Verlet integrator is $O(h^2)$. The density in a stellar system varies with distance from its center. Thus, we assign an individual timestep to each star. The BTS Verlet method has the same advantages as the Verlet method except symplecticity: the BTS Verlet method is not symplectic, since the timestep is varied.

In Chapter 3, the methods to compute the force are explained. To find the force, we need to solve the Poisson equation, $\nabla^2\Phi = 4\pi G\rho$. We may solve it analytically under simplifying assumptions, such as spherical symmetry, or by numerical methods. The efficiency of the simulation can be improved by reducing the cost of the force calculation.

The detailed algorithm is explained in Chapter 4. The error analyses with exact forces and with lagged forces are also described in that chapter. The lagged force calculation is a new N−body solver which is developed using the functional expansion method. The GE of the BTS Verlet method with exact forces is $O(h^2)$. We can maintain the same error using the lagged force. The detailed proofs are given in Chapter 4.

The numerical experiments are given in Chapter 5. We have experimented with both the Verlet and BTS Verlet methods using the Plummer potential and have concluded that both methods are second-order integrators.

6.1 Future Work

We have analyzed the lagged force calculation analytically. The next step is to test the lagged force calculation in numerical experiments.


BIBLIOGRAPHY

[1] Aarseth S.J., Gravitational N−Body Simulations: Tools and Algorithms, Cambridge University Press.

[2] Aarseth S.J., From NBODY1 to NBODY6: The Growth of an Industry, PASP 111, 1333

[3] Ahmad A., Cohen L., A Numerical Integration Scheme for the N−Body Gravitational Problem, Hunter College of the City University of New York, 1972

[4] Anninos, P., Norman, M. L., Clarke, D. A., Hierarchical numerical cosmology with hydrodynamics: Methods and code tests, Astrophysical Journal, Part 1 (ISSN 0004-637X), vol. 436, no. 1, p. 11-22

[5] Bagla J.S., Padmanabhan T., Cosmological N-Body Simulations, Cosmological N-Body Simulations

[6] Barnes, J., Hut P., A hierarchical O(NlogN) force-calculation algorithm, Nature 324(6096):446-449 (1986)

[7] Benettin G., Giorgilli A., On the Hamiltonian interpolation of near-to-the-identity symplectic mappings with application to symplectic integration algorithms, J Stat Phys (1994) 74: 1117

[8] Binney J., Tremaine S., Galactic Dynamics, Princeton University Press, 2008.

[9] Chan, K. L., Chau, W. Y., Jessop, C., Jorgenson, M., Groupp, W. D., Multigrid, particle-mesh scheme for N-body simulation, Journal of the Royal Astronomical Society of Canada, Vol. 80, No. 5, p. 279, 1986

[10] Dehnen W., Read J.I., N-body simulations of gravitational dynamics, 2011, Eur. Phys. J. Plus 126, 55

[11] Giersz M., Monte Carlo Simulations of Star Clusters, Dynamics of Star Clusters and the Milky Way, ASP Conference Series, Vol. 228.

[12] Greengard L., Rokhlin V., A Fast Algorithm for Particle Simulations, Journal of Computational Physics 73 (2), 325-348

[13] Harfst S., Gualandris A., Merritt D., Spurzem R., Zwart S.P., Berczik P., Performance Analysis of Direct N−Body Algorithms on Special-Purpose Supercomputers, New Astron. 12:357-377, 2007


[14] Hairer E., Lubich C., Wanner G., Geometric numerical integration illustrated by the Stormer-Verlet Method, Cambridge University Press, 2003.

[15] Hairer E., Lubich C., Wanner G., Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations, Springer, 2006.

[16] Hayli, A., Les Nouvelles Méthodes de la Dynamique Stellaire, 1967

[17] Hayli, A., Numerical Solution of Ordinary Differential Equations, 1974

[18] Heggie, D. C., Towards an N−body model for the globular cluster M4, MNRAS 445, 3435-3443 (2014)

[19] Hernquist L., An Analytical Model for Spherical Galaxies and Bulges, The Astrophysical Journal, 356:359-364, 1990

[20] Hernquist L., Ostriker J.P., A self-consistent field method for galactic dynamics, Astrophysical Journal, Part 1 (ISSN 0004-637X); 386; 375-397, 1992.

[21] Hubble E. P., Extra-Galactic nebulae, Astrophysical Journal, 64, 321-369 (1926)

[22] Hut P., Makino J., McMillan S., Building a better leapfrog, Astrophysical Journal, Part 2 - Letters (ISSN 0004-637X), vol. 443, no. 2, p. L93-L96 (1995)

[23] Kinoshita, H., Nakai, H., Numerical Integration Methods in Dynamical Astronomy, and Dynamical Astronomy, Vol. 45, p. 231 (1989)

[24] Konstantinidis S., Kokkotas K.D., MYRIAD: A new N-body code for simulations of star clusters, A&A Volume 522, November 2010, A70

[25] Kotovych O., Bowman J.C., An Exactly Conservative Integrator for the N−Body Problem, J. Phys. A: Math. Gen, 2002

[26] Londrillo P., Nipoti C., N−MODY: a code for collisionless N-Body simulations in modified Newtonian dynamics, arXiv:0803.4456 [astro-ph]

[27] Makino J., Optimal Order and Time-Step Criterion for AARSETH-Type N−body Integrators, The Astrophysical Journal, 369:200-212, 1991

[28] Makino, J., Hut, P., Performance Analysis of Direct N-Body Calculations, Astrophysical Journal Supplement Series (ISSN 0067-0049), vol. 68, Dec. 1988

[29] Makino J., Hut P., Kaplan M., Saygin H., A Time-Symmetric Block Time-Step Algorithm for N−Body Simulations, New Astron. 12 (2006) 124-133


[30] Mazur A.K., Common Algorithms Revisited: Accuracy and Optimal timesteps of Störmer-Leapfrog Integrators, J. Comp. Phys. (1997) 136(2) 354-365

[31] Meyer, Kenneth, Hall, Glen, Offin, Daniel C., Introduction to Hamiltonian Dynamical Systems and the N−Body Problem, Springer 2009

[32] Saha P., Designer basis functions for potentials in galactic dynamics, Monthly Notices of the Royal Astronomical Society (ISSN 0035-8711), vol. 262, no. 4, p. 1062-1064

[33] Sparke L.S., Gallagher J.S. III, Galaxies in the Universe: An Introduction, Cambridge University Press, 2007

[34] Splinter R.J., A Nested Grid Particle-Mesh Code for High Resolution Simulations of Gravitational Instability in Cosmology, Mon.Not.Roy.Astron.Soc. 281 (1996)

[35] Süli E., Mayers D.F., An Introduction to Numerical Analysis, Cambridge University Press, 2003

[36] Tang H., N-Body Simulation Using Tree Codes, Final Report for Class Project ECE 572

[37] Teyssier R., Cosmological hydrodynamics with adaptive mesh refinement: A new high resolution code called RAMSES, Astronomy and Astrophysics, v.385, p.337-364 (2002)

[38] Teyssier R., Pires S., Prunet S., Aubert D., Pichon C., Amara A., Benabed K., Colombi S., Refregier A., Starck J.-L., Full-sky weak-lensing simulation with 70 billion particles, A&A, 497 2 (2009) 335-341

[39] Trenti M., Hut P., Gravitational N−body Simulations, published in Scholarpedia, 3(5):3930 accepted May 20, 2008

[40] Wang, Qiudong, The global solution of the N-body problem, Celestial Mechanics and Dynamical Astronomy. 50 (1): 73-88 (1991)

[41] Yoshida, Haruo, Recent Progress in the Theory and Application of Symplectic Integrators, Celestial Mechanics and Dynamical Astronomy, Volume 56, Issue 1-2, pp. 27-43
