Perturbation Theories in Astrophysics: From Large-Scale Structure To Compact Objects
Dissertation
Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University
By
Xiao Fang, B.S., M.S.
Graduate Program in Physics
The Ohio State University
2018
Dissertation Committee:
Christopher M. Hirata, Advisor
Todd A. Thompson
John F. Beacom
Amy L. Connolly
Stuart A. Raby
© Copyright by
Xiao Fang
2018
Abstract
Although the ΛCDM model has been very successful in describing our observed Universe, important puzzles remain. Current and future cosmology analyses aim to test the ΛCDM model, understand the nature of dark matter and dark energy, and learn about the beginning of the Universe. All of these require better modeling of cosmological probes. In this thesis, I focus on improving cosmology analyses in different ways. On the one hand, ongoing and upcoming large-scale structure surveys provide the opportunity to measure the properties of the Universe to unprecedented precision. On the other hand, compact objects, responsible for some of the most violent phenomena observable on cosmological scales, play a unique role in determining the expansion history of the Universe. I present our work on developing accurate and extremely efficient tools for analyzing next-generation cosmology datasets, and on characterizing systematic uncertainties associated with using compact objects as cosmological probes.
Acknowledgments
I would like to thank all my collaborators, without whom the research contained in this thesis and in the more complete publication list would not have been possible. Their names are given in the publication author lists, and some of them are also mentioned below.
I would like to thank Prof. Chris Hirata for being a great advisor and an incredible source of inspiration. His confidence encourages me whenever he says “there is always a way” to solve a difficult problem. His broad knowledge and deep understanding of astronomy and physics constantly influence the way I think and stimulate my passion for learning and doing research. Chris has always been caring, humble, patient, and excited about questions, and he was also very helpful and supportive during my postdoc search. I would not have gone this far without his inspiration and guidance.
I would like to thank Prof. Todd Thompson for co-advising me for the last two years. He has introduced me to a broader area of astrophysics and shown me a very different way of thinking about and solving problems in astronomy. His amazingly fast order-of-magnitude estimates often strike me and encourage me to build my sense of numbers and physical pictures. Working on quadruple system dynamics with him and Chris has also fulfilled my dream, held since I was young, of studying the N-body problem. Todd was also very helpful and supportive during my job season, and has shared many useful tips to improve my presentation and general communication skills, which I will cherish and benefit from for my entire career.
Studying in the physics department of The Ohio State University has been full of joy for me, and one of the major sources of that joy is the encouraging and interactive environment of CCAPP. Thanks to everyone here! Specifically, I wish to thank Prof. Annika Peter for providing me with good opportunities to give talks and for giving me feedback on them, and for putting great effort into working with the OSC to provide us with invaluable computing resources, without which a large portion of my research would have been impossible. I wish to thank Prof. John Beacom for helping me with my talks and papers, giving me a great deal of research and career advice, and managing CCAPP. Michael Troxel has always been extremely patient and helpful when I have had technical questions, and was also very helpful during my postdoc search. I have enjoyed working with Jonathan Blazek and Joe McEwen, and I still miss those sunny weekend afternoons when we worked together on FAST-PT in a coffee shop downtown. In addition, I want to thank Joe for helping me with my English, discussing physics with me, and introducing me to Fortran. I want to thank Daniel Martens and Paulo Montero for being good friends, travel buddies, office mates, and collaborators. I would like to thank Shirley Li for helping me with my presentations and job applications, and for being a very good friend. I also want to thank Niall MacCrann, Ashley Ross, Heidi Wu, Ami Choi, Tuguldur Sukhbold, Ben Wibking, Ben Buckman, and others for their help in my research, and all other current and previous CCAPP postdocs and students for their help and friendship during my five years at OSU.
I would like to thank the many friends I have made at OSU. Kengyuan Meng, Weifeng Ji, and Dan Gao, I feel lucky to have had you as my roommates for years, and I have enjoyed all the trips we have taken and all the good food we have tried together. I want to thank Bowen Shi and Fuyan Lu for all the interesting discussions we have had during lunches and dinners, and for the road trips we have taken.
There are many friends outside OSU who have helped me a lot, including Yacine Ali-Haïmoud, Elisabeth Krause, Vera Gluscevic, Tejaswi Venumadhav, Antonija Oklopčić, and He Chen. I really appreciate their help during my visits and the job season, and have enjoyed collaborating with many of them.
I must also thank the many people I met during my undergraduate years who led me down this path. I would like to thank Yujing Qin and Prof. Yanghui He for encouraging me to do research in cosmology. I would also like to thank Profs. Luca Amendola and Xinhe Meng and their group members, who introduced me to the subject, showed me how research is done, and made me realize this was what I wanted to work on.
I couldn’t be more grateful to my parents Kaihe Fang and Baozhen Fang, and to my surviving grandma Xiuhua Wang. One day in early 1997, my dad decided to buy all six issues of Amateur Astronomy from 1996, a giant Encyclopedia of Astronomy, and a 50mm refractor, which became my favorite toys and my starting point for learning cosmology and astrophysics. I was intrigued by the striking pictures in the astronomy books and amazed by astronomers’ ability to calculate the properties of stars and galaxies, which drove me to learn maths and physics. During the same year there was a partial solar eclipse in the area, which drew me even further toward space. I found more astronomy books months after witnessing, by chance, the “longest total lunar eclipse in the 21st century” in the summer of 2000, and gained regular access to the Internet a few years later, before my sixth grade, which really opened my eyes. Although it was very difficult to find astronomy resources in a small inland city in China in the 1990s, I feel very lucky to have had the chance to learn about astronomy, to witness some magnificent astronomical events, and to have great parents who fully supported my hobbies and watched stars and meteor showers with me. I thank them for making my childhood full of fond memories, and for setting up all the necessary initial conditions for my exciting journey to studying the Universe.
During my PhD, I have been supported by the Department of Physics, the Simons Foundation, and NSF grant 1313252. Many computations in our work were run on the CCAPP condo of the Ruby Cluster at the Ohio Supercomputer Center.
Vita
2013: B.S., Nankai University
2015: M.S., The Ohio State University
Publications
Research Publications
“FAST-PT: a Novel Algorithm to Calculate Convolution Integrals in Cosmological Perturbation Theory” J. McEwen, X. Fang, C. Hirata, J. Blazek Journal of Cosmology and Astroparticle Physics, Volume 2016, Issue 09, 015 (2016)
“A New Probe of Magnetic Fields in the Pre-reionization Epoch: II. Detectability” V. Gluscevic, T. Venumadhav, X. Fang, C. Hirata, A. Oklopčić, A. Mishra Physical Review D, Volume 95, Issue 8, 083011 (2017)
“FAST-PT II: an Algorithm to Calculate Convolution Integrals of General Tensor Quantities in Cosmological Perturbation Theory” X. Fang, J. Blazek, J. McEwen, C. Hirata Journal of Cosmology and Astroparticle Physics, Volume 2017, Issue 02, 030 (2017)
“Beyond Linear Galaxy Alignments” J. Blazek, N. MacCrann, M. Troxel, X. Fang Submitted to Physical Review D
“Dynamics of Quadruple Systems Composed of Two Binaries: Stars, White Dwarfs, and Implications for Ia Supernovae” X. Fang, T. Thompson, C. Hirata Monthly Notices of the Royal Astronomical Society, Volume 476, Issue 3, Page 4234 (2018)
“A Radial Measurement of the Galaxy Tidal Alignment Magnitude with BOSS Data” D. Martens, C. Hirata, A. Ross, X. Fang Accepted in Monthly Notices of the Royal Astronomical Society
“Effects of [NII] and Hα Line Blending on the WFIRST Galaxy Redshift Survey” D. Martens, X. Fang, M. Troxel, J. DeRose, A. Ross, C. Hirata, R. Wechsler, Y. Wang Submitted to Monthly Notices of the Royal Astronomical Society
Fields of Study
Major Field: Physics
Table of Contents

Abstract
Acknowledgments
Vita
List of Tables
List of Figures

1. Introduction
    1.1 A Cosmic View
    1.2 Thesis Overview
    1.3 Cosmology Basics
        1.3.1 Smooth Universe
        1.3.2 Linear Perturbations
        1.3.3 Nonlinear Perturbations
    1.4 Observational Probes
        1.4.1 Cosmic Microwave Background
        1.4.2 Galaxy Clustering
        1.4.3 Weak Gravitational Lensing
        1.4.4 Type Ia Supernovae (SNe Ia)
        1.4.5 Standard Sirens
    1.5 Compact Objects and Their Dynamics
        1.5.1 White Dwarfs, Neutron Stars and Black Holes
        1.5.2 Concepts in Orbital Dynamics

2. FAST-PT I: a novel algorithm to calculate convolution integrals of scalar quantities in cosmological perturbation theory
    2.1 Introduction
    2.2 Method
        2.2.1 $P_{22}(k)$ type Convolution Integrals
        2.2.2 $P_{13}(k)$ type Convolution Integrals
        2.2.3 Regularization
    2.3 Performance
        2.3.1 1-loop Results
        2.3.2 Renormalization Group Flow
    2.4 Summary

3. FAST-PT II: an algorithm to calculate convolution integrals of general tensor quantities in cosmological perturbation theory
    3.1 Introduction
    3.2 Method
        3.2.1 Transformation To 1D Integrals
        3.2.2 Algorithm
        3.2.3 Removing Possible Divergences
    3.3 Applications
        3.3.1 Quadratic Intrinsic Alignments Model
        3.3.2 Ostriker-Vishniac Effect
        3.3.3 Kinetic polarization of the CMB
        3.3.4 Redshift Space Distortions
    3.4 Summary

4. Dynamics of Quadruple Systems Composed of Two Binaries: Stars, White Dwarfs, and Implications for Ia Supernovae
    4.1 Introduction
    4.2 Secular Theory
        4.2.1 Newtonian gravity and quadrupole order interactions
        4.2.2 Octupole order interactions
        4.2.3 First-order Post-Newtonian (1PN) corrections
        4.2.4 Tidal precession
        4.2.5 Gravitational wave dissipation
        4.2.6 Tidal dissipation
        4.2.7 Spin
        4.2.8 Non-secular effects
    4.3 Secular Evolution of Quadruple versus Triple Systems
        4.3.1 Examples
        4.3.2 Enhanced high-e fraction
        4.3.3 Growing fraction over time
        4.3.4 Orbital size dependence
        4.3.5 Mass ratio dependence
        4.3.6 Possible “safe” regions
        4.3.7 Quadruple systems of main sequence stars
    4.4 Implications for WD-WD Mergers
        4.4.1 Merger rate
        4.4.2 Understanding the results
        4.4.3 Classification of orbital shrinking
    4.5 Nonsecular effects: evection
    4.6 Discussion and Conclusion
        4.6.1 Stellar quadruples
        4.6.2 WD-WD binaries and Type Ia supernovae
        4.6.3 Future directions and outlook

5. Population of Eccentric Black Hole Systems
    5.1 Introduction
    5.2 Circular vs. Eccentric
        5.2.1 Equilibrium Distributions
        5.2.2 Eccentric System Population
    5.3 Discussion and Conclusion
        5.3.1 Detectability
        5.3.2 Complexities

6. Conclusion
    6.1 Summary of Thesis
    6.2 Future Work
        6.2.1 Efficient Tools for Computing Nonlinear Effects
        6.2.2 Quadruple Systems as Candidate Type Ia Supernova Progenitors
        6.2.3 Progenitors of Stellar-Mass Binary Black Hole Mergers

Appendices

A. FAST-PT I
    A.1 Mathematical Identities
    A.2 Γ-function identities and evaluations
    A.3 Mitigation of Edge Effects
    A.4 RG-flow Integration

B. FAST-PT II
    B.1 Mathematical Identities
        B.1.1 Spherical Harmonics and Legendre Polynomials
        B.1.2 Wigner 3j and 6j Symbols
    B.2 Derivations
        B.2.1 Derivation of Eq. (3.4)
        B.2.2 Derivation of Eq. (3.11) and (3.12)
    B.3 Proof of Feasibility of Series Expansion

C. Hierarchical Quadruple Systems
    C.1 Coefficients in Octupole Order Hamiltonian
    C.2 Code Description and Tests
        C.2.1 Convergence with step size
        C.2.2 Conservation of energy and angular momentum
        C.2.3 Kozai constant
        C.2.4 Comparison with few-body calculations
        C.2.5 Coordinate independence test
    C.3 “Precession oscillation” phase
    C.4 Short-term eccentricity changes due to evection
        C.4.1 Upper Bound
        C.4.2 True evection envelope (TEE)

Bibliography
List of Tables

1.1 The cosmological parameters and their current values. The value of $T_{\rm CMB}$ is cited from Fixsen (2009), and the remaining values are from Planck Collaboration et al. (2016b).

3.1 The coefficient of each term in the Legendre polynomial expansion of $h_E^2$ and $h_B^2$ kernels (without the factor of 2 in front of the integral Eq. 3.28). Due to symmetry, we need only keep terms with $\ell_1 \geq \ell_2$ (multiplying the value by two where relevant).

3.2 The coefficient of each term in the Legendre polynomial expansion of kernels of $B_i(k)$. $\alpha = \beta = -1$ for all the terms. Due to symmetry, we need only keep terms with $\ell_1 \geq \ell_2$ (multiplying the value by two where relevant). Empty entries are equal to the previous row.

3.3 The coefficient of each term in the Legendre polynomial expansion of kernels of $A_i^{\rm I}(k)$. Empty entries are equal to the previous row.

3.4 The coefficient of each term in the Legendre polynomial expansion of kernels of $A_i^{\rm II}(k)$.

4.1 The initial orbital elements of a hierarchical triple system consisting of three $1\,M_\odot$ stars, discussed in §4.3.1. The initial inclination between the inner and outer orbit is 60°.

4.2 The initial orbital elements of a hierarchical quadruple system, discussed in §4.3.1. The inner orbit A consists of a solar-mass star and a Jupiter-mass planet and the orbit B consists of a pair of solar-mass stars.

4.3 The initial orbital elements of a “4-star” quadruple system, discussed in §4.3.1. Both of the inner orbits consist of a pair of solar-mass stars.

4.4 The initial orbital configurations of the “4-star” hierarchical quadruple systems, discussed in §4.3.2. The orbital sizes and shapes are fixed, while their orientations are randomly sampled. Because the physics is independent of the orientation of the coordinate system, we can reduce the degrees of freedom of the system by fixing the initial orientation of one of the orbits (here the mutual orbit).

A.1 Stable RK4 runs for $k_{\min} = 10^{-3}$, $k_{\max} = 1$, and $\Delta\lambda = 0.1$.

A.2 Stable STS runs for $k_{\min} = 10^{-3}$ and $k_{\max} = 10$. Results were obtained using STS parameters: $\mu = 0.1$, $\Delta\lambda_{\rm CFL} = 0.001$, $N_s = 10$.

C.1 The initial orbital elements of the example “[WD-WD]-[Star-Star]” system for energy and angular momentum conservation tests in Appendix C.2.2.

C.2 The initial orbital elements of the triple systems for the Kozai constant test in Appendix C.2.3.

C.3 The initial orbital elements of the triple systems for comparison with the REBOUND few-body simulation, discussed in Appendix C.2.4. We test for different outer orbit eccentricities.

C.4 The initial orbital elements of the example “[Star-Planet]-[Star-Star]” system for the coordinate-independence test in Appendix C.2.5.
xiv List of Figures
Figure Page
2.1 Power spectra in the log-periodic universe. Top panel shows the win- dowed linear power spectrum biased by k−ν (we choose ν = 2), with grey lines indicating the “satellite” power spectra, i.e. the contribution− to the total power spectrum that arises due to the periodic assumption in a Fourier transform. The middle panel plots ∆2(k) = k3P (k)/(2π2), within the periodic universe. This is the quantity that sources the density variance σ2 = R d ln k∆2(k). The bottom panel plots the con- R 2 2 tribution to the displacement variance σξ = d ln k∆ (k)/k ...... 37
2.2 FAST-PT 1-loop power spectrum results versus those computed using a conventional fixed-grid method. The top panel shows FAST-PT results for P22(k)+P13(k) (the dashed line is for negative values). The bottom panel plots the ratio between FAST-PT and the conventional method. . 47
2.3 Estimate of FAST-PT execution time to number of grid points scaling. The left panel plots the average one-loop evaluation time, after initial- ization of the FAST-PT class. The right panel plots the average time required for initialization of FAST-PT class for 1500 runs. For a sample of grid points, the error is computed by taking the standard deviation of1500runs...... 48
−1 2.4 FAST-PT Renormalization group results for kmax = 5, 50 hMpc . Left panel shows Renormalization group results and SPT{ results} compared to the linear power spectrum (see legend in right panel). Right panel
shows neff = d log P/d log k for Renormalization group, SPT, and linear theory...... 52
2.5 Renormalization group results compared to standard 1-loop calcula- tions and those taken from the Coyote Universe. Left panel plots power spectra. A plateau at high-k develops due to boundary con-
ditions. Right panel shows neff (k) = d log P/d log k...... 53
xv 3.1 The convergence region of the bias indices ν1, ν2 is indicated by the shaded region...... 71
(EE,BB) 3.2 The FAST-PT result for the intrinsic alignment integrals PIA,quad (k) in Eq. (3.28) (upper panel) and the fractional difference compared to the conventional method (lower panel)...... 77
3.3 The FAST-PT result for the Ostriker-Vishniac effect integral S(k) in Eq. (3.36) (upper panel) and the fractional difference compared to the conventional method (lower panel)...... 80
3.4 The FAST-PT results for the kinectic CMB polarization integrals P (m)(k) in Eq. (3.46) (upper panels) and the fractional difference compared to the conventional method (lower panels)...... 84
3.5 The FAST-PT result for the redshift space distortion nonlinear correc- tions A(k, µn) + B(k, µn) in the TNS model, Eq. (3.48) (upper panels) and the fractional difference compared to the conventional method re- sult (lower panels)...... 93
4.1 Illustration of a “2+2” hierarchical quadruple star system. Masses m0 and m1 form “inner binary A” with separation r1, m2 and m3 form “inner binary B” with separation r2, and their centres of masses orbit each other in the“mutual”orbit with separation r. We focus on systems
where r1, r2 r...... 102 4.2 Illustration of the orbital elements. The orbital plane intersects the reference planex ˆ yˆ along the line of nodes with the direction of ascending node denoted− by Ω.ˆ h defines the argument of ascending node with respect to the reference plane, and g defines the argument of periastron in the orbital plane. The angle between the orbital angular momentum G and the z axis defines the inclination i, which is also the angle between the reference− plane and the orbital plane...... 104
4.3 The evolution of the triple system discussed in 4.3.1. The upper panel shows the eccentricities of the inner and outer§ orbits, while the lower panel shows the inclination between the inner and outer orbits. The system exhibits the regular LK oscillation. The initial orbital elements of this example system are listed in Table 4.1...... 118
xvi 4.4 The evolution of the quadruple system ([Star-Planet]-[Star-Star]) dis- cussed in 4.3.1. The upper panel shows the eccentricities of the inner and outer§ orbits, while the lower panel shows the inclinations between the two inner orbits and the outer orbit. The inner orbit B exhibits the regular LK oscillation, while orbit A evolves irregularly. The initial orbital elements of this example system are listed in Table 4.2. . . . . 120
4.5 The evolution of the “4-star” system discussed in 4.3.1. The upper panel shows the eccentricities of the inner and outer§ orbits, while the lower panel shows the inclinations between the two inner orbits and the outer orbit. Both of the inner orbits evolve irregularly, and one of them reaches very high eccentricities that its equivalent triple counterpart system will not be able to reach with the same set of initial orbital elements. The initial orbital elements of this example system are listed in Table 4.3. Note that the high eccentricity shown in the plot is significantly more than sufficient for the stars to collide...... 121
4.6 The initial mutual inclination distributions of systems that reach high-e before 10 Gyr, calculated from the 105 randomly oriented “4-star” sys- tems discussed in 4.3.2, whose initial orbital configurations are listed in Table 4.4. The§ left panel shows systems whose inner orbits A reach high-e, while the right panel shows systems whose inner orbits B reach high-e. The higher density of the colour in each panel represents the higher fraction of high-e systems, and it is normalized to the total high-e fraction of each inner orbit. Between the two green dashed lines in the left panel shows the high-e systems from the equivalent triple case, where the inner orbit B is replaced by a single star with mass
mB = m2 + m3 = 2M ...... 125
4.7 The growing cumulative fractions of high-e events from the 105 ran- domly oriented “4-star” systems and their equivalent triple systems described in 4.3.2. At quadrupole order, the high-e fraction of triples stops growing§ after the Kozai timescale ( 6Myr), but for quadruples, the fraction keeps growing...... ∼ ...... 127
4.8 The high-e fractions from the inner orbit A, inner orbit B and the total fraction vary as functions of the semi-major axis of the inner orbit
B, described in 4.3.4. a2 is evenly sampled from 8.5AU to 22.0AU, § 4 with a stepsize 0.1AU. For each sampled a2, we run 10 systems up to t = 5 Gyr. The rest of initial orbital elements are listed in Table 4.4. . 130
xvii 4.9 The high-e fractions of 105 randomly oriented “4-star” systems with mass ratios 1 (left), 3 (center), 100 (right) in the inner orbits A, dis- cussed in 4.3.5. The other initial orbital elements are listed in Table 4.4. Their§ equivalent triple cases (i.e., replacing the inner binary B
with a single 2M star) are plotted for comparison. The high-e fraction enhancement for quadruples is remarkably robust against variations in
m0/m1...... 131
4.10 The cumulative fractions of 104 randomly oriented “4-star” systems whose inner orbits A and B reach high-e, shown in blue and orange solid lines, respectively. The rest of systems are “safe” and are shown in the green line. The initial orbital configurations are listed in Table 4.4, and each system runs up to 1013 years, as discussed in 4.3.6. . . 133 § 4.11 The “safe” regions for the 104 randomly oriented “4-star” systems with parameters from Table 4.4 running up to 1013 years, as discussed in 4.3.6. All the systems that have never reached high-e are initially §coplanar, and the “safe” corners are larger for those systems whose two inner orbits are in the same direction. The density of the colour
represents the fraction of“safe”systems in that region of (cos iA, cos iB)- space, and it is normalized to the total “safe” fraction...... 134
4.12 The high-e cumulative fractions in 105 quadruple “4-star” systems ver- sus in their “equivalent triple” systems (orange solid line) and triples with solar-mass tertiary (green solid line), as discussed in 4.3.7. The dashed and dot-dashed blue lines are high-e fractions from§ inner orbit A and B, respectively, which are almost the same because they are sampled from the same distributions, while the solid blue line is the sum of them, i.e., total fraction...... 136
xviii 4.13 The WD merger cumulative fractions (upper panels) and rates (lower panels) in 105 quadruple systems (i.e., [WD-WD]-[Star-Star]) (blue solid lines) versus in their “equivalent triple” systems (orange solid lines) and triples with solar-mass tertiary (green solid lines), as dis- cussed in 4.4.1. The left panel shows results from equal-mass WDs § (both with 0.7M ), while the right panel shows results from unequal- mass WDs (0.8 + 0.6M ). The stellar masses in quadruple systems are both 1M and their “equivalent triple” systems have tertiary masses 2M . The blue dashed lines are the fractions of Channel (III) merg- ˙ ers in quadruple systems. The rates Γf f are obtained by fitting f ˜ ˜2 ˜3≡ ˜4 ˜ to polynomial f = A + Bt + Ct + Dt + Et , where t log10 t and A, B, C, D, E are fitting parameters. Note that we show the≡ early-time rates only for dynamical interest. Most of WDs form at late times depending on their progenitor masses, so we should only focus on the rateatlatetimes...... 139
4.14 The timescales versus the periastron of the inner orbit A, rp1, for a system with a = 2000AU, a1 = 10AU. All the timescales except the tidal dissipation are calculated using Eqs.(4.36-4.42), while the tidal dissipation timescale is calculated from Giersz (1986) as we use in our code. The shaded region is Channel (II) region...... 143
4.15 Classification of orbital shrinking WD mergers...... 146
4.16 An illustration of evection. The inner orbit (in blue) and the mutual or- bit (in orange) are both in the x y plane, with their angular momenta − along the +z direction. The perturber mB at the position shown in the figure exerts a tidal torque on the inner orbit, as shown by the yellow
arrows, which decreases the inner eccentricity e1. As the perturber moves around, the tidal torque will change its direction according to
the quadrant the perturber is in, and the change of e1 is shown at the four corners. During one period of the mutual orbit, the eccentricity
e1 goes up and down twice...... 148
xix 4.17 The merger fractions from 105 random [WD-WD]-[Star-Star] systems (blue solid lines) and their equivalent triple cases (orange solid lines), with evection included, as described in 4.5. Results for triples with solar-mass tertiary are shown with green§ solid lines. With evection, the merger fractions are enhanced for both quadruple systems and triples, but the fraction from quadruples remains much larger than that from triples. Mergers from Channel (I) (blue dashed lines) now dominates over Channel (III), and Channel (II) becomes negligible. The runs for this figure is equivalent to those for Fig. 4.13 except for including evection...... 151
5.1 Comparison between eccentric and circular for a fixed initial configu-
ration a0 = 100 au, rp0 = 0.02au...... 164
5.2 The fraction of mergers grows with time, approximately as t2/7.... 166
5.3 The initial semi-major axis a distributions of the mergers and non- mergers in the 107 systems ...... 167
5.4 The initial periastron rp distributions of the mergers and non-mergers in the 107 systems ...... 168
5.5 The cumulative eccentricity distribution of the 2727099 mergers in 7 the 10 systems when they reach peak frequency fp = 50 Hz. Only about 0.15% of systems have final eccentricities greater than 0.001, 0.1% greater than 0.01, and less than 0.01% greater than 0.1...... 169
5.6 Upper: The peak frequency distributions of eccentric BH binaries formed in hierarchical triple systems with orbital period P < 1, 10, 100, 1000 days, versus the distribution of circular BH binaries. We assume all the LIGO mergers are formed from this eccentric channel or the circular channel, and take the merger rate as Γ = 5 10−6yr−1 per Milky Way- × size galaxy. Dashed lines show their corresponding merger time tmerge. Lower: The ratio Rf of the distributions of eccentric systems to circu- lar systems. A significant enhancement is seen in frequency range 0.1 to 1 mHz, in which the number of systems with orbital periods of days to months is expected to be of order 102 103, as shown in the upper panel. Note that the distributions at lower− frequencies are subject to modifications due to the actual criteria for decoupling the inner orbit from the outer...... 171
xx C.1 Convergence test for 105 randomly oriented “4-star” systems discussed in 4.3.3. Only quadrupole order effect is turned on. The stepsize prefactor§ of Eq. (C.12) is chosen to be 0.1, 0.05 (default setting), 0.01, 0.005, respectively, and the high-e fractions converge very well. . 197
C.2 Convergence test for 105 randomly oriented “[WD-WD]-[Star-Star]”
systems with equal-mass (0.7M ) WDs, discussed in 4.4. Only quadrupole order effect is turned on. The stepsize prefactor of§ Eq. (C.12) is cho- sen to be 0.1, 0.05 (default setting), 0.01, 0.005, respectively, and the high-e fractions converge very well...... 198
C.3 Angular momentum conservation test for a“[WD-WD]-[Star-Star]”sys- tem with its initial orbital parameters listed in Table C.1. The plot shows the deviations of 3 components of the total angular momentum initial from their initial values, i.e., (Gtot,(x,y,z) Gtot,(x,y,z))/Gtot, up to 10Gyr. Note that the RK4 integrator exactly conserves− the total z-angular mo- mentum, but not the x or y components...... 200
C.4 Energy conservation test for a “[WD-WD]-[Star-Star]” system with its initial orbital parameters listed in Table C.1. A conservative system preserves its Hamiltonian, and this plot shows the fractional devia- tion of the perturbation Hamiltonian (only including quadrupole and octupole order terms and the GR and tidal precession terms, but ex- cluding the Kepler parts) from its initial value. The errors are small and remain bounded up to 10Gyr...... 201
C.5 The fractional deviations of Kozai constant at quadrupole order for triples with initial orbital parameters listed in Table C.2 and discussed in Appendix C.2.3. The constant is preserved at 10−8 level for all inner
binary mass ratios m1/m0...... 203
C.6 The eccentricities of the inner orbits at their local maximum. The three triple systems are described in Appendix C.2.4 and Table C.3 and their initial parameters only differ in the outer eccentricities (e = 0(left), 0.2 (middle), and 0.6 (right)). The blue lines are results from REBOUND few- body simulations, while the orange lines are produced by our secular code. The green and red lines are the estimated “upper bound” and the “true evection envelope”, calculated based on the secular results. The details of evection calculations are discussed in 4.5 and Appendix C.4. 204 §
xxi C.7 The eccentricity evolution of the [Star-Planet]-[Star-Star] quadruple system in Table C.4. In the upper panel, solid lines and dashed lines are calculated from the original and the rotated coordinate system, respectively. The lower panel shows the fractional difference between the results from the two coordinate systems. The Star-Planet binary starts to deviate earlier than the Star-Star binary because it is more chaotic, as we have discussed in 4.3.1, although the deviations are very tiny at least in the first 150Myr.§ ...... 206
C.8 An example quadruple system with a WD binary ($0.7 + 0.7\,M_\odot$) and a stellar binary ($1 + 1\,M_\odot$). We include the Newtonian secular effects up to octupole order, and the 1PN and tidal precession for both inner orbits, as well as the 2.5PN and tidal dissipation for the WD binary. The upper panel shows how the eccentricities of both inner and mutual orbits evolve with time and the lower panel shows the inclinations between the two inner orbits and the mutual orbit. At the end the eccentricity $e_1$ shows the “precession oscillation” phase while circularizing ...... 207
C.9 Zoom-in of the final stage of the system shown in Figure C.8. Here we show the evolution of eccentricities, inclinations, argument of periastron of the WD binary ($g_1$), semi-major axis of the WD binary ($a_1$), and the timescales relevant to the WD binary, respectively ...... 208
C.10 The equal-$\mathcal{H}^{(\rm simp)}$ contours of the example system in Figure C.8 as its eccentricity $e_1$ approaches the maximal eccentricity. The orange dot-dashed lines are the separatrix between the hyperbolic trajectories and the trapped elliptic trajectories ...... 211
C.11 The phase diagram of the example system in Figure C.8 between t = 3612Myr and t = 3650Myr. Before and after the first eccentricity peak, the separatrix lines move from the blue dashed lines to the orange dashed lines, and the second eccentricity peak moves the separatrix lines further out to the green dashed lines. Meanwhile, the trajectory transits from the hyperbolic to the trapped elliptic...... 212
C.12 A sketch of the system during its “precession oscillation” phase. The inner orbit (in blue) is in the $x$-$y$ plane, with its angular momentum along the $+z$ direction. The mutual orbit is in the $x$-$z$ plane and exerts an average tidal torque on the inner orbit, as shown by the yellow arrows. The torque has different directions in each quadrant of the $x$-$y$ plane, hence increasing or decreasing $e_1$ depending on which quadrant the apastron of the inner orbit is in. As the inner orbit swings around the $z$-axis for one cycle, the eccentricity goes up and down twice ...... 213
Chapter 1: Introduction
1.1 A Cosmic View
On a clear night, looking up at the sky, we are literally watching cosmic history, because the speed of light is finite. The moving artificial satellites are seen by sunlight they reflected roughly milliseconds to deciseconds ago. The Moon is seen as it was slightly over 1 second ago. Looking farther, we see that all the planets are within a few hours away by light. All the stars visible to the naked eye are in our Galaxy, except for some very rare supernovae in our satellite galaxies that might happen in our lifetime, so they are a few years to at most $10^5$ years away by light. With telescopes, we see galaxies and clusters of galaxies, distributed from millions of years away to more than 13 billion years away by light, when the Universe was less than a billion years old. Beyond that we hardly see anything. However, tuning a big antenna to around 100 GHz, one would likely receive some constant background noise from all directions, which was first discovered by Arno Penzias and Robert Wilson in 1964 at a lower frequency (4.08 GHz), and identified as the thermal radiation last scattered by baryons when the Universe was only 0.3 million years old (Penzias & Wilson, 1965). It is called the “cosmic microwave background” (CMB), and it is the earliest signal we have detected so far.
Modern physical cosmology, coeval with general relativity (GR), has made great progress in understanding the components and evolution of the Universe with laws of physics. The ΛCDM model, referred to as the standard model of Big Bang cosmology, assumes the validity of GR on cosmological scales and that the Universe is composed of normal matter (baryons) + cold dark matter (CDM) + dark energy with constant physical energy density ($\rho_\Lambda = \Lambda/8\pi G$, where $\Lambda$ is the cosmological constant). It uses 7 cosmological parameters to successfully account for current observations, including
• The expansion history of the Universe;
• The existence and properties of the CMB;
• Light element abundances (including H, D, He, Li);
• The growth of large-scale structure (LSS).

We list the parameters and their current values in Table 1.1. The current CMB mean temperature was very well measured by the COBE satellite (Smoot et al., 1992) and was later recalibrated by data from WMAP and other experiments (Fixsen, 2009). It is so precise that current cosmological analyses usually fix it as a constant.
An easy way to tell the cosmic history is to divide it into different eras according to the dominant energy component. We have been in the dark-energy-dominated era since ∼4 billion years ago, in which the Universe experiences an accelerated expansion. Before that, the Universe had been in the matter-dominated era since about 13.7 billion years ago, in which the Universe decelerated its expansion, the photons decoupled from baryons and traveled freely in space, and stars and galaxies were formed. Before that, the Universe had been dominated by radiation since it was roughly $10^{-35}$ second old, during which CDM was expected to be produced, and later nucleosynthesis occurred. Before the radiation-dominated era, a rapid expansion called inflation may have happened, which not only made the Universe very homogeneous and isotropic, but also rescaled quantum fluctuations in the very early Universe to cosmological scales, seeding the structure formation at much later times.

| Symbol | Meaning | Value |
| $T_{\rm CMB}$ | CMB mean temperature today | $2.72548 \pm 0.00057$ K |
| $t_0$ | Age of the universe | $(13.799 \pm 0.021) \times 10^{9}$ years |
| $\Omega_m h^2$ | Physical matter density parameter | $0.1188 \pm 0.0010$ |
| $\Omega_b h^2$ | Physical baryon density parameter | $0.02230 \pm 0.00014$ |
| $n_s$ | Scalar spectral index | $0.9667 \pm 0.0040$ |
| $\Delta_R^2$ | Curvature fluctuation amplitude | $2.441^{+0.088}_{-0.092} \times 10^{-9}$ |
| $\tau$ | Optical depth of reionization | $0.066 \pm 0.012$ |

Table 1.1: The cosmological parameters and their current values. The value of $T_{\rm CMB}$ is cited from Fixsen (2009), and the remaining values are from Planck Collaboration et al. (2016b).
Despite the success of the ΛCDM model, many big questions remain puzzling:
What is the nature of dark matter? What is the nature of dark energy? Is GR correct on cosmological scales? How can inflation be probed observationally? All of these questions involve precise measurements of cosmological parameters, which are the focus of much of observational cosmology today.
1.2 Thesis Overview
In this thesis, I will present some of the projects I have worked on during my PhD.
My effort to improve current and future cosmology analyses falls into two categories.
On the one hand, ongoing and upcoming LSS surveys provide the opportunity to measure the properties of the Universe to unprecedented precision. I have helped develop accurate and extremely efficient tools for analyzing the next generation of cosmology datasets. On the other hand, compact objects, responsible for some of the most violent phenomena observable on cosmological scales, play a unique role in probing the expansion history of the Universe. Compact objects include white dwarfs (WDs), neutron stars (NSs) and black holes (BHs). Binary WD mergers may produce Type Ia Supernovae (SNe Ia), serving as “standard candles” in modern cosmology, while binary NS and BH mergers emit a characteristic gravitational wave (GW) signal (when they orbit and approach each other) that may serve as “standard sirens” in future cosmology. However, the quest for a percent-level determination of the expansion rate of the local Universe calls for an accurate characterization of the intrinsic variations of SNe Ia, while the progenitors are largely uncertain. Binary BHs detected by LIGO (Abbott et al., 2009) have somewhat surprisingly large masses (tens of solar masses), posing questions about their formation channels: Are they astrophysical or primordial? Do they form by dynamical capture or by stellar evolution? I have worked on the progenitor problem of both SNe Ia and LIGO BH mergers.
In the rest of this chapter, I will provide some basics of cosmology and compact objects. In Section 1.3, I will give a very basic review of the physics of the smooth Universe and its linear perturbations. They are accurate enough to describe the CMB features we have observed. They also provide good estimates of the matter distribution on large scales. On smaller scales and at late times, density perturbations become large and nonlinear physics (arising from, e.g., the nonlinearity of gravity and astrophysical effects) comes into play. When the perturbations are still much smaller than order unity, nonlinear perturbation theory is a good tool to study the evolution of the density field, which I will also briefly discuss in Section 1.3. In Section 1.4, I will introduce several important probes for observational cosmology, among which compact objects have played crucial roles in mapping out the local expansion of the Universe and revealing the existence of dark energy. I will talk more about compact objects in Section 1.5. Since we are mostly interested in binary compact objects and their mergers, orbital dynamics will also be introduced.
In Chapter 2, I will review our first version of the FAST-PT algorithm, an extremely efficient tool for calculating convolution integrals in nonlinear perturbation theories. Upcoming cosmology surveys aim at precise measurement of cosmological parameters in order to confirm or rule out the standard cosmological model. To do this, models of the evolution of the Universe are compared with observations of millions of galaxies and other objects. To achieve the required precision, modeling nonlinear effects becomes crucial. However, perturbation theory, one of the most powerful tools for modeling nonlinear effects, is usually extremely computationally intensive, and hence cannot be incorporated into Markov chain Monte Carlo analysis. To fully utilize the upcoming data, we must be prepared with a set of tools that are both accurate and efficient in evaluating nonlinear effects, even at the stage of forecasting, and our FAST-PT algorithm is such a necessary and timely tool. The first version deals with power spectra (introduced in Section 1.3) of scalar quantities, such as matter and galaxy density fields. We will show how the computation can be sped up by 4 to 6 orders of magnitude without any sacrifice in accuracy. Soon after its release, it was incorporated into the analysis pipeline of the ongoing Dark Energy Survey (e.g., Krause et al., 2017), enabling state-of-the-art dark energy measurements.
In Chapter 3, I will present our work on extending the FAST-PT algorithm to general tensor quantities, which greatly expands its applications. Examples include calculations of galaxy intrinsic alignments (IA, a type of systematics in the weak lensing technique), CMB secondary effects, and redshift space distortions (RSD, introduced in Section 1.4). This extended FAST-PT is the current public version.¹
In Chapter 4, I will present our work on hierarchical quadruple systems as a candidate for producing SNe Ia. One of the most popular hypotheses for SN Ia progenitors is the so-called double degenerate scenario, which involves two WDs orbiting each other, gradually dissipating their orbital energy through GW emission and then merging. The major problem is that the GW dissipation rate is low. For two WDs to merge within a Hubble time, they need to have an initial orbital period of less than 0.3 day, corresponding to an initial separation of about the solar diameter, which is too small for their progenitor stars. This immediately raises a question: how to make compact WD binaries? Several ways to help make WD binaries compact include the common envelope (CE) evolution and the triple system dynamics (e.g., Thompson, 2011; Katz & Dong, 2012). The CE evolution transports WD orbital energy to the surrounding CE. As the CE expands, the binary orbit can shrink very rapidly. However, the physics of CE evolution is not well understood, and the short-delay-time merger rate is also underpredicted from this channel (e.g., Ruiter et al., 2009). The triple system scenario assumes the WD binary has a distant companion star (i.e., a tertiary star).
The torque from the tertiary will exchange angular momentum with the inner binary on a timescale much longer than both the inner and the outer orbital periods. As a result, a tertiary initially at a high inclination angle will have a good chance to excite the inner orbit to high eccentricities. At high eccentricities, the binary WDs can get very close to each other and experience significant GR and tidal effects, causing rapid precession and energy dissipation, which may trigger rapid WD mergers. The issues for this idea will be discussed in the introduction section of Chapter 4. Alternatively, we propose a quadruple system dynamics solution to the progenitor problem that will mitigate the issues presented by the “triple” scenario and provide a much enhanced merger rate that might account for a significant fraction of the observed SN Ia rate.

¹ FAST-PT public repository: https://github.com/JoeMcEwen/FAST-PT
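The merger-time constraint quoted for the double degenerate scenario can be illustrated with the standard circular-orbit GW inspiral time of Peters (1964), $t = \frac{5}{256}\,c^5 a^4 / [G^3 m_1 m_2 (m_1+m_2)]$. The sketch below is only an order-of-magnitude check, not the calculation used in Chapter 4: the WD masses of $0.7\,M_\odot$ each and the use of 13.8 Gyr as the Hubble time are illustrative assumptions, and all function names are hypothetical.

```python
import math

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.957e8    # solar radius, m
YEAR = 3.156e7     # seconds per year
DAY = 86400.0      # seconds per day


def max_separation_for_merger(m1, m2, t_merge):
    """Largest circular-orbit separation (m) that merges within t_merge (s).

    Inverts the Peters (1964) circular inspiral time
    t = (5/256) c^5 a^4 / (G^3 m1 m2 (m1 + m2)).
    """
    a4 = (256.0 / 5.0) * G**3 * m1 * m2 * (m1 + m2) * t_merge / C**5
    return a4 ** 0.25


def orbital_period(a, m1, m2):
    """Keplerian period (s) of a binary with separation a (m)."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m1 + m2)))


# Illustrative assumptions: equal-mass 0.7 M_sun WDs, 13.8 Gyr available.
m_wd = 0.7 * M_SUN
t_available = 13.8e9 * YEAR

a_max = max_separation_for_merger(m_wd, m_wd, t_available)
p_max = orbital_period(a_max, m_wd, m_wd)
print(f"a_max = {a_max / R_SUN:.1f} R_sun, P_max = {p_max / DAY:.2f} day")
```

For these assumed masses the limiting period comes out as a fraction of a day and the limiting separation as a few solar radii, the same order as the figures quoted above; the exact numbers depend on the masses and on the time allowed for the merger.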
In Chapter 5, I will present our work on the implications of assuming highly eccentric binary BHs as the channel for producing the BH mergers detected by LIGO. The BH merger rate constrained by the LIGO detections, if steady, implies the existence of a population of binary BHs with different orbital separations on their way to merging. The distribution of orbital elements of this population depends on the formation channel. Whatever the formation channel is, the binaries in the LIGO frequency band have already been well circularized, and all channels (via circular orbits or eccentric migration) may hardly show any significant difference in the remaining eccentricities, although recent studies have shown that a few percent of binary BHs formed via two-body or three-body scattering in globular clusters may retain eccentricities above 0.1 (D’Orazio & Samsing, 2018). A good way to identify the progenitors is to look at lower GW frequencies, where the systems are still long before merger. Future space-based GW interferometers like LISA (Amaro-Seoane et al., 2017), sensitive to the mHz frequency band, naturally serve this science goal. We investigated the possibility that the LIGO BH mergers are produced in hierarchical triple systems where the inner binary BHs are excited to high eccentricities by the perturbations of the tertiary companion. We found that a much greater population of highly eccentric binaries would exist at frequencies from roughly 0.1 to 1 mHz than there would be if the systems were all on circular orbits. Many of those systems would have orbital periods of days to months, emitting repeated GW pulses during the LISA mission time and allowing possible detections of them in our Galaxy with high signal-to-noise ratios.
In Chapter 6, I will conclude the thesis with a short summary and envision some
follow-up work for each research direction.
1.3 Cosmology Basics
1.3.1 Smooth Universe
We can treat the Universe as a superposition of a smooth background and per-
turbations around that background. The smooth background is a homogeneous and
isotropic Universe, which is true on large scales and is called the cosmological prin-
ciple. In GR, this means the spacetime has a metric (i.e., the FRW metric, named
after Friedmann, Robertson and Walker) given by
$$ds^2 = -dt^2 + a^2(t)\left[d\chi^2 + f(\chi)\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right], \tag{1.1}$$
where $a(t)$ is the scale factor, and $f(\chi)$ is a function dependent on the spatial curvature $K$ defined by
$$f(\chi) = \begin{cases} K^{-1}\sin^2(\sqrt{K}\chi) & (K > 0) \\ \chi^2 & (K = 0) \\ (-K)^{-1}\sinh^2(\sqrt{-K}\chi) & (K < 0) \end{cases} \tag{1.2}$$
where $K = 1, 0, -1$ represent closed, flat, and open Universes, respectively. Since the Universe has been flat for most of the cosmic history, we will only consider the $K = 0$ case from now on. The metric reduces to
$$ds^2 = -dt^2 + a^2(t)\left[dx^2 + dy^2 + dz^2\right]. \tag{1.3}$$
The scale factor $a(t)$ captures the global spatial expansion of the Universe, and for
today t = t0, a(t0) = 1. The coordinate system that expands with the Universe is
the comoving coordinate. The physical coordinate is the comoving one scaled by the
scale factor a.
In order to use GR to obtain the dynamics, we need to solve for the curvature
of the spacetime manifold. First, we derive the Christoffel symbols from the metric,
which is straightforward. After that, we can obtain the non-zero components of the
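Ricci tensor and the Ricci scalar,
$$R_{00} = -3\frac{\ddot a}{a}, \qquad R_{ii} = 2\dot a^2 + a\ddot a, \qquad \mathcal{R} = 6\left[\frac{\ddot a}{a} + \left(\frac{\dot a}{a}\right)^2\right], \tag{1.4}$$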
where the index i runs over 1, 2, 3.
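As a sketch of the intermediate step described as straightforward, the nonvanishing Christoffel symbols of the flat ($K = 0$) metric $ds^2 = -dt^2 + a^2(t)\,\delta_{ij}\,dx^i dx^j$ take the standard form below (with repeated spatial indices summed), and contracting them reproduces the $R_{00}$ component quoted in (1.4):

$$\Gamma^0_{ij} = a\dot a\,\delta_{ij}, \qquad \Gamma^i_{0j} = \Gamma^i_{j0} = \frac{\dot a}{a}\,\delta^i_j,$$

$$R_{00} = -\partial_t \Gamma^i_{0i} - \Gamma^i_{0j}\Gamma^j_{0i} = -3\frac{d}{dt}\left(\frac{\dot a}{a}\right) - 3\left(\frac{\dot a}{a}\right)^2 = -3\frac{\ddot a}{a}.$$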
Having obtained the geometry part, we need the energy-momentum tensor to complete the Einstein equations. The background Universe is assumed to contain homogeneously and isotropically distributed energy density. For a perfect isotropic fluid, we have
$$T^\mu{}_\nu = \begin{pmatrix} -\rho & 0 & 0 & 0 \\ 0 & P & 0 & 0 \\ 0 & 0 & P & 0 \\ 0 & 0 & 0 & P \end{pmatrix}, \tag{1.5}$$
where $\rho$ is the density and $P$ is the pressure.
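Now substituting the geometry part and the energy-momentum tensor into the Einstein equations
$$R_{\mu\nu} - \frac{1}{2} g_{\mu\nu}\mathcal{R} = 8\pi G\, T_{\mu\nu}, \tag{1.6}$$
we get the Friedmann equations
$$\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\rho, \qquad \frac{\ddot a}{a} = -\frac{4\pi G}{3}(\rho + 3P). \tag{1.7}$$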
The first equation comes from the 00 component of the Einstein equations.
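The substitution can be written out explicitly as a short check. Using $g_{00} = -1$, $g_{ii} = a^2$, and lowering the index of the perfect-fluid tensor so that $T_{00} = \rho$ and $T_{ii} = a^2 P$, the 00 and $ii$ components of the Einstein equations give

$$R_{00} - \tfrac{1}{2} g_{00}\mathcal{R} = -3\frac{\ddot a}{a} + 3\left[\frac{\ddot a}{a} + \left(\frac{\dot a}{a}\right)^2\right] = 3\left(\frac{\dot a}{a}\right)^2 = 8\pi G\rho,$$

$$R_{ii} - \tfrac{1}{2} g_{ii}\mathcal{R} = 2\dot a^2 + a\ddot a - 3a^2\left[\frac{\ddot a}{a} + \left(\frac{\dot a}{a}\right)^2\right] = -2a\ddot a - \dot a^2 = 8\pi G\, a^2 P.$$

Eliminating $\dot a^2$ from the second relation with the first yields $\ddot a/a = -\frac{4\pi G}{3}(\rho + 3P)$, the second Friedmann equation.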
The evolution of the energy-momentum tensor is governed by the conservation law $T^\mu{}_{\nu;\mu} = 0$, and the time component is the only nontrivial one in the FRW universe due to the symmetry, which reduces to
$$\frac{\partial\rho}{\partial t} + 3\frac{\dot a}{a}(\rho + P) = 0. \tag{1.8}$$
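As a worked step, the continuity equation can be integrated in closed form whenever the pressure is a constant multiple of the density. Writing $P = w\rho$ with constant $w$ (a label introduced here for illustration), one finds

$$\frac{d\rho}{\rho} = -3(1 + w)\frac{da}{a} \quad\Longrightarrow\quad \rho \propto a^{-3(1+w)},$$

so that $w = 0$ gives $\rho \propto a^{-3}$, $w = 1/3$ gives $\rho \propto a^{-4}$, and $w = -1$ gives a constant energy density.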
The only missing piece for fully solving the dynamics is the equation of state. For CDM, the pressure vanishes, and we have matter energy density $\rho_m \propto a^{-3}$. For relativistic species like photons, $P = \rho/3$, and we have radiation energy density $\rho_r \propto a^{-4}$. In the ΛCDM model, dark energy has constant energy density $\rho_\Lambda$, and the equation of state is $w_{\rm de} \equiv P/\rho = -1$.

Having established the theoretical basis of the smooth Universe, we now introduce some very useful quantities that are closely related to measurements. The three most frequently used quantities in cosmology are the redshift $z$, the Hubble parameter $H$, and the comoving distance $\chi$. The redshift $z$ comes from measuring the spectra of distant galaxies. In an expanding Universe, all distant galaxies recede from us with velocities that are given by a function of their distances (in the homogeneous case). The recession shifts their spectral lines towards longer wavelengths, i.e., the redder direction, due to the Doppler effect, and the redshift is defined as the fractional difference between the observed wavelength and the emitted wavelength (in the rest frame),
$$z \equiv \frac{\lambda_{\rm obs}}{\lambda_{\rm emit}} - 1. \tag{1.9}$$
Since the photon energy also decays with the spatial expansion, we can relate the redshift to the scale factor as $1 + z = 1/a$. Today corresponds to $z = 0$. The CMB corresponds to $z \simeq 1100$, and reionization occurs at $z \simeq 8$. At low redshifts, galaxies’ receding velocities are roughly proportional to their distances, which was discovered by Edwin Hubble in 1929 (Hubble, 1929), and the linear coefficient is called the Hubble constant $H_0$ (equal to $H$ at $z = 0$). $H$, as a function of the redshift, encodes the expansion rate of the Universe and is defined by $H \equiv \dot a/a$. Using the Friedmann equations, we can express the Hubble parameter as
$$\frac{H(z)}{H_0} = \sqrt{\Omega_\Lambda + \Omega_m(1 + z)^3 + \Omega_r(1 + z)^4}, \tag{1.10}$$
where $\Omega_x \equiv \rho_x/\rho_{\rm cr}$ is the dimensionless density parameter of the component $x$, and $\rho_{\rm cr} \equiv 3H_0^2/(8\pi G)$ is the critical energy density, equal to the total energy density of the Universe today for the flat geometry. For $H_0 = 70$ km/s/Mpc, the value is $\rho_{\rm cr} = 0.92 \times 10^{-29}$ g/cm$^3$. The comoving distance of an object at redshift $z = 1/a - 1$ is simply given by
$$\chi(z) = \int_{t(a)}^{t_0} \frac{c\,dt'}{a(t')} = c\int_0^z \frac{dz'}{H(z')}. \tag{1.11}$$
The other two commonly used distances are the angular diameter distance dA, defined
by the linear scaling between the angular size and the distance, and the luminosity
distance dL, defined by the inverse-square law of the flux. In a flat Universe, they are
related to the comoving distance simply by
$$d_A = a\chi, \qquad d_L = \chi/a. \tag{1.12}$$
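The background quantities defined in this subsection can be evaluated numerically with a few lines of code. The following is a minimal sketch, assuming a flat Universe so that $\Omega_\Lambda = 1 - \Omega_m - \Omega_r$; the function names, the default parameter values, and the choice of Simpson's rule are illustrative assumptions rather than the tools used later in this thesis.

```python
import math

C_KM_S = 299792.458   # speed of light, km/s
MPC_CM = 3.0857e24    # centimetres per megaparsec
G_CGS = 6.674e-8      # Newton's constant, cm^3 g^-1 s^-2


def critical_density(h0):
    """rho_cr = 3 H0^2 / (8 pi G) in g/cm^3, with H0 given in km/s/Mpc."""
    h0_per_s = h0 * 1.0e5 / MPC_CM
    return 3.0 * h0_per_s**2 / (8.0 * math.pi * G_CGS)


def hubble(z, h0, omega_m, omega_r=0.0):
    """H(z) in km/s/Mpc for a flat Universe (Omega_Lambda fixed by flatness)."""
    omega_lambda = 1.0 - omega_m - omega_r
    return h0 * math.sqrt(
        omega_lambda + omega_m * (1.0 + z) ** 3 + omega_r * (1.0 + z) ** 4
    )


def comoving_distance(z, h0, omega_m, omega_r=0.0, n=1000):
    """chi(z) = c * int_0^z dz'/H(z') in Mpc, via Simpson's rule (n even)."""
    if z == 0.0:
        return 0.0
    step = z / n
    total = 1.0 / hubble(0.0, h0, omega_m, omega_r)
    total += 1.0 / hubble(z, h0, omega_m, omega_r)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble(i * step, h0, omega_m, omega_r)
    return C_KM_S * total * step / 3.0


def angular_diameter_distance(z, h0, omega_m, omega_r=0.0):
    """d_A = a * chi with a = 1/(1+z), flat Universe."""
    return comoving_distance(z, h0, omega_m, omega_r) / (1.0 + z)


def luminosity_distance(z, h0, omega_m, omega_r=0.0):
    """d_L = chi / a with a = 1/(1+z), flat Universe."""
    return comoving_distance(z, h0, omega_m, omega_r) * (1.0 + z)


if __name__ == "__main__":
    print("rho_cr [g/cm^3]:", critical_density(70.0))
    print("chi(z=1) [Mpc]:", comoving_distance(1.0, 70.0, 0.3))
```

A convenient sanity check is the matter-only case $\Omega_m = 1$, for which the integral has the closed form $\chi(z) = (2c/H_0)\left[1 - (1+z)^{-1/2}\right]$.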
1.3.2 Linear Perturbations
Defining the fractional matter density perturbation as $\delta \equiv \rho_m/\bar\rho_m - 1$, we can write the evolution of the linear density perturbation as
$$\frac{d^2\delta}{dt^2} + 2H\frac{d\delta}{dt} - 4\pi G\bar\rho_m\delta = 0, \tag{1.13}$$
where the average matter density $\bar\rho_m$ is the background matter density $\rho_{\rm cr}\Omega_m(1 + z)^3$, and $\rho_{\rm cr}$ is determined by the Friedmann equation.

In the Einstein-de Sitter (EdS) universe where $\Omega_m(z) = 1$, which is a good approximation for the matter-dominated era, the equation for linear perturbations reduces to
$$\frac{d^2\delta}{da^2} + \frac{3}{2a}\frac{d\delta}{da} - \frac{3\delta}{2a^2} = 0. \tag{1.14}$$
The solutions contain a growing mode $\delta \propto a$ and a decaying mode $\delta \propto a^{-3/2}$. At low redshifts, the only remaining mode should be the growing mode. When there is dark energy, the evolution of the Hubble parameter $H$ is modified (larger than it would be in the EdS universe), and the structure growth rate will be suppressed. As a result, the growth factor $\delta/a$ becomes a function of the scale factor in the ΛCDM universe.
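The suppression of growth by dark energy can be seen by integrating the linear perturbation equation numerically. The sketch below assumes a flat Universe containing only matter and a cosmological constant, and rewrites the equation in terms of $x = \ln a$ as $\delta'' + (2 + d\ln H/dx)\,\delta' - \frac{3}{2}\Omega_m(a)\,\delta = 0$ with $d\ln H/dx = -\frac{3}{2}\Omega_m(a)$; this change of variable, the fourth-order Runge-Kutta integrator, and the function name are choices made here for illustration.

```python
import math


def growth_ratio(omega_m, a_start=1.0e-3, a_end=1.0, n_steps=2000):
    """Return delta(a_end)/a_end for flat LCDM (matter + Lambda only).

    The solution is started on the growing mode delta = a at a_start,
    so the EdS case (omega_m = 1) should return 1.
    """
    omega_lambda = 1.0 - omega_m

    def omega_m_of_x(x):
        # Time-dependent matter fraction Omega_m(a), with a = exp(x).
        matter = omega_m * math.exp(-3.0 * x)
        return matter / (matter + omega_lambda)

    def accel(x, d, v):
        # d^2 delta / dx^2, using d ln H / dx = -1.5 * Omega_m(a).
        om = omega_m_of_x(x)
        return -(2.0 - 1.5 * om) * v + 1.5 * om * d

    x = math.log(a_start)
    h = (math.log(a_end) - x) / n_steps
    d, v = a_start, a_start  # delta = a and d(delta)/d(ln a) = a

    for _ in range(n_steps):
        # Classical RK4 step for the system (delta, d delta / d ln a).
        k1d, k1v = v, accel(x, d, v)
        k2d = v + 0.5 * h * k1v
        k2v = accel(x + 0.5 * h, d + 0.5 * h * k1d, v + 0.5 * h * k1v)
        k3d = v + 0.5 * h * k2v
        k3v = accel(x + 0.5 * h, d + 0.5 * h * k2d, v + 0.5 * h * k2v)
        k4d = v + h * k3v
        k4v = accel(x + h, d + h * k3d, v + h * k3v)
        d += h * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        x += h

    return d / a_end


if __name__ == "__main__":
    print("EdS  delta/a today:", growth_ratio(1.0))
    print("LCDM delta/a today:", growth_ratio(0.3))
```

In the EdS case the ratio $\delta/a$ stays at unity, while for an illustrative $\Omega_m = 0.3$ it falls below unity by today, which is the suppression described above.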
In cosmology, quantities such as the matter density, the CMB temperature, and the galaxy number density are all random fields with mean values and random fluctuations. We can only calculate the statistics of those perturbations and compare them with the observations. The simplest statistics one can compute for the perturbations are the 2-point correlation functions and their Fourier transforms, the power spectra.

The 2-point correlation function is defined as the ensemble average of the product of the random field at two positions. Taking the density perturbation as an example, we have
$$\xi(r) \equiv \langle\delta(\mathbf{x})\delta(\mathbf{x} + \mathbf{r})\rangle, \tag{1.15}$$
where $\mathbf{r}$ is the separation between two positions. The 2-point correlation only depends on the magnitude of $\mathbf{r}$ because of the homogeneity and the isotropy in the statistical sense. In Fourier space, we define the density modes and the power spectrum $P(k)$ as
$$\delta(\mathbf{k}) = \int d^3x\, \delta(\mathbf{x})e^{-i\mathbf{k}\cdot\mathbf{x}}, \qquad P(k) = \int d^3r\, \xi(r)e^{-i\mathbf{k}\cdot\mathbf{r}}. \tag{1.16}$$
Combining them with the definition of the correlation function, we obtain
$$\langle\delta(\mathbf{k})\delta^*(\mathbf{k}')\rangle = (2\pi)^3 P(k)\,\delta_D^3(\mathbf{k} - \mathbf{k}'), \tag{1.17}$$
where $\delta_D^3$ is the 3-dimensional Dirac delta function. CMB quantities are usually 2-dimensional functions on the sphere, and we usually use angular correlation functions and their spherical harmonic transforms, the $C_l$’s, to quantify their statistics.
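The step from the definitions to the delta-function relation between Fourier modes can be spelled out as follows. Since $\delta(\mathbf{x})$ is real, $\delta^*(\mathbf{k}') = \int d^3x'\, \delta(\mathbf{x}')e^{i\mathbf{k}'\cdot\mathbf{x}'}$; substituting $\mathbf{x} = \mathbf{x}' + \mathbf{r}$ and using the fact that $\xi$ depends only on $|\mathbf{r}|$ gives

$$\langle\delta(\mathbf{k})\delta^*(\mathbf{k}')\rangle = \int d^3x' \int d^3r\, \xi(r)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, e^{-i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{x}'} = (2\pi)^3\delta_D^3(\mathbf{k} - \mathbf{k}') \int d^3r\, \xi(r)\, e^{-i\mathbf{k}\cdot\mathbf{r}},$$

and the remaining integral is $P(k)$ by definition.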
1.3.3 Nonlinear Perturbations
There are many different ways of doing perturbation theory, but in this chapter
I will stay with the Eulerian perturbation theory (see Bernardeau et al., 2002, for a
detailed review). At the linear level, the uniform fluid equations contain the continuity equation involving the density and velocity divergence, and the Euler equation involving the velocity and the stress tensor. For CDM, we can assume zero stress and zero initial vorticity, which simplifies the problem to solving coupled equations of the overdensity $\delta$ and the velocity divergence $\theta \equiv \nabla\cdot\mathbf{v}$, i.e.,
$$\frac{\partial\delta}{\partial t} + \frac{1}{a}\theta = 0, \qquad \frac{\partial\theta}{\partial t} + H\theta + \frac{3}{2}\Omega_m(a)H^2 a\,\delta = 0. \tag{1.18}$$
Combining these two equations leads to the linear perturbation equation of motion shown in Section 1.3.2.
Going beyond the linear approximation, we expand the density and velocity divergence fields about the initial fields, i.e., the power-law ansatz,
$$\delta(\mathbf{x}, t) = \sum_{n=1}^{\infty}\delta^{(n)}(\mathbf{x}, t), \qquad \theta(\mathbf{x}, t) = \sum_{n=1}^{\infty}\theta^{(n)}(\mathbf{x}, t), \tag{1.19}$$
where $\delta^{(n)}$ and $\theta^{(n)}$ contain the $n$-th power of the initial density field and the initial velocity divergence field.
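Up to the leading nonlinear order, the fluid equations in Fourier space read
$$a\frac{\partial\delta}{\partial t} + \theta = -\int\frac{d^3k_1}{(2\pi)^3}\int\frac{d^3k_2}{(2\pi)^3}\,\delta_D^3(\mathbf{k} - \mathbf{k}_1 - \mathbf{k}_2)\,\alpha(\mathbf{k}_1, \mathbf{k}_2)\,\theta(\mathbf{k}_1)\delta(\mathbf{k}_2), \tag{1.20}$$
$$a\frac{\partial\theta}{\partial t} + aH\theta + \frac{3}{2}\Omega_m(a)a^2H^2\delta = -\int\frac{d^3k_1}{(2\pi)^3}\int\frac{d^3k_2}{(2\pi)^3}\,\delta_D^3(\mathbf{k} - \mathbf{k}_1 - \mathbf{k}_2)\,\beta(\mathbf{k}_1, \mathbf{k}_2)\,\theta(\mathbf{k}_1)\theta(\mathbf{k}_2), \tag{1.21}$$
where the dimensionless kernel functions $\alpha(\mathbf{k}_1, \mathbf{k}_2)$, $\beta(\mathbf{k}_1, \mathbf{k}_2)$ are defined as
$$\alpha(\mathbf{k}_1, \mathbf{k}_2) = \frac{\mathbf{k}_1\cdot\mathbf{k}_{12}}{k_1^2}, \qquad \beta(\mathbf{k}_1, \mathbf{k}_2) = \frac{k_{12}^2\,\mathbf{k}_1\cdot\mathbf{k}_2}{2k_1^2k_2^2}, \tag{1.22}$$
with $\mathbf{k}_{12} \equiv \mathbf{k}_1 + \mathbf{k}_2$.

In the EdS universe, neglecting the decaying mode, the Fourier solutions can be written as convolution integrals
$$\delta^{(n)}(\mathbf{k}) = \prod_{m=1}^{n}\left\{\int\frac{d^3k_m}{(2\pi)^3}\,\delta^{(1)}(\mathbf{k}_m)\right\} F_n(\mathbf{k}_1, \cdots, \mathbf{k}_n)\,\delta_D^3\!\left(\mathbf{k} - \sum_{m=1}^{n}\mathbf{k}_m\right), \tag{1.23}$$
$$\theta^{(n)}(\mathbf{k}) = \prod_{m=1}^{n}\left\{\int\frac{d^3k_m}{(2\pi)^3}\,\delta^{(1)}(\mathbf{k}_m)\right\} G_n(\mathbf{k}_1, \cdots, \mathbf{k}_n)\,\delta_D^3\!\left(\mathbf{k} - \sum_{m=1}^{n}\mathbf{k}_m\right), \tag{1.24}$$
where the convolution kernels are given by recurrence relations: $F_1 = G_1 = 1$, and
$$F_n(\mathbf{k}_1, \cdots, \mathbf{k}_n) = \sum_{m=1}^{n-1}\frac{G_m(\mathbf{k}_1, \cdots, \mathbf{k}_m)}{(2n + 3)(n - 1)}\left[(2n + 1)\,\alpha(\mathbf{k}_{1\cdots m}, \mathbf{k}_{m+1\cdots n})\,F_{n-m}(\mathbf{k}_{m+1}, \cdots, \mathbf{k}_n) + 2\beta(\mathbf{k}_{1\cdots m}, \mathbf{k}_{m+1\cdots n})\,G_{n-m}(\mathbf{k}_{m+1}, \cdots, \mathbf{k}_n)\right], \tag{1.25}$$
$$G_n(\mathbf{k}_1, \cdots, \mathbf{k}_n) = \sum_{m=1}^{n-1}\frac{G_m(\mathbf{k}_1, \cdots, \mathbf{k}_m)}{(2n + 3)(n - 1)}\left[3\alpha(\mathbf{k}_{1\cdots m}, \mathbf{k}_{m+1\cdots n})\,F_{n-m}(\mathbf{k}_{m+1}, \cdots, \mathbf{k}_n) + 2n\,\beta(\mathbf{k}_{1\cdots m}, \mathbf{k}_{m+1\cdots n})\,G_{n-m}(\mathbf{k}_{m+1}, \cdots, \mathbf{k}_n)\right], \tag{1.26}$$
where $\mathbf{k}_{1\cdots m} \equiv \mathbf{k}_1 + \cdots + \mathbf{k}_m$ and $\mathbf{k}_{m+1\cdots n} \equiv \mathbf{k}_{m+1} + \cdots + \mathbf{k}_n$.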
Note that all the $F_n$ and $G_n$ kernels are dimensionless (hence scale-invariant). It is often convenient to use the symmetrized kernels to take advantage of possible symmetries in mode-couplings. The symmetrized kernels are defined as the average over all possible permutations of their variables. As we will encounter in Chapters 2 and 3, the symmetrized $F_2$ and $G_2$ are given by
$$F_2(\mathbf{k}_1, \mathbf{k}_2) = \frac{5}{7} + \frac{\mu}{2}\left(\frac{k_2}{k_1} + \frac{k_1}{k_2}\right) + \frac{2}{7}\mu^2, \qquad G_2(\mathbf{k}_1, \mathbf{k}_2) = \frac{3}{7} + \frac{\mu}{2}\left(\frac{k_2}{k_1} + \frac{k_1}{k_2}\right) + \frac{4}{7}\mu^2, \tag{1.27}$$
where $\mu \equiv \hat{\mathbf{k}}_1\cdot\hat{\mathbf{k}}_2$ is the cosine of the angle between $\mathbf{k}_1$ and $\mathbf{k}_2$.

The leading-order nonlinear corrections to the matter power spectrum contain the $P_{22}$ term, a coupling between two $\delta^{(2)}$ modes, and the $P_{13}$ term, a coupling between the linear mode $\delta^{(1)}$ and a $\delta^{(3)}$ mode. We will focus on them in Chapter 2.
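The closed forms of the symmetrized second-order kernels can be checked against the recurrence relations directly. The sketch below evaluates the $n = 2$ term of the recursion with $F_1 = G_1 = 1$, symmetrizes over the two arguments, and compares with the closed-form expressions; the helper names are hypothetical and the wavevectors are arbitrary test values.

```python
import math


def dot(u, v):
    """Euclidean dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(u, v))


def add(u, v):
    """Component-wise sum of two 3-vectors."""
    return tuple(x + y for x, y in zip(u, v))


def alpha(k1, k2):
    """alpha(k1, k2) = (k12 . k1) / k1^2 with k12 = k1 + k2."""
    return dot(add(k1, k2), k1) / dot(k1, k1)


def beta(k1, k2):
    """beta(k1, k2) = k12^2 (k1 . k2) / (2 k1^2 k2^2)."""
    k12 = add(k1, k2)
    return dot(k12, k12) * dot(k1, k2) / (2.0 * dot(k1, k1) * dot(k2, k2))


def f2_unsymmetrized(k1, k2):
    """n = 2, m = 1 term of the F_n recursion with F_1 = G_1 = 1."""
    return (5.0 * alpha(k1, k2) + 2.0 * beta(k1, k2)) / 7.0


def g2_unsymmetrized(k1, k2):
    """n = 2, m = 1 term of the G_n recursion with F_1 = G_1 = 1."""
    return (3.0 * alpha(k1, k2) + 4.0 * beta(k1, k2)) / 7.0


def symmetrize(kernel, k1, k2):
    """Average a two-argument kernel over both orderings of its arguments."""
    return 0.5 * (kernel(k1, k2) + kernel(k2, k1))


def closed_form(k1, k2, constant, mu2_coeff):
    """constant + (mu/2)(k1/k2 + k2/k1) + mu2_coeff * mu^2."""
    n1, n2 = math.sqrt(dot(k1, k1)), math.sqrt(dot(k2, k2))
    mu = dot(k1, k2) / (n1 * n2)
    return constant + 0.5 * mu * (n1 / n2 + n2 / n1) + mu2_coeff * mu**2


if __name__ == "__main__":
    ka, kb = (1.0, 2.0, 3.0), (-0.5, 0.4, 2.0)
    print(symmetrize(f2_unsymmetrized, ka, kb), closed_form(ka, kb, 5.0 / 7.0, 2.0 / 7.0))
    print(symmetrize(g2_unsymmetrized, ka, kb), closed_form(ka, kb, 3.0 / 7.0, 4.0 / 7.0))
```

The two columns printed for each kernel should agree to rounding error for any pair of nonzero wavevectors.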
1.4 Observational Probes
In this section, I will give a basic review of several important observational probes, including the cosmic microwave background (CMB), the galaxy clustering, the weak gravitational lensing, the Type Ia supernovae, and the standard sirens. However, I will omit many other cosmological probes that have played important roles in measuring the cosmological parameters, as they are less relevant to the presented part of my
PhD research (see Weinberg et al., 2013, for a comprehensive review).
1.4.1 Cosmic Microwave Background
Travelling from the recombination epoch, when the photons were last scattered by electrons and decoupled from the baryons, the CMB has been one of the most important observational pillars that the Big Bang theory rests on, as well as one of the most informative sources for studying the universe since its discovery in the mid-1960s. Before the last scattering, the collision rate between electrons and photons was much higher than the expansion rate of the universe so that they were in equilibrium, which ensures that the CMB has a blackbody spectrum. Due to the cooling by the Hubble expansion since redshift $z \simeq 1100$, the current temperature of the CMB has dropped to about 2.725 K (for a review up to 2009, see Fixsen, 2009). The CMB temperature appeared to be isotropic until the data from RELIKT-1 (on board the Prognoz 9 satellite) and the COBE satellite unveiled the tiny anisotropies at the order of $10^{-5}$ K in 1992 (Strukov et al., 1992; Smoot et al., 1992). In 2002, the polarization of the CMB was first detected by the Degree Angular Scale Interferometer (DASI, Kovac et al., 2002), and two years later the E-mode polarization spectrum was obtained by the Cosmic Background Imager (CBI, Readhead et al., 2004).
The CMB temperature power spectrum (TT) itself contains lots of cosmological information. The overall normalization and tilt inform us of the amplitude of the primordial density fluctuations $\Delta_\zeta^2$ and the spectral index $n_s$. The positions and relative heights of the acoustic peaks help determine the physical baryon density $\Omega_b h^2$ as well as the physical matter density $\Omega_m h^2$.
The quadrupole moment of the CMB anisotropies can generate linear polarization through Thomson scattering. There are two types of linear polarization patterns, the E-mode and the B-mode, with different parity symmetries. They can be introduced from the Stokes parameters $Q$ and $U$, which describe the polarization components along an orthogonal basis $x$-$y$ and a $45^\circ$ tilted basis $a$-$b$. The polarization then has magnitude $P = \sqrt{Q^2 + U^2}$ and forms an angle $\psi = \frac{1}{2}\arctan(U/Q)$ with the $x$-axis in the plane perpendicular to the line of sight. The parameters are functions of the direction $\hat{n}$ over the sky, and they transform under rotation by an angle $\phi$ as
$$(Q \pm iU)'(\hat{n}) = e^{\mp 2i\phi}(Q \pm iU)(\hat{n}). \tag{1.28}$$
It is convenient to expand those fields in their corresponding spin-weighted basis as
$$(Q + iU)(\hat{n}) = \sum_{lm} a^{+2}_{lm} Y^{+2}_{lm}(\hat{n}), \qquad (Q - iU)(\hat{n}) = \sum_{lm} a^{-2}_{lm} Y^{-2}_{lm}(\hat{n}). \tag{1.29}$$
Defining the linear combinations of the coefficients as
$$a^E_{lm} = -(a^{+2}_{lm} + a^{-2}_{lm})/2, \qquad a^B_{lm} = i(a^{+2}_{lm} - a^{-2}_{lm})/2, \tag{1.30}$$
we are prepared to introduce the E-mode and B-mode,
$$E(\hat{n}) = \sum_{l,m} a^E_{lm} Y_{lm}(\hat{n}), \qquad B(\hat{n}) = \sum_{l,m} a^B_{lm} Y_{lm}(\hat{n}). \tag{1.31}$$
These two quantities are invariant under rotations about the line of sight. However, under the reflection about an arbitrary plane perpendicular to the sky, $E$ remains unchanged, while $B$ flips sign.

Several 2-point angular power spectra are defined as
$$C^{TT}_l = \frac{1}{2l + 1}\sum_m a^{T*}_{lm} a^T_{lm}, \qquad C^{EE}_l = \frac{1}{2l + 1}\sum_m a^{E*}_{lm} a^E_{lm},$$
$$C^{BB}_l = \frac{1}{2l + 1}\sum_m a^{B*}_{lm} a^B_{lm}, \qquad C^{TE}_l = \frac{1}{2l + 1}\sum_m a^{T*}_{lm} a^E_{lm}, \tag{1.32}$$
where $T$ is the CMB temperature fluctuation field and $a^T_{lm}$ are the spherical harmonic coefficients defined in a similar manner to $a^E_{lm}$ and $a^B_{lm}$. TE is the cross correlation between temperature anisotropies and the E-mode. The B-mode polarization cannot be generated by Thomson scattering with only scalar perturbations, but some very early Universe physics, including inflation, could produce tensor perturbations that imprint on the CMB B-mode polarization. The search for the primordial B-mode has been the focus of a large portion of the CMB community. However, there are many late-time effects that can overwhelm the signal, including weak gravitational lensing by intervening LSS, scattering by hot free electrons in galaxy clusters or during reionization (I will show examples in Chapter 3), and various types of polarized foreground emission from our own Galaxy. Thus, careful modeling and cleaning of those contaminations, often involving high-order statistics and nonlinear mode couplings, has become crucial for studying the very early Universe.
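The angular power spectra defined above are simple averages over $m$ at fixed $l$, which the following sketch makes concrete. It assumes the harmonic coefficients of a real field on the sphere, so that $a_{l,-m} = (-1)^m a^*_{lm}$ and only the $m \ge 0$ coefficients need to be supplied; the function names and the toy coefficient values are hypothetical and carry no physical meaning.

```python
def fill_negative_m(alm_nonneg, l):
    """Build a_{lm} for m = -l..l from the m = 0..l values of a real field.

    Uses the reality condition a_{l,-m} = (-1)^m conj(a_{lm}), which is an
    assumption about the field, not something measured here.
    """
    alm = {m: complex(alm_nonneg[m]) for m in range(l + 1)}
    for m in range(1, l + 1):
        alm[-m] = (-1) ** m * alm[m].conjugate()
    return alm


def angular_power(alm_x, alm_y, l):
    """C_l^{XY} = (1 / (2l + 1)) * sum_m conj(a^X_{lm}) a^Y_{lm}."""
    total = sum(alm_x[m].conjugate() * alm_y[m] for m in range(-l, l + 1))
    # For real fields the imaginary parts of the +m and -m terms cancel.
    return total.real / (2 * l + 1)


if __name__ == "__main__":
    ell = 1
    a_t = fill_negative_m({0: 2.0, 1: 1.0 + 1.0j}, ell)  # toy temperature modes
    a_e = fill_negative_m({0: 1.0, 1: 1.0j}, ell)        # toy E-mode modes
    print("C_TT:", angular_power(a_t, a_t, ell))
    print("C_EE:", angular_power(a_e, a_e, ell))
    print("C_TE:", angular_power(a_t, a_e, ell))
```

With a single sky only $2l + 1$ modes contribute to each $C_l$, which is why estimates at low $l$ carry a large sampling uncertainty.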
1.4.2 Galaxy Clustering
At late times, one of the most important observables is the distribution of galaxies, which serve as tracers of the matter distribution (see Desjacques et al., 2018, for a detailed review). The higher the number density of galaxies is, the higher the matter density is very likely to be. The simplest way to relate them is to assume galaxies are formed in the overdense regions locally and deterministically, so that we can write the galaxy density fluctuation field as a series expansion of the matter density fluctuation field. At linear order, the relation is simply described by a linear bias parameter $b_g$, such that
$$\delta_g = b_g\delta_m, \tag{1.33}$$
where $\delta_g$ is the galaxy number density contrast and $\delta_m$ is the matter density contrast. The galaxy power spectrum is then linearly related to the matter power spectrum by
$$P_{gg}(k) = b_g^2 P_{mm}(k). \tag{1.34}$$
Measurements of the galaxy distribution are done by galaxy redshift surveys. Several features in the galaxy 2-point correlation function (or power spectrum) can be used for measuring the cosmological parameters. First, the broadband shape of the galaxy power spectrum helps constrain the shape parameter $\Omega_m h$ through the rollover scale $k_{\rm eq}$. This result, combined with the CMB constraint on $\Omega_m h^2$, breaks the degeneracy between $\Omega_m$ and $H_0$. Second, the baryonic acoustic oscillations (BAO) produce a peak at a comoving separation of around 150 Mpc in the correlation function, or a set of wiggles in the matter power spectrum beyond $k \sim 0.05\,h$/Mpc. A measurement of the BAO scales along the radial direction and the transverse direction provides a good test of the cosmology. Third, the galaxy distribution does not look isotropic in redshift space due to the redshift-space distortion (RSD) effect. Galaxies tend to move towards the overdense regions and have larger peculiar velocities when they are closer to the overdense peaks. In redshift surveys the peculiar velocities add to the receding velocities due to the Hubble flow, and shift the spectral lines redder or bluer depending on their locations relative to the overdense regions. As a result, the isotropic correlation contour will be squeezed along the radial direction on large scales (i.e., the Kaiser effect), and dramatically elongated along the radial direction on small scales where the peculiar velocities are large and random (i.e., the Fingers-of-God effect). The RSD effects help constrain the growth function.
At higher precision, the nonlinearity of these effects has to be taken into account, and the convolutions between these effects are also important. In perturbation theory, they show up in the form of nonlinear mode-coupling integrals, which we shall see in Chapters 2 and 3.
1.4.3 Weak Gravitational Lensing
Weak gravitational lensing (WL) has been a very powerful technique for probing the matter distribution. The light from a background source galaxy (in the direction θ^S) is deflected by the gravitational potential of all the intervening matter, and is projected onto the image plane in the direction θ^I. When the deflection angle is very small, we have
$$ \boldsymbol{\theta}^S = \boldsymbol{\theta}^I + \nabla_{\theta}\psi(\chi, \boldsymbol{\theta}^I) , \tag{1.35} $$
where ψ (in a flat Universe) is the lensing potential given by
$$ \psi(\chi, \boldsymbol{\theta}) = -2 \int_0^{\chi} \left( \frac{1}{\chi'} - \frac{1}{\chi} \right) \Phi(\chi', \boldsymbol{\theta}) \, d\chi' , \tag{1.36} $$
and Φ is the gravitational potential associated with the local density.
One result of lensing is that the shapes of galaxies are slightly distorted, in both their angular sizes and their ellipticities. These effects are quantified by the convergence κ, which controls the magnification, and two shear parameters γ+ and γ×. The statistics of the shear are easier to measure than the magnification and have been used to constrain cosmological parameters, such as σ8 and Ω_m. One advantage of WL is that it responds directly to the total matter distribution, so baryonic physics enters only through the baryonic feedback on the matter distribution, because baryons make up only a small fraction of the matter.
Systematic errors that WL is subject to include the intrinsic alignments of galaxies (which we will discuss in Chapter 3), galaxy redshift distribution uncertainties arising from the photometric redshift (photo-z) measurements of source galaxies,[2] and the point spread function (PSF) arising from atmospheric smearing or from the optics and imaging.
[2] Redshift distribution uncertainties of source galaxies bring uncertainties in the lensing power spectrum and are degenerate with the inferred cosmological parameters.
Cross-correlations between the galaxy distribution and the shear are also measured to break the degeneracy between σ8 and the galaxy bias b_g, as well as to mitigate some of the systematics associated with WL.
1.4.4 Type Ia Supernovae (SNe Ia)
SNe Ia, recognized as “standard candles”, have played crucial roles in cosmology since they are so bright that they can be observed on cosmological scales, and have features that can be used to indicate their distances.
The early classification of supernovae relies on their spectra. When a supernova reaches its peak luminosity, if it has hydrogen lines in the spectrum, it is categorized as Type II; otherwise it is Type I. Among Type I events, if there are strong silicon lines (Si II absorption), it is defined as Type Ia; otherwise, it is Type Ib if helium is present, or Type Ic if not. The classification of Type II SNe involves the shape of the light curve and the continuum in the spectrum, and I will not elaborate here. SNe can also be classified based on their late-time (∼6 months after the peak) nebular spectra, in which Type Ia SNe show strong emission lines of [Fe II], [Fe III], and probably [Co III], with Ca II absorption lines, but without obvious evidence for oxygen.
Supernovae have been understood as the final explosions of massive stars or the thermonuclear explosions of white dwarfs. The latter is thought to produce SNe Ia.
They are “standard” in the sense that their peak luminosity in the V band is roughly constant (-19.5 mag with rms about 0.4 mag). Corrections can be made by taking advantage of, e.g., correlations between the light curve of the SN Ia and its peak luminosity, such as the Phillips relation (Phillips, 1993)
$$ M_{V,\rm peak} = -20.88 + 1.95\,\Delta m_{15}(B) , \tag{1.37} $$
where M_{V,peak} is the peak absolute magnitude and Δm_15(B) is the magnitude drop in the 15 days after peak in the rest-frame B band. Along with the apparent magnitude m, the luminosity distance d_L can be determined from the relation
$$ m - M_V = 5 \log_{10} \frac{d_L}{10\,{\rm pc}} , \tag{1.38} $$
which helps constrain Ω_m, Ω_Λ, and H_0.
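As a worked illustration of Eqs. (1.37) and (1.38), the following minimal Python sketch turns a decline rate and an apparent peak magnitude into a luminosity distance. The function names and the input values are hypothetical, chosen only for illustration, and the sketch ignores extinction, K-corrections, and band conversions, all of which a real analysis would need.

```python
import math

def phillips_peak_abs_mag(dm15_B):
    """Peak absolute V-band magnitude from Eq. (1.37)."""
    return -20.88 + 1.95 * dm15_B

def luminosity_distance_pc(m_app, M_abs):
    """Luminosity distance in parsecs, obtained by inverting Eq. (1.38)."""
    return 10.0 * 10.0 ** ((m_app - M_abs) / 5.0)

# Illustrative (made-up) inputs, not measurements of a real supernova.
dm15_B = 1.1    # magnitude decline over 15 days, rest-frame B band
m_peak = 16.0   # apparent peak magnitude in V

M_peak = phillips_peak_abs_mag(dm15_B)
d_L_Mpc = luminosity_distance_pc(m_peak, M_peak) / 1.0e6
print(f"M_V,peak = {M_peak:.3f} mag, d_L = {d_L_Mpc:.1f} Mpc")
```

Because the distance depends exponentially on m − M, the 0.4 mag scatter quoted above for uncorrected peak luminosities corresponds to a distance error of roughly 20 percent, which is why the light-curve correction matters.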
Many systematics may come into play, such as extinction, multi-band detector calibration errors, and intrinsic variations of SN Ia luminosities with redshift or with the progenitor properties. In fact, a big puzzle about SNe Ia is that we do not know what produces them. We know that there must be at least one WD involved, but is it one WD accreting material from its giant companion, or is it a merger of two WDs? How does the runaway explosion happen? All of these questions are interesting in many respects, and understanding the physics would also help better “standardize” SNe Ia for distance measurements and precision cosmology. I will focus on the progenitor problem in Chapter 4.
1.4.5 Standard Sirens
Binary black holes (BHs) and binary neutron stars (NSs) have provided an independent way to measure distances since the recent discoveries by LIGO. To illustrate the idea simply, we consider a circular BH binary with total mass M, reduced mass μ, and separation a. We assume the orbit nearly follows Newtonian gravity, which means that the orbital velocity is v = √(GM/a) and the orbital frequency is ω = √(GM/a^3). The leading-order GW emission results in an orbital decay with a frequency change given by (Peters, 1964)
$$ \frac{\dot{\omega}}{\omega} = \frac{96 G^3 \mu M^2}{5 c^5 a^4} . \tag{1.39} $$
All of these quantities are encoded in the GW signal. For example, the GW frequency is 2ω due to the quadrupole nature of the GWs. The rate of increase of the frequency is measured from many cycles of GWs during the inspiral phase. The velocity can be measured from the relativistic corrections to the rest-frame waveform. Then, the masses and orbital parameters can be solved for. The GW luminosity, associated with the rate of frequency change, can be evaluated and compared with the observed GW strength to determine the luminosity distance. In practice many more parameters come into play, such as the eccentricity, inclination, source position, etc., so very detailed modeling is needed to extract information from the different stages of the BH mergers.
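To give a feel for the numbers in Eq. (1.39), here is a minimal Python sketch under the same Newtonian, circular-orbit assumptions. The function names, the rounded physical constants, and the example binary (two 30 M⊙ black holes at a separation of 2000 km) are illustrative assumptions rather than values taken from this thesis.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)

def orbital_frequency(M_total, a):
    """Newtonian (Keplerian) orbital angular frequency, in rad/s."""
    return math.sqrt(G * M_total / a**3)

def fractional_chirp_rate(m1, m2, a):
    """omega_dot / omega from Eq. (1.39), in 1/s."""
    M_total = m1 + m2
    mu = m1 * m2 / M_total
    return 96.0 * G**3 * mu * M_total**2 / (5.0 * C**5 * a**4)

# Illustrative binary: two 30 solar-mass BHs separated by 2000 km.
m1 = m2 = 30.0 * M_SUN
a = 2.0e6  # metres

omega = orbital_frequency(m1 + m2, a)
f_gw = 2.0 * omega / (2.0 * math.pi)   # GW frequency is twice the orbital one
rate = fractional_chirp_rate(m1, m2, a)
print(f"f_GW = {f_gw:.1f} Hz, omega_dot/omega = {rate:.3e} 1/s")
```

The steep a^-4 dependence is the point of the exercise: halving the separation speeds up the fractional frequency drift by a factor of 16, which is why the signal sweeps rapidly upward in frequency near merger.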
One drawback of this method is that an independent source redshift is required, so one has to identify the host galaxy. The identification of the host galaxy is much easier for sources with electromagnetic counterparts. A perfect example is the neutron star merger detected last year (GW170817), which produced GW signals and a bright electromagnetic source, allowing an independent measurement of the Hubble constant (Abbott et al., 2017c).
1.5 Compact Objects and Their Dynamics
1.5.1 White Dwarfs, Neutron Stars and Black Holes
WDs and NSs are both remnants of the stellar evolution.
The progenitors of WDs are usually stars with masses below 8 M⊙. Although stellar evolution theory predicts that main-sequence stars above about 0.07 M⊙ will end up as WDs, most observed WDs should have originated from stars with masses above 1 M⊙ because of the finite age of the Universe. For a main-sequence star with mass roughly 1 to 6 M⊙, the core is hot enough to start helium burning via the triple-alpha process and produce carbon and oxygen. As a result, the star will have an outer hydrogen-burning shell, an inner helium-burning shell and a carbon-oxygen core, and it will enter the asymptotic giant branch (AGB) stage. After roughly a million years, most of the outer material will be expelled, and what remains is a carbon-oxygen WD. More massive stars may be able to burn carbon into neon and magnesium, and create oxygen-neon-magnesium WDs.
WDs are supported by electron degeneracy pressure and typically have radii of a few thousand kilometers and masses lower than 1.44 M⊙, the well-known Chandrasekhar limit. Newborn WDs may have effective surface temperatures above 10^5 K, while old WDs may be only a few thousand kelvin due to cooling.
A star more massive than 8 M⊙ may undergo core collapse, and the remnant of the supernova explosion may be a NS or a BH. A typical NS has a mass of about 2 M⊙ (with a maximal mass likely below 3 M⊙) and a radius of about 10 km. NSs are supported by neutron degeneracy pressure. A remnant more massive than 3 M⊙ may turn into a BH. The Schwarzschild radius of a BH is determined by its mass,
$$ r_s = \frac{2GM}{c^2} \simeq 3\,{\rm km}\,\left(\frac{M}{M_\odot}\right) . \tag{1.40} $$
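A short Python sketch of Eq. (1.40) makes the scaling concrete. The function name and the rounded constants are assumptions made for illustration only.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
M_SUN = 1.989e30   # solar mass, kg (rounded)

def schwarzschild_radius_km(mass_msun):
    """Schwarzschild radius in km for a mass given in solar masses, Eq. (1.40)."""
    return 2.0 * G * mass_msun * M_SUN / C**2 / 1.0e3

# Illustrative masses: a stellar-mass BH and a billion-solar-mass SMBH.
for mass in (10.0, 1.0e9):
    print(f"M = {mass:.1e} Msun -> r_s = {schwarzschild_radius_km(mass):.3e} km")
```

Since r_s is linear in M, the same 3 km per solar mass applies from stellar-mass remnants up to supermassive black holes.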
However, BHs may form in other ways and can have very different masses. The
center of massive galaxies may host supermassive black holes (SMBH) with billions
of solar masses, and the center of globular clusters may also host intermediate-mass
black holes with thousands of solar masses. BHs may also form from primordial
small-scale density perturbations or from other exotic ways.
1.5.2 Concepts in Orbital Dynamics
Compact objects not only play significant roles in cosmology and astrophysics, but also provide an excellent laboratory for condensed matter physics, nuclear physics and tests of GR in the strong gravity regime. One of the most effective ways to learn about compact objects is to study their motion in binary systems, or to study the motion of objects around them. In Chapters 4 and 5, I study binary WDs in quadruple systems and binary BHs in triple systems. The orbital dynamics is essential in connecting the merger rate to the initial properties of the systems. I will briefly introduce some basic concepts in orbital dynamics.
The simplest problem in orbital dynamics considers the motion of two particles under their mutual gravity, i.e., the two-body problem. One can completely solve the problem using the conservation of energy and angular momentum. The trajectory will be elliptic, parabolic, or hyperbolic, depending on the total energy. Bound trajectories are closed due to the degeneracy of the motion (which also leads to an additional integral of the motion, the Laplace-Runge-Lenz vector; Landau & Lifshitz, 1969). In 3-dimensional space, a particle on an arbitrary elliptic orbit can be specified by 6 orbital elements: semi-major axis a, eccentricity e, inclination i, longitude of the ascending node Ω, argument of the pericenter ω, and true anomaly f. Here a and e determine the size and shape of the ellipse, i and Ω determine the orientation of the orbital plane with respect to the reference plane, ω determines the orientation of the ellipse in the orbital plane, and finally f determines the particle’s location on the orbit. In a dynamics problem, to predict the location of an object on the orbit, one must also specify a constant associated with the time, i.e., the epoch, which is sometimes considered the 7th orbital parameter.
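The roles of the six elements can be made concrete with a minimal Python sketch that converts them into a Cartesian position relative to the focus. It uses the standard rotation sequence (ω about z, then i about x, then Ω about z) with the reference plane taken as the x-y plane; the function names are hypothetical and the sketch assumes a bound orbit (e < 1).

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(angle):
    """Rotation matrix about the x axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def position_from_elements(a, e, inc, Omega, omega, f):
    """Position vector for elements (a, e, i, Omega, omega, f); angles in radians."""
    # a and e set the size and shape: the focus-to-particle distance at true anomaly f.
    r = a * (1.0 - e**2) / (1.0 + e * np.cos(f))
    # f locates the particle within the orbital plane (pericenter along +x).
    in_plane = np.array([r * np.cos(f), r * np.sin(f), 0.0])
    # omega orients the ellipse in its plane; inc and Omega orient the plane.
    return rot_z(Omega) @ rot_x(inc) @ rot_z(omega) @ in_plane

# Illustrative elements only.
pos = position_from_elements(a=1.0, e=0.3, inc=0.5, Omega=1.0, omega=2.0, f=0.7)
print(pos, np.linalg.norm(pos))
```

The rotations leave the distance from the focus unchanged, so only a, e and f affect |r|, while i, Ω and ω affect only the direction.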
Chapter 2: FAST-PT I: a novel algorithm to calculate convolution integrals of scalar quantities in cosmological perturbation theory
In this chapter, I will present a shortened version of our first FAST-PT paper
(McEwen et al., 2016), focusing on the key idea of the algorithm and its achieved efficiency. The original abstract is given below:
• We present a novel algorithm, FAST-PT, for performing convolution or mode-coupling integrals that appear in nonlinear cosmological perturbation theory. The algorithm uses several properties of gravitational structure formation – the locality of the dark matter equations and the scale invariance of the problem – as well as Fast Fourier Transforms to describe the input power spectrum as a superposition of power laws. This yields extremely fast performance, enabling mode-coupling integral computations fast enough to embed in Monte Carlo Markov Chain parameter estimation. We describe the algorithm and demonstrate its application to calculating nonlinear corrections to the matter power spectrum, including one-loop standard perturbation theory and the renormalization group approach. We also describe our public code (in Python) to implement this algorithm. The code, along with a user manual and example implementations, is available at https://github.com/JoeMcEwen/FAST-PT .
The original authors are: J. McEwen, X. Fang, C. Hirata, J. Blazek.
2.1 Introduction
A generic feature of nonlinear perturbation theory is the coupling of modes at different scales through kernels that capture the physics of structure growth. As a result, these nonlinear corrections typically appear as convolutions over the power spectrum or related functions of the wavevector. In this paper, we primarily consider the most ubiquitous of these approaches, standard perturbation theory (SPT, e.g., Bernardeau et al., 2002). However, integrals with a similar structure are found in other approaches as well, including Lagrangian perturbation theory (LPT, Sugiyama, 2014), renormalized perturbation theory (RPT, Crocce & Scoccimarro, 2006), renormalization group perturbation theory (RGPT, McDonald, 2007, 2014, also considered in this work), the effective field theory (EFT, Baumann et al., 2012; Carrasco et al., 2012; Pajer & Zaldarriaga, 2013; Hertzberg, 2014) approach to structure formation, and time renormalization frameworks (Audren & Lesgourgues, 2011), which can include scale-dependent propagators for the fluctuation modes (e.g., arising from massive neutrinos). Therefore, it is of great utility that the cosmological community have access to efficient and accurate methods to compute these integrals.
The applicability of perturbative techniques is not limited to dark matter evolution. A number of cosmological observables can be modeled in the weakly non-linear regime. These include the clustering of galaxies and other luminous tracers, as well as weak gravitational lensing and cross-correlations between these probes (e.g. “galaxy-galaxy lensing”). For instance, the relationship between dark matter and luminous tracers will generally include a nonlinear “biasing” relationship, resulting in correlations that are naturally described in a perturbative expansion (e.g., McDonald, 2006; McDonald & Roy, 2009; Saito et al., 2014). Many cosmological analyses limit their scope to the weakly non-linear regime, where the majority of the information is, and employ a bias expansion to constrain cosmological parameters (Mandelbaum et al., 2013; Kwan et al., 2016) as well as, e.g., the total neutrino mass (Saito et al., 2011; Zhao et al., 2013). For instance, §2 of Kwan et al. (2016) demonstrates a recent application of nonlinear biasing. In the absence of a fast algorithm for performing the relevant convolutions, that work used emulation, calibrated with the results of a conventional method, to obtain the correct contributions at arbitrary cosmological parameters.
Perturbative techniques can predict the nonlinear shift and broadening of the baryon acoustic oscillation (BAO) feature (Crocce & Scoccimarro, 2008; Sugiyama &
Spergel, 2014) – a powerful “standard ruler” for studying the evolution of geometry in the universe – including the potential impact of streaming velocities between baryons and dark matter in the early universe (Yoo et al., 2011; Yoo & Seljak, 2013; Slepian &
Eisenstein, 2015; Blazek et al., 2016). The velocity field of dark matter and luminous tracers, which sources “redshift-space distortions” in clustering measurements, can also be modeled analytically beyond linear theory (e.g., Scoccimarro, 2004; Vlah
et al., 2012). Similarly, correlations of intrinsic galaxy shapes (known collectively as
“intrinsic alignments”) must be included in cosmic shear analyses and can be described
perturbatively (e.g., Hirata & Seljak, 2004; Blazek et al., 2015).
Although these examples indicate the broad applicability of perturbative techniques, some analyses will probe regimes where numerical simulations are required to reach the desired accuracy. Even in these cases, however, a fast perturbation theory code is still valuable, since interpolation (or emulation, Heitmann et al., 2014) from grids of simulations can be used to compute the non-perturbative correction to an observable 𝒪, rather than trying to interpolate the much larger “raw” value of 𝒪.
In this paper we present FAST-PT, a new algorithm and publicly available code to calculate mode-coupling integrals that appear in perturbation theory. As a first example of our method we focus on 1-loop order perturbative descriptions of scalar quantities (e.g. density or velocity divergence). In particular, we present examples for 1-loop SPT, which can be trivially expanded to include nonlinear galaxy biasing, and renormalization group results. A generalization to arbitrary-spin quantities (e.g. intrinsic alignments, a spin-2 tensor field) and other directionally dependent power spectra (e.g. redshift-space distortions and secondary CMB anisotropies) will be presented in a follow-up paper (Fang et al., 2017).
FAST-PT can calculate the SPT power spectrum to 1-loop order, to the same level of accuracy as conventional methods, on a sub-second time scale. In the context of Monte Carlo Markov chain (MCMC) cosmological analyses, which may explore > 10^6 points in parameter space, the extremely low recurring cost of our method is particularly relevant. The FAST-PT recurring cost to calculate the 1-loop power spectrum at N = 3000 k values is ∼0.01 s. This speed is even more valuable for multi-probe cosmological analyses. For instance, a gravitational lensing plus galaxy clustering analysis may require the matter and galaxy power spectra in real and redshift space, nonlinear galaxy biasing contributions, and the intrinsic alignment power spectra, at each point in cosmological parameter space. FAST-PT provides a means to obtain these quantities in a time that is likely trivial compared to other necessary calculations at each step in the chain (e.g. obtaining the linear power spectrum from a Boltzmann code).
FAST-PT takes a power spectrum, sampled logarithmically, as an input. Special function identities are then used to rewrite the angular dependence of the mode-coupling kernels in terms of a summation of Legendre polynomials. The angular integration for each of these components can be performed analytically, reducing the numerical evaluation to one dimension. Because of the uniform (logarithmic) sampling we are able to utilize Fast Fourier Transform (FFT) methods, thus enabling computation of the mode-coupling integrals in O(N log N) operations, where N is the number of samples in the power spectrum. Our approach is similar in structure to the evaluation of logarithmically sampled Hankel transforms (Talman, 1978; Hamilton, 2000), which have been used to transform power spectra into correlation functions (and vice versa). It also draws on the realization that convolution integrals in spherical symmetry – even convolutions of integrands with spin – can be expressed using Hankel transforms with the angular integrals performed analytically (e.g., Ferraro et al., 2012; Slepian & Eisenstein, 2015). We implement the FAST-PT algorithm in a publicly-available package. The code is written in Python, making use of the numpy and scipy libraries, and has a self-contained structure that can be easily integrated into larger packages. We provide a public version of the code along with a user manual and example implementations at https://github.com/JoeMcEwen/FAST-PT.
Recently, Schmittfull et al. (2016) have presented a related method for fast perturbation theory integrals, based on the same mathematical principles. Our Eq. (2.16) encapsulates the same approach as their Eq. (31), combined with the logarithmically sampled Hankel transform. However, the numerical approach is different: the decomposition of an arbitrary power spectrum P(k) into power laws of complex exponent is treated as fundamental (and is kept explicitly in the code); the near-cancellation of P22 + P13 is handled by explicit regularization; and the P13 integral is solved using a different method (based only on scale invariance). Finally, we present a fast implementation of RGPT.
This paper is organized as follows: in §2, we provide the theory for our method, motivating the approach by considering the 1-loop SPT power spectrum. In §3, we provide results for 1-loop corrections to the power spectrum and demonstrate an implementation of the renormalization group approach of McDonald (2007, 2014). In §4, we summarize our results, including a discussion of other potential applications of FAST-PT, and provide a brief description of the publicly-available code. The appendices provide additional details of our numerical calculations and the mathematical structure of the terms under consideration.
2.2 Method
This work presents an algorithm to efficiently calculate mode-coupling integrals of the form
$$ \int \frac{d^3\mathbf{q}}{(2\pi)^3}\, K(\mathbf{q}, \mathbf{k}-\mathbf{q})\, P(q)\, P(|\mathbf{k}-\mathbf{q}|) , \tag{2.1} $$
where K(q1, q2) is a mode-coupling kernel that can be expanded in Legendre polynomials and P(q) is an input signal logarithmically sampled in q. The motivation for this method is mildly-nonlinear structure formation in the universe, although it can be more generally considered as a technique to evaluate a range of expressions in the form of Eq. (2.1).
For clarity we list our conventions and notations:
• fast Fourier transform and inverse fast Fourier transform are denoted as FFT and IFFT;
• Fourier transform pairs have the 2π placed in the denominator of the wavenumber integral, as is standard in cosmology:
$$ \Phi(\mathbf{k}) = \int d^3\mathbf{r}\, \Phi(\mathbf{r})\, e^{-i\mathbf{k}\cdot\mathbf{r}} \;\leftrightarrow\; \Phi(\mathbf{r}) = \int \frac{d^3\mathbf{k}}{(2\pi)^3}\, \Phi(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}} ; \tag{2.2} $$
• “log” always refers to the natural log, and we will write log_10 explicitly when referring to base 10;
• ⊗ represents a convolution (discrete or continuous);
• the Legendre polynomials are denoted 𝒫_l (to avoid confusion with power spectra P), ordinary Bessel functions of the first kind are denoted J_μ(t), and spherical Bessel functions of the first kind are denoted j_l(t), all with standard normalization conventions (Abramowitz & Stegun, 1964);
• i = √−1 (never used as an index);
• “log sampling” means that the argument of the input signal is q_n = q_0 exp(nΔ), where n = 0, 1, 2, ... and Δ is the linear spacing between grid points in log q;
• we use the convention that when calculations require discrete evaluations, for example as in the case of discrete Fourier transforms, we index our vectors, while when evaluations are performed analytically we omit the index.
Since we have introduced the leading-order nonlinear corrections from standard perturbation theory in §1.3, in this section we begin with §2.2.1, describing our main result: a rearrangement of the mode-coupling integral that allows P22 and related integrals to be computed in order N log N operations. The P13 integral is simpler than P22, but brute-force computation of P13 is in fact slower than the FAST-PT method for P22, so we describe our fast approach to P13 in §2.2.2. Finally, in §2.2.3 we describe our numerical treatment of the cancellation of infrared divergences in P22 and P13.
2.2.1 P22(k) type Convolution Integrals
We first focus on P22(k), leaving the evaluation of P13(k) to a later subsection.
P22(k) is a convolution integral that takes two copies of the linear power spectrum
Plin(k) as inputs:
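$$ P_{22}(k) = 2 \int \frac{d^3\mathbf{q}}{(2\pi)^3}\, P_{\rm lin}(q)\, P_{\rm lin}(|\mathbf{k}-\mathbf{q}|)\, |F_2(\mathbf{q}, \mathbf{k}-\mathbf{q})|^2 . \tag{2.3} $$
The F2 kernel is
$$ \begin{aligned} F_2(\mathbf{q}_1, \mathbf{q}_2) &= \frac{5}{7} + \frac{1}{2}\mu_{12}\left(\frac{q_1}{q_2} + \frac{q_2}{q_1}\right) + \frac{2}{7}\mu_{12}^2 \\ &= \frac{17}{21}\mathcal{P}_0(\mu_{12}) + \frac{1}{2}\left(\frac{q_1}{q_2} + \frac{q_2}{q_1}\right)\mathcal{P}_1(\mu_{12}) + \frac{4}{21}\mathcal{P}_2(\mu_{12}) , \end{aligned} \tag{2.4} $$
where we have defined μ12 = q1 · q2/(q1 q2) = q̂1 · q̂2, which is the cosine of the angle between q1 and q2. Squaring this and substituting into Eq. (2.3), we find that the P22(k) power spectrum expanded in Legendre polynomials is
$$ \begin{aligned} P_{22}(k) = 2 \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} \Big[ & \frac{1219}{1470}\mathcal{P}_0(\mu_{12}) + \frac{671}{1029}\mathcal{P}_2(\mu_{12}) + \frac{32}{1715}\mathcal{P}_4(\mu_{12}) + \frac{1}{3} q_1^2 q_2^{-2}\mathcal{P}_2(\mu_{12}) \\ & + \frac{62}{35} q_1 q_2^{-1}\mathcal{P}_1(\mu_{12}) + \frac{8}{35} q_1 q_2^{-1}\mathcal{P}_3(\mu_{12}) + \frac{1}{6} q_1^2 q_2^{-2}\mathcal{P}_0(\mu_{12}) \Big] P_{\rm lin}(q_1) P_{\rm lin}(q_2) , \end{aligned} \tag{2.5} $$
where we have defined q2 = k − q1 and used the q1 ↔ q2 symmetry to combine terms. We note that the last Legendre component in Eq. (2.5) will eventually lead to a formally divergent expression in the FAST-PT framework. In §2.2.3 we discuss this type of divergence (which can appear in other contexts) and explicitly show the cancellation.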
Each Legendre component of Eq. (2.5) is a specific case of the general integral
$$ J_{\alpha\beta l}(k) = \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, q_1^{\alpha} q_2^{\beta}\, \mathcal{P}_l(\mu_{12})\, P(q_1) P(q_2) . \tag{2.6} $$
Note that we have now omitted the subscript “lin” on the power spectrum and carry on our calculations for a general input power spectrum. For SPT calculations the input power spectrum should be P_lin(k); however, there are cases when a general power spectrum input is required, such as renormalization group equations. Our method of evaluation draws on several key insights from the literature. The first is that the Legendre polynomial can be decomposed using the spherical harmonic addition theorem, and that in switching between real and Fourier space one may use the spherical expansion of a plane wave to achieve separation of variables; see the Appendix of Slepian & Eisenstein (2015). The second is the fast Hankel transform (Talman, 1978; Hamilton, 2000). We also address a number of subtleties to make these ideas useful for the 1-loop SPT integrals.
Our goal in this section is to develop an efficient numerical algorithm to evaluate integrals of the form Eq. (2.6). Combining the results for the relevant values of (α, β, l) will then allow us to construct P22(k) or other similar functions. For instance, in terms of these components, Eq. (2.5) reads
$$ \begin{aligned} P_{22}(k) = 2 \Big[ & \frac{1219}{1470} J_{0,0,0}(k) + \frac{671}{1029} J_{0,0,2}(k) + \frac{32}{1715} J_{0,0,4}(k) \\ & + \frac{1}{6} J_{2,-2,0}(k) + \frac{1}{3} J_{2,-2,2}(k) + \frac{62}{35} J_{1,-1,1}(k) + \frac{8}{35} J_{1,-1,3}(k) \Big] . \end{aligned} \tag{2.7} $$
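The Legendre coefficients entering Eqs. (2.5) and (2.7) can be spot-checked numerically. The minimal Python sketch below squares the F2 kernel of Eq. (2.4) at random points and compares it with the Legendre expansion. Because Eq. (2.5) uses the q1 ↔ q2 symmetry of the integral to merge terms, the sketch re-symmetrizes each power of q1/q2 before comparing pointwise. The function names are hypothetical and this check is not part of the FAST-PT code.

```python
import numpy as np
from numpy.polynomial.legendre import legval

def legendre(l, mu):
    """Legendre polynomial of degree l evaluated at mu."""
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0
    return legval(mu, coeffs)

def F2(q1, q2, mu):
    """F2 kernel, first line of Eq. (2.4)."""
    return 5.0 / 7.0 + 0.5 * mu * (q1 / q2 + q2 / q1) + (2.0 / 7.0) * mu**2

def F2_squared_from_eq_2_5(q1, q2, mu):
    """Bracket of Eq. (2.5), with q1^a q2^-a symmetrized under q1 <-> q2."""
    x = q1 / q2
    s1 = 0.5 * (x + 1.0 / x)            # symmetrized q1 q2^-1
    s2 = 0.5 * (x**2 + x**(-2.0))       # symmetrized q1^2 q2^-2
    P = [legendre(l, mu) for l in range(5)]
    return (1219 / 1470 * P[0] + 671 / 1029 * P[2] + 32 / 1715 * P[4]
            + s2 * P[2] / 3 + s1 * (62 / 35 * P[1] + 8 / 35 * P[3])
            + s2 * P[0] / 6)

rng = np.random.default_rng(1)
q1 = rng.uniform(0.1, 10.0, size=1000)
q2 = rng.uniform(0.1, 10.0, size=1000)
mu = rng.uniform(-1.0, 1.0, size=1000)
max_diff = np.max(np.abs(F2(q1, q2, mu)**2 - F2_squared_from_eq_2_5(q1, q2, mu)))
print("max |F2^2 - expansion| =", max_diff)
```

Agreement at machine precision for arbitrary (q1, q2, μ12) confirms that only the seven (α, β, l) combinations listed in Eq. (2.7) are needed to build P22(k).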
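To evaluate Eq. (2.6) we first Fourier transform to configuration space and then expand the Legendre polynomials in spherical harmonics, using Eq. (A.1):
$$ \begin{aligned} J_{\alpha\beta l}(r) &= \int \frac{d^3\mathbf{k}}{(2\pi)^3}\, e^{i\mathbf{k}\cdot\mathbf{r}} J_{\alpha\beta l}(k) \\ &= \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} \frac{d^3\mathbf{q}_2}{(2\pi)^3}\, e^{i(\mathbf{q}_2+\mathbf{q}_1)\cdot\mathbf{r}}\, q_1^{\alpha} q_2^{\beta}\, \mathcal{P}_l(\mu_{12})\, P(q_1) P(q_2) \\ &= \frac{4\pi}{2l+1} \sum_{m=-l}^{l} \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} \frac{d^3\mathbf{q}_2}{(2\pi)^3}\, e^{i\mathbf{q}_1\cdot\mathbf{r}} e^{i\mathbf{q}_2\cdot\mathbf{r}}\, q_1^{\alpha} q_2^{\beta}\, Y_{lm}(\hat{\mathbf{q}}_1) Y^*_{lm}(\hat{\mathbf{q}}_2)\, P(q_1) P(q_2) . \end{aligned} \tag{2.8} $$
The q1 and q2 integrals can each be broken into a radial (∫_0^∞ dq1 q1^2) and an angular (∫_{S^2} d^2q̂1) part; the angular parts do not depend on the power spectrum and can be evaluated analytically using Eq. (A.4):
$$ J_{\alpha\beta l}(r) = \frac{4\pi (4\pi i^l)^2}{(2\pi)^6 (2l+1)} \sum_{m=-l}^{l} Y_{lm}(\hat{\mathbf{r}}) Y^*_{lm}(\hat{\mathbf{r}}) \int_0^\infty dq_1\, q_1^{2+\alpha} j_l(q_1 r) P(q_1) \int_0^\infty dq_2\, q_2^{2+\beta} j_l(q_2 r) P(q_2) . \tag{2.9} $$
Additionally we make use of the orthogonality relation, Eq. (A.2), to eliminate the sum over m:
$$ J_{\alpha\beta l}(r) = \frac{(-1)^l}{4\pi^4} \left[ \int_0^\infty dq_1\, q_1^{\alpha+2} j_l(q_1 r) P(q_1) \right] \left[ \int_0^\infty dq_2\, q_2^{\beta+2} j_l(q_2 r) P(q_2) \right] . \tag{2.10} $$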
Equation (2.10) can be considered as one component of a correlation function. For instance, the correlation function ξ22(r) [the Fourier counterpart to P22(k)] is built from Eq. (2.10) with the same α, β, l combinations and pre-factors as in Eq. (2.5).
Equation (2.10) is the product of two Hankel transforms (terms in brackets) with the relevant prefactor. We denote the bracketed terms in Eq. (2.10) as Iαl(r) and
Iβl(r). To evaluate Iαl(r), we first take the discrete Fourier transformation of the power spectrum (biased by a power of k):
$$ c_m = W_m \sum_{n=0}^{N-1} \frac{P(k_n)}{k_n^{\nu}}\, e^{-2\pi i m n/N} \;\leftrightarrow\; P(k_n) = \sum_{m=-N/2}^{N/2} c_m k_n^{\nu + i\eta_m} , \tag{2.11} $$
where N is the size of the input power spectrum, η_m = m × 2π/(NΔ), m = −N/2, −N/2+1, ..., N/2−1, N/2, and Δ is the linear spacing, i.e. k_n = k_0 exp(nΔ). For a real power spectrum the Fourier coefficients obey c_m^* = c_{−m}. Here W_m is a window function that can be used to smooth the power spectrum.[3] Using discrete FFTs allows a significant reduction in computation time. However, these methods require that the function being transformed is (log-)periodic. In the case of FAST-PT, this procedure is equivalent to performing calculations in a universe with a power spectrum, biased by a power law in k, that is log-periodic. This universe has divergent power on large or small scales, depending on the choice of ν. Figure 2.1 shows the resulting power spectrum, with a window function applied at the periodic boundaries. In order for perturbation theory to make sense, the large-scale density variance (i.e. ∫_0^k k'^2 P(k') dk') and the small-scale displacement variance (i.e. ∫_k^∞ P(k') dk') should both be finite (see Fig. 2.1). Since P(k)/k^ν is log-periodic, this means that FAST-PT will require biasing with −3 < ν < −1 (this paper chooses ν = −2).
In most cases, sufficiently far from the boundaries, the impact of the periodic nature of the P(k) is negligible. However, while P22(k) and P13(k) are well-behaved in standard methods with CDM power spectra, they are both infinite in FAST-PT, where the satellite features at extremely large scales (k → 0) produce infinite displacements. This is the same infinity found in power-law spectra and is of no physical concern: since displacement is not a physical observable, Galilean invariance guarantees that the divergent parts of P22(k) and P13(k) will cancel as long as the displacement gradient or strain is finite. In §2.2.3 we will address the numerical aspects of this cancellation and show how to perform a well-behaved 1-loop SPT calculation in the FAST-PT framework.
[3] If no smoothing is desired, we would set W_m = 1 for all m except for W_{±N/2} = 1/2. The 1/2 ensures that the counting of both m = ±N/2 in the second sum in Eq. (2.11) gives the correct inverse transform. However, in our numerical implementation we always include a window function that goes smoothly to zero to prevent “ringing” in the interpolated P(k); see Appendix A.3.
[Figure 2.1: three panels plotted against k. Vertical axes: P(k)/k^ν (top), Δ^2(k) (middle), Δ^2(k)/k^2 (bottom).]
Figure 2.1: Power spectra in the log-periodic universe. The top panel shows the windowed linear power spectrum biased by k^{−ν} (we choose ν = −2), with grey lines indicating the “satellite” power spectra, i.e. the contribution to the total power spectrum that arises due to the periodic assumption in a Fourier transform. The middle panel plots Δ^2(k) = k^3 P(k)/(2π^2) within the periodic universe. This is the quantity that sources the density variance σ^2 = ∫ d ln k Δ^2(k). The bottom panel plots the contribution to the displacement variance σ_ξ^2 = ∫ d ln k Δ^2(k)/k^2.
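The power-law decomposition of Eq. (2.11) can be sketched in a few lines of numpy. This is a minimal illustration under stated assumptions, not the FAST-PT implementation: the function names and the toy spectrum are hypothetical, no smoothing window is applied (W_m = 1 except for the two endpoint weights of 1/2 described in footnote 3), the 1/N normalization follows numpy's FFT convention, and a phase factor k_0^{−iη_m} is folded into c_m so that the reconstruction can be written directly in terms of k rather than the grid index.

```python
import numpy as np

def power_law_coefficients(k, P, nu):
    """Coefficients c_m and exponents eta_m such that
    P(k_n) = sum_m c_m k_n^(nu + i eta_m) on a log-spaced grid (N even)."""
    N = k.size
    delta = np.log(k[1] / k[0])              # log spacing of the grid
    m = np.arange(-N // 2, N // 2 + 1)       # m = -N/2, ..., N/2
    eta = 2.0 * np.pi * m / (N * delta)
    X = np.fft.fft(P / k**nu) / N            # DFT of the biased spectrum
    # Map m to FFT indices; absorb the k_0 phase so powers of k (not n) appear.
    c = X[m % N] * k[0] ** (-1j * eta)
    c[0] *= 0.5                              # W_{-N/2} = 1/2
    c[-1] *= 0.5                             # W_{+N/2} = 1/2
    return c, eta

def reconstruct(kq, c, eta, nu):
    """Evaluate sum_m c_m kq^(nu + i eta_m)."""
    powers = kq[:, None] ** (nu + 1j * eta[None, :])
    return np.real(np.sum(c[None, :] * powers, axis=1))

# Toy input (not a real cosmological power spectrum).
N = 64
k0, delta = 1.0e-2, np.log(1.0e4) / N
k = k0 * np.exp(np.arange(N) * delta)
P = k / (1.0 + k**2) ** 2
nu = -2.0

c, eta = power_law_coefficients(k, P, nu)
P_rec = reconstruct(k, c, eta, nu)
print("max relative error on the grid:", np.max(np.abs(P_rec / P - 1.0)))
```

On the sampling grid the reconstruction reproduces the input up to floating-point error. Evaluated away from the grid, or outside [k_min, k_max], the same sum returns the log-periodic continuation discussed above, which is where windowing becomes important.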
37 Continuing our evaluation: Z ∞ α+2 Iαl(r) = dk k jl(kr)P (k) 0 N/2 Z ∞ X ν+2+α+iηm = cm dk k jl(kr) m=−N/2 0 r N/2 ∞ (2.12) π X Z = c r−3−ν−α−iηm dt t3/2+ν+α+iηm J (t) 2 m l+1/2 m=−N/2 0 r N/2 π X = c g r−3−ν−α−iηm 2Qαm , 2 m αm m=−N/2
where in the third equality we have exchanged the Bessel function of the first kind for a p spherical Bessel function, jν(z) = π/(2z) Jν+1/2(z) and performed the substitution
t = kr. In the last equality we have evaluated the integral according to Eq. (A.5) and
1 3 defined gαm g(l + ,Qαm) and Qαm + ν + α + iηm. ≡ 2 ≡ 2 Strictly speaking, the convergence criteria for Eq. (2.12) are α < 1 ν and − − α + l > 3 ν. For ν = 2 we thus require (i) α < 1 and (ii) α + l > 1. All terms − − − − with α = 2 violate (i), while the α = 2, l = 0 term also violates (ii). The violations of
condition (i) can be cured if we apply an exponential cutoff in the power spectrum to
force the integral to converge, i.e. in Eq. (2.12) we insert a factor of e−k and take the
limit as 0+; this yields the same result and is equivalent to smoothing out the → “wiggles” in the Bessel functions at high k.4 The violation of condition (ii) comes from
the low k’s and is more problematic: the physical result for I−2,0(r) is divergent, and
this will be treated in 2.2.3. The final result for the Jαβl(r) correlation component § 4This can be proven by inserting a factor of e−t in Eq. (A.5) and taking the limit as 0+. Following Eq. (6.621.1) of Gradshteyn & Ryzhik (1994), the integral can be expressed in terms→ of the µ+κ+1 µ+κ+2 −2 hypergeometric function 2F1( 2 , 2 ; µ+1; ). The transformation formula, Eq. (9.132.2) of Gradshteyn & Ryzhik (1994), can then be used to− express a hypergeometric function of large argu- −2 ment in terms of functions of argument approaching 0. Using limz→0 2F1(α, β; γ; z) = 1 and the− Γ-function→ −∞ duplication formula suffices to prove this generalized version of Eq. (A.5).
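is then

$$
\begin{aligned}
J_{\alpha\beta l}(r) &= \frac{(-1)^l}{4\pi^4} I_{\alpha l}(r) I_{\beta l}(r) \\
&= \frac{(-1)^l}{8\pi^3} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m c_n g_{\alpha m} g_{\beta n}\, 2^{Q_{\alpha m}+Q_{\beta n}}\, r^{-6-2\nu-\alpha-\beta-i\eta_m-i\eta_n} .
\end{aligned}
\tag{2.13}
$$

To obtain the power spectrum, we Fourier transform Eq. (2.13) back to $k$-space:

$$
\begin{aligned}
J_{\alpha\beta l}(k_q) &= \int_0^\infty dr\, 4\pi r^2 j_0(k_q r) J_{\alpha\beta l}(r) \\
&= \frac{(-1)^l}{2\pi^2} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m} c_n g_{\beta n}\, 2^{Q_{\alpha m}+Q_{\beta n}} \int_0^\infty dr\, r^{-5-2\nu-\alpha-\beta-i(\eta_m+\eta_n)}\, \frac{\sin(k_q r)}{k_q} ,
\end{aligned}
\tag{2.14}
$$

where in the first equality isotropy converts the 3-dimensional Fourier transform into a Bessel integral, and then we have used $j_0(z) = \sin(z)/z$. The integral over $r$ can be evaluated using the $f$-function of Eq. (A.7) via the substitution $t = k_q r$ and leads to

$$
J_{\alpha\beta l}(k_q) = \frac{(-1)^l}{2\pi^2} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m} c_n g_{\beta n}\, 2^{Q_h}\, k_q^{-p-2+i\tau_h} f_h ,
\tag{2.15}
$$

where we have defined $f_h = f(p + 1 - i\tau_h)$, $p = -5 - 2\nu - \alpha - \beta$, $\tau_h = \eta_m + \eta_n$, and $Q_h = Q_{\alpha m} + Q_{\beta n}$. Note that $\tau_h$ (and hence $f_h$) and $Q_h$ depend only on the sum $m + n$. In what follows, we will transform the double summation over $m$ and $n$ into a discrete convolution, indexed by $h$, such that $h = m + n \in \{-N, -N+1, \ldots, N-1, N\}$. This leads to:

$$
\begin{aligned}
J_{\alpha\beta l}(k_q) &= \frac{(-1)^l}{2\pi^2}\, 2^{3+2\nu+\alpha+\beta} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m} c_n g_{\beta n}\, f_h\, k_q^{-p-2+i\tau_h}\, 2^{i\tau_h} \\
&= \frac{(-1)^l}{\pi^2}\, 2^{2+2\nu+\alpha+\beta} \sum_h [c_m g_{\alpha m} \otimes c_n g_{\beta n}]_h\, f_h\, 2^{i\tau_h}\, k_q^{-p-2+i\tau_h} \\
&= \frac{(-1)^l}{\pi^2}\, 2^{2+2\nu+\alpha+\beta}\, k_q^{-p-2} \sum_h C_h f_h\, 2^{i\tau_h} \exp(i\tau_h \log k_0) \exp(i\tau_h q\Delta) \\
&= \frac{(-1)^l}{\pi^2}\, 2^{2+2\nu+\alpha+\beta}\, k_q^{-p-2}\, \mathrm{IFFT}[C_h f_h 2^{i\tau_h}] ,
\end{aligned}
\tag{2.16}
$$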
where in the second equality we have replaced $n = h - m$ and in the third and fourth equality the sum over $m$ is written as a discrete convolution $\sum_m c_m g_{\alpha m} c_{h-m} g_{\beta,h-m} = [c_m g_{\alpha m} \otimes c_n g_{\beta n}]_h = C_h$. Also, due to the log sampling of $k_q$, the final sum over $h$ in Eq. (2.16) is actually an inverse discrete Fourier transform, i.e. $\sum_h A_h k_q^{i\tau_h} = \sum_h A_h \exp(i 2\pi h q/[2N])$,$^5$ and can thus be evaluated quickly using an FFT. Equation (2.16) is the main analytical result of this work: it allows one to evaluate $P_{22}(k)$-type integrals quickly, with a cost scaling as $N \log N$.
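The collapse of the double sum over $m$ and $n$ into a single sum over $h = m + n$ can be checked numerically. The following is a minimal sketch, not the FAST-PT implementation: the arrays `a` and `b` are random stand-ins for $c_m g_{\alpha m}$ and $c_n g_{\beta n}$, and `weight` is a hypothetical stand-in for any factor that depends only on $h$ (such as $f_h 2^{i\tau_h} k_q^{i\tau_h}$ at one fixed $k_q$).

```python
import numpy as np

def double_sum(a, b, weight):
    """Brute-force sum over (m, n) of a_m b_n w_{m+n}; O(N^2) terms."""
    total = 0.0 + 0.0j
    for i in range(a.size):
        for j in range(b.size):
            # array index i + j corresponds to h = m + n, shifted by N
            total += a[i] * b[j] * weight[i + j]
    return total

def convolved_sum(a, b, weight):
    """Same quantity written as sum_h C_h w_h, with C the discrete convolution."""
    C = np.convolve(a, b)  # C[p] = sum_i a[i] b[p - i], length 2N + 1
    return np.sum(C * weight)

N = 16  # m and n each run over N + 1 values, -N/2 ... N/2
rng = np.random.default_rng(0)
a = rng.normal(size=N + 1) + 1j * rng.normal(size=N + 1)  # stand-in for c_m g_{alpha m}
b = rng.normal(size=N + 1) + 1j * rng.normal(size=N + 1)  # stand-in for c_n g_{beta n}
h = np.arange(-N, N + 1)
weight = np.exp(0.3j * h) / (1.0 + h**2)  # arbitrary function of h only

print(abs(double_sum(a, b, weight) - convolved_sum(a, b, weight)))
```

`np.convolve` is itself a direct $O(N^2)$ convolution and is used here only for clarity; an FFT-based convolution (for example `scipy.signal.fftconvolve`) would be needed to reach the $N \log N$ scaling quoted above.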
Since in FAST-PT $P(k)/k^\nu$ is log-periodic, there are discontinuities in the power spectrum at $k_{\min} = k_0$ and $k_{\max} = k_0 e^{N\Delta}$. This means that when Fourier-space methods are applied, the series of Eq. (2.11) will exhibit ringing; the FAST-PT user has several options for controlling this behavior. The power spectrum can be windowed in such a way that the edges of the array are smoothly tapered to zero (of course, this must be done outside the $k$-range that contributes significantly to the mode-coupling integrals). The location of the onset of the tapering is controlled by the user. The Fourier coefficients $c_m$ can also be filtered so that the highest frequencies are damped. We use the same window function to filter the Fourier coefficients and smooth the edges of the power spectrum; the functional form is presented in Appendix A.3. In practice, while we always apply a filter to the $c_m$ coefficients, we choose to directly window the power spectrum only within our renormalization group routine (see Appendix A.4). We have also written the code in such a way that the user can easily implement their own window function. One can also "zero pad" the input power spectrum, adding zeros to both sides of the array. The contributions of the mode-coupling integrals from the large-scale satellite power spectrum ($k < k_{\min}$) heavily contaminate $P_{22}(k)$ at $k < 2k_{\min}$ (range restricted by the triangle inequality). We thus recommend zero-padding by a factor $\geq 2$.

$^5$ In the last two lines of Eq. (2.16) a shift, $\exp(i\tau_h \log k_0)$, in the Fourier transform appears. In practice, our code does not compute this shift, which also appears in the initial Fourier transform and thus cancels. Additionally, to conform to Python Fourier conventions we drop the positive end point in the final FFT.
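Zero-padding on a log-spaced grid can be sketched as follows. This is an illustrative helper under stated assumptions, not the FAST-PT routine: the function name `zero_pad_log_grid` and its `n_pad` argument are hypothetical, the grid is assumed to be exactly log-spaced, and no windowing is applied (the window of Appendix A.3 is not reproduced here).

```python
import numpy as np

def zero_pad_log_grid(k, P, n_pad):
    """Extend a log-spaced grid by n_pad points on each side and pad P with zeros."""
    k = np.asarray(k, dtype=float)
    P = np.asarray(P, dtype=float)
    delta = np.log(k[1] / k[0])  # constant log spacing, assumed uniform
    left = k[0] * np.exp(delta * np.arange(-n_pad, 0))
    right = k[-1] * np.exp(delta * np.arange(1, n_pad + 1))
    k_pad = np.concatenate([left, k, right])
    P_pad = np.concatenate([np.zeros(n_pad), P, np.zeros(n_pad)])
    return k_pad, P_pad

k = np.logspace(-3, 1, 50)
P = k / (1.0 + k**2) ** 2  # toy spectrum, for illustration only
k_pad, P_pad = zero_pad_log_grid(k, P, n_pad=10)
print(k_pad.size, k_pad[0], k_pad[-1])
```

After the mode-coupling integrals are evaluated on the padded grid, the output would be trimmed back to the original range with `result[n_pad:-n_pad]`.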
2.2.2 P13(k) type Convolution Integrals
The P13(k) integral does not share the same form as P22(k), since the wavenumber structure is different: it describes a correction to the propagator for Fourier mode k due to interaction with all other modes q. The structure of P13(k) is thus P (k) times an integral over the power in all other modes:
$$
P_{13}(k) = \frac{k^3}{252(2\pi)^2} P_{\rm lin}(k) \int_0^\infty dr\, r^2 P_{\rm lin}(kr) Z(r) ,
\tag{2.17}
$$

where

$$
Z(r) = \frac{12}{r^4} - \frac{158}{r^2} + 100 - 42r^2 + \frac{3}{r^5}(7r^2 + 2)(r^2 - 1)^3 \log\left|\frac{r+1}{r-1}\right| ,
\tag{2.18}
$$

and $r = q/k$. Upon making the substitution $r = e^{-s}$, Eq. (2.17) becomes

$$
\begin{aligned}
P_{13}(k) &= \frac{k^3}{252(2\pi)^2} P_{\rm lin}(k) \int_0^\infty dr\, r^2 P_{\rm lin}(kr) Z(r) \\
&= \frac{k^3}{252(2\pi)^2} P_{\rm lin}(k) \int_{-\infty}^{\infty} ds\, e^{-3s} P_{\rm lin}(e^{\log k - s}) Z(e^{-s}) \\
&= \frac{k^3}{252(2\pi)^2} P_{\rm lin}(k) \int_{-\infty}^{\infty} ds\, G(s) F(\log k - s) ,
\end{aligned}
\tag{2.19}
$$

where in the final line we reveal the integral as a continuous convolution, with the definitions $G(s) \equiv e^{-3s} Z(e^{-s})$ and $F(s) \equiv P_{\rm lin}(e^s)$. In the discrete domain we have $ds \to \Delta$, $\log k_n = \log k_0 + n\Delta$, and $s_m = \log k_0 + m\Delta$, so that the discrete form is

$$
\int_{-\infty}^{\infty} ds\, G(s) F(\log k - s) \to \Delta \sum_{m=0}^{N-1} G_D(m) F_D(n - m) ,
\tag{2.20}
$$

where we define the discrete functions $G_D(m) \equiv G(s_m)$ and $F_D(m) \equiv F(m\Delta)$, so that we have

$$
P_{13}(k_n) = \frac{k_n^3}{252(2\pi)^2} P_{\rm lin}(k_n)\, \Delta\, [G_D \otimes F_D][n] .
\tag{2.21}
$$

Thus $P_{13}(k)$, which at first appears to involve order $N^2$ steps (an integral over $N$ samples at each of $N$ output values $k_n$), can in fact be computed for all output $k_n$ in $N \log N$ steps.
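The structure of Eqs. (2.17)-(2.21) can be illustrated with a short numerical sketch. It is written under several assumptions and is not the FAST-PT code: the quadrature is a simple rectangle rule in $\log r$ with $r_j = e^{j\Delta}$ (so $q = k_n r_j$ lands on the grid), the power spectrum is taken to vanish outside the grid, the small-$r$ series is Eq. (2.23) divided by $r^2$, and the large-$r$ series was derived for this sketch rather than quoted from the text. The function names and the cutoffs 0.05 and 20 are hypothetical choices.

```python
import numpy as np

def z_kernel(r):
    """Z(r) of Eq. (2.18), switching to series where the closed form loses precision."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    out = np.empty_like(r)
    lo, hi = r < 0.05, r > 20.0
    one = np.isclose(r, 1.0, rtol=0.0, atol=1e-12)
    mid = ~(lo | hi | one)
    x = r[lo]   # small r: Eq. (2.23) divided by r^2
    out[lo] = (-168.0 / x**2 + 928.0 / 5 - 4512.0 / 35 * x**2
               + 416.0 / 21 * x**4 + 2656.0 / 1155 * x**6)
    x = r[hi]   # large r: expansion in 1/r derived for this sketch (assumption)
    out[hi] = (-488.0 / 5 / x**2 + 96.0 / 5 / x**4
               - 160.0 / 21 / x**6 - 1376.0 / 1155 / x**8)
    out[one] = -88.0   # at r = 1 the log term vanishes: 12 - 158 + 100 - 42
    x = r[mid]
    out[mid] = (12.0 / x**4 - 158.0 / x**2 + 100.0 - 42.0 * x**2
                + 3.0 / x**5 * (7.0 * x**2 + 2.0) * (x**2 - 1.0)**3
                * np.log((x + 1.0) / np.abs(x - 1.0)))
    return out

def p13_fft(k, P):
    """P13 on a log-spaced grid from a single FFT-based discrete convolution."""
    k = np.asarray(k, dtype=float)
    P = np.asarray(P, dtype=float)
    N = k.size
    delta = np.log(k[1] / k[0])
    j = np.arange(-(N - 1), N)        # q = k_n * exp(j * delta) = k_{n+j}
    r = np.exp(j * delta)
    g = r**3 * z_kernel(r)            # dr r^2 Z(r) = dlog(r) r^3 Z(r)
    L = 3 * N - 2                     # full linear-convolution length, no wrap-around
    conv = np.fft.irfft(np.fft.rfft(g[::-1], L) * np.fft.rfft(P, L), L)
    S = conv[N - 1:2 * N - 1]         # S[n] = sum_j g_j P_{n+j}
    return k**3 / (252.0 * (2.0 * np.pi)**2) * P * delta * S

k = np.logspace(-2, 1, 256)
P = k / (1.0 + k**2) ** 2             # toy spectrum, for illustration only
print(p13_fft(k, P)[:3])
```

The substitution here is $r = e^{+s}$ rather than $r = e^{-s}$, which only reverses the order of the kernel array; the convolution structure is the same.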
2.2.3 Regularization
As mentioned above, we need to regularize the divergent portion in P22(k) with
P13(k). In standard calculations in a ΛCDM universe, the suppression of power on
large scales [$P(k) \propto k^n$, $n > -1$] controls this divergence, allowing the numerical evaluation of each term separately. The relevant cancellation will then occur upon
addition of the terms, as long as sufficient numerical precision has been achieved.
However, because the FAST-PT method relies on FFTs, the “true” underlying power
spectrum is log-periodic, leading to non-vanishing power on infinitely large (and small)
scales. These divergences are thus numerically realized and must be analytically removed before evaluation. Physically the divergences are due to the artificial breaking
of local Galilean invariance when the 1-loop SPT power is split into P22(k) and P13(k):
a long-wavelength ($q \ll k$) velocity perturbation displaces small-scale structure without affecting its evolution, but since the perturbative expansion terms $\delta^{(n)}$ are defined
with respect to a stationary background, each term in perturbation theory shows a
divergence even when the physically relevant sum does not. This fact is well-known
in the context of P22 + P13 (Vishniac, 1983) and has been generalized to higher orders
(Jain & Bertschinger, 1996; Scoccimarro & Frieman, 1996).
We construct our regularization scheme so that it preserves the 1-loop contribution to the power spectrum, i.e.

$$
P_{22}(k) + P_{13}(k) = P_{22,{\rm reg}}(k) + P_{13,{\rm reg}}(k) ,
\tag{2.22}
$$

where the subscript "reg" stands for regularization, by subtracting out the contribution to $P_{13}(k)$ from small $q = kr$ in Eq. (2.17), and adding it to the $J_{2,-2,0}(k)$ contribution in $P_{22}(k)$ to obtain a regularized $P_{22,{\rm reg}}(k)$. We first expand the kernel in Eq. (2.17) in a Laurent series around small $r$:

$$
r^2 Z(r) = -168 + \frac{928}{5} r^2 - \frac{4512}{35} r^4 + \frac{416}{21} r^6 + \frac{2656}{1155} r^8 + \ldots .
\tag{2.23}
$$
If $P_{13}(k)$ were dominated by contributions from large-scale modes (i.e. $r \ll 1$), as occurs when there is an infrared divergence, then we could make the replacement $r^2 Z(r) \to -168$ and find that $P_{13}(k)$ approaches

$$
P_{13}(k) \to -\frac{168\, k^3}{252(2\pi)^2} P_{\rm lin}(k) \int_0^\infty dr\, P_{\rm lin}(kr) = -\frac{1}{3} k^2 P_{\rm lin}(k) \int \frac{d^3\mathbf{q}}{(2\pi)^3} \frac{P_{\rm lin}(q)}{q^2} .
\tag{2.24}
$$

We then subtract this off from the kernel $Z(r)$ so that

$$
Z_{\rm reg}(r) = Z(r) + \frac{168}{r^2} = \frac{12}{r^4} + \frac{10}{r^2} + 100 - 42r^2 + \frac{3}{r^5}(7r^2 + 2)(r^2 - 1)^3 \log\left|\frac{r+1}{r-1}\right| .
\tag{2.25}
$$

The regularized version of $P_{13}(k)$ is

$$
\begin{aligned}
P_{13,{\rm reg}}(k) &= \frac{k^3}{252(2\pi)^2} P_{\rm lin}(k) \int_0^\infty dr\, r^2 P_{\rm lin}(kr) Z_{\rm reg}(r) \\
&= \frac{k^3}{252(2\pi)^2} P_{\rm lin}(k) \int_{-\infty}^{\infty} ds\, e^{3s} P_{\rm lin}(e^{\log k + s}) Z_{\rm reg}(e^{s}) ,
\end{aligned}
\tag{2.26}
$$

which can be evaluated numerically in the same manner that was presented in §2.2.2.
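The small-$r$ behaviour used above can be verified numerically. The sketch below evaluates the closed forms of Eqs. (2.18) and (2.25) and compares them with the truncated series of Eq. (2.23); the function names are illustrative only, and the closed form is evaluated directly, which is adequate at the moderate values of $r$ used here but loses precision for very small or very large $r$.

```python
import numpy as np

def z_kernel(r):
    """Closed form of Z(r), Eq. (2.18); not valid exactly at r = 1."""
    log_term = np.log((r + 1.0) / np.abs(r - 1.0))
    return (12.0 / r**4 - 158.0 / r**2 + 100.0 - 42.0 * r**2
            + 3.0 / r**5 * (7.0 * r**2 + 2.0) * (r**2 - 1.0)**3 * log_term)

def z_reg(r):
    """Regularized kernel of Eq. (2.25): the -168/r^2 piece is removed."""
    return z_kernel(r) + 168.0 / r**2

def r2z_series(r):
    """Truncated Laurent series for r^2 Z(r), Eq. (2.23)."""
    return (-168.0 + 928.0 / 5 * r**2 - 4512.0 / 35 * r**4
            + 416.0 / 21 * r**6 + 2656.0 / 1155 * r**8)

for r in (0.05, 0.1, 0.2):
    # first column: series truncation error; second: r^2 Z_reg, which tends to 0
    print(r, r**2 * z_kernel(r) - r2z_series(r), r**2 * z_reg(r))
```

The second printed column shows that $r^2 Z_{\rm reg}(r)$ vanishes as $r \to 0$, which is the property that removes the infrared-sensitive piece from $P_{13,{\rm reg}}$.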
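To regularize $J_{2,-2,0}(k)$ we take the power that we subtracted from $P_{13}(k)$,

$$
\Delta P(k) = P_{13}(k) - P_{13,{\rm reg}}(k) = -\frac{k^2}{3} P_{\rm lin}(k) \int \frac{d^3\mathbf{q}}{(2\pi)^3} \frac{P_{\rm lin}(q)}{q^2} ,
\tag{2.27}
$$

and add it to $J_{2,-2,0}(k)$. To do this, we first take the Fourier transform of Eq. (2.27):

$$
\begin{aligned}
\Delta\xi(r) &= \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} e^{i\mathbf{q}_1\cdot\mathbf{r}} \Delta P(q_1) = -\frac{1}{3} \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} e^{i\mathbf{q}_1\cdot\mathbf{r}} q_1^2 P_{\rm lin}(q_1) \int \frac{d^3\mathbf{q}_2}{(2\pi)^3} \frac{P_{\rm lin}(q_2)}{q_2^2} \\
&= -\frac{1}{12\pi^4} \int_0^\infty dq_1\, q_1^4 P_{\rm lin}(q_1) j_0(q_1 r) \int_0^\infty dq_2\, P_{\rm lin}(q_2) .
\end{aligned}
\tag{2.28}
$$

Since $J_{2,-2,0}(r)$ appears in $\xi_{22}(r)$ with a factor of $\tfrac{1}{3}$ (see Eq. (2.7)), it follows that $3\Delta\xi(r)$ should be added to $J_{2,-2,0}(r)$ if we want to preserve the sum $P_{22}(k) + P_{13}(k)$ in the regularization process. This leads to a regularized $J_{2,-2,0}(r)$:

$$
\begin{aligned}
J_{2,-2,0,{\rm reg}}(r) &= J_{2,-2,0}(r) + 3\Delta\xi(r) \\
&= \frac{1}{4\pi^4} \int_0^\infty dq_1\, q_1^4 P_{\rm lin}(q_1) j_0(q_1 r) \int_0^\infty dq_2\, P_{\rm lin}(q_2) [j_0(q_2 r) - 1] .
\end{aligned}
\tag{2.29}
$$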
The left bracket of Eq. (2.29) (the $q_1$ integral) proceeds in the same manner as presented in §2.2.1. The right bracket (the $q_2$ integral) requires some additional work:

$$
\begin{aligned}
I_{-2,0,{\rm reg}}(r) &= \int_0^\infty dq_2\, P_{\rm lin}(q_2) [j_0(q_2 r) - 1] \\
&= \sum_{n=-N/2}^{N/2} c_n \int_0^\infty dq_2\, q_2^{\nu+i\eta_n} \left[\frac{\sin(q_2 r)}{q_2 r} - 1\right] \\
&= \sum_{n=-N/2}^{N/2} c_n\, r^{-1-\nu-i\eta_n}\, g_n^{\rm reg} ,
\end{aligned}
\tag{2.30}
$$

where the integral may be evaluated by substituting $z = q_2 r$ and finding:

$$
g_n^{\rm reg}(Q_n^{\rm reg}) = \int_0^\infty dz\, z^{\nu+i\eta_n} \left[\frac{\sin z}{z} - 1\right] = \Gamma(Q_n^{\rm reg}) \sin\left(\frac{\pi Q_n^{\rm reg}}{2}\right) = f(Q_n^{\rm reg})
\tag{2.31}
$$

and $Q_n^{\rm reg} = \nu + i\eta_n$.$^6$ The last equality uses Eq. (A.7), and ensures that $g_n^{\rm reg}(Q_n^{\rm reg})$ can be evaluated using the same numerical machinery used for the $J_{\alpha\beta l}(k)$ integrals. The final result for $J_{2,-2,0,{\rm reg}}(k)$ is completely analogous to the method in §2.2.1, with the only exception that $g_n$ is replaced by $g_n^{\rm reg}$ and the factor $2^{Q_h}$ is replaced by $2^{Q_{-2,n}}$ in Eq. (2.15). FAST-PT allows the user to specify which case is desired.

$^6$ This integral is valid for its range of convergence, $-3 < \nu < -1$. A straightforward way to prove this is to insert a factor of $e^{-\epsilon z}$, with $\epsilon$ small and positive, into the integrand; then expanding $\sin z = (e^{iz} - e^{-iz})/(2i)$ leads to a sum of three $\Gamma$-functions, two with $\Gamma(Q_n^{\rm reg})$ and one with $\Gamma(Q_n^{\rm reg} + 1)$. Taking the limit of $\epsilon \to 0^+$ causes the latter to drop out and the remaining two to give Eq. (2.31).
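The closed form in Eq. (2.31) is straightforward to evaluate for complex arguments. The sketch below assumes only that $f(Q) = \Gamma(Q)\sin(\pi Q/2)$, as written in Eq. (2.31); the function name `f_reg` is hypothetical. Two sanity checks are shown: $f(1/2) = \sqrt{\pi/2}$, and the value near $Q = -2$ (the case $\nu = -2$, $\eta_n \to 0$), where the pole of $\Gamma$ is cancelled by the zero of the sine and the product tends to $-\pi/4$, matching $\int_0^\infty dz\, (\sin z - z)/z^3 = -\pi/4$.

```python
import numpy as np
from scipy.special import gamma

def f_reg(Q):
    """f(Q) = Gamma(Q) * sin(pi * Q / 2) for real or complex Q, as in Eq. (2.31)."""
    Q = np.asarray(Q, dtype=complex)
    return gamma(Q) * np.sin(0.5 * np.pi * Q)

print(f_reg(0.5), np.sqrt(np.pi / 2.0))

# Approach Q = -2 along the imaginary direction, as happens for nu = -2, eta_n -> 0.
for eta in (1e-2, 1e-4, 1e-6):
    print(eta, f_reg(-2.0 + 1j * eta), -np.pi / 4.0)
```

Evaluating exactly at $Q = -2$ would hit the pole of the $\Gamma$-function, so an implementation following this route would need to treat the $\eta_n = 0$ term as a limit; how FAST-PT handles this term is not specified in the text above.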
2.3 Performance
We now discuss the results from the FAST-PT algorithm. Unless otherwise noted,
results are based on the input linear power spectrum generated by the Boltzmann
solver CAMB (Lewis et al., 2000), assuming a flat ΛCDM cosmology corresponding
to the recent Planck results (Planck Collaboration et al., 2016b). Timing results were
obtained on a MacBook Pro Retina laptop computer, with a 2.5 GHz Intel Core i5
processor and running OS X version 10.10.3. We used Python version 2.7.10, numpy
1.8.2, and scipy 0.15.1.
2.3.1 1-loop Results
To test our method we evaluated the 1-loop SPT correction to the power spectrum,
$P_{22}(k) + P_{13}(k)$. We sample the power spectrum at 3000 $k$-points from $\log_{10} k_{\min} = -4$ to $\log_{10} k_{\max} = 2$, and we additionally pad our input signal with 500 zeros at both ends of the array. A typical run for a sample of this size takes FAST-PT a total time of $\sim 0.02$ seconds on a laptop. We recommend that FAST-PT users sample the input power spectrum on a grid larger than desired and then trim the output to the desired range to avoid wrapping effects. We take this approach and present our results on a grid from $k_{\min} = 0.003$ to $k_{\max} = 50$. The top panel of Fig. 2.2 plots our FAST-PT results, while the bottom panel plots the ratio of our FAST-PT calculations to a conventional method.$^7$ We observe that the FAST-PT 1-loop power spectrum agrees
with the conventional method to high precision. The noise observed in the bottom
panel of Fig. 2.2 is due to noise in the input power spectrum from CAMB; any
integration method must interpolate this noise, and this results in noise in the output
spectrum P22 + P13 which differs depending on the method. At high k, the noise
in P22 + P13 is larger than (and of opposite sign to) the noise in Plin(k), which is a phenomenon common to diffusion problems and is the correct mathematical solution to SPT, where re-normalization or re-summation techniques are not used (see §2.3.2). The sharp spike around k = 0.1 h/Mpc is due to the zero crossing of the 1-loop power spectrum, where ratios of corrections suffer from a “0/0” ambiguity. We conclude that differences between FAST-PT results and those from our conventional method are negligible on the scales of interest.
Fig. 2.3 plots estimated run time versus grid size. A solid black line in the left panel plots the average recurring time (i.e. the time of execution after initialization of the FAST-PT class) for 1500 runs. The grey band covers the area enclosed by $\pm 1$ standard deviation. The right panel plots the average initialization time for 1500 runs, i.e. the time to initialize the FAST-PT Python class and evaluate all functions that only depend on grid size (for example $g_{\alpha n}$). The total time for a single one-loop evaluation is the sum of the black lines in the right and left panels. Run time can vary across machines, so Fig. 2.3 serves only as an estimate.
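A timing measurement of the kind summarized in Fig. 2.3 (mean and standard deviation over repeated runs) could be set up as in the following sketch. The helper `time_runs` and the FFT workload are hypothetical stand-ins; the FAST-PT calls that were actually timed are not reproduced here.

```python
import time
import numpy as np

def time_runs(func, n_runs=1500):
    """Return (mean, standard deviation) of the wall-clock time of func() over n_runs calls."""
    samples = np.empty(n_runs)
    for i in range(n_runs):
        start = time.perf_counter()
        func()
        samples[i] = time.perf_counter() - start
    return samples.mean(), samples.std()

# Stand-in workload: one real FFT of a 3000-point array.
x = np.random.default_rng(0).normal(size=3000)
mean, std = time_runs(lambda: np.fft.rfft(x), n_runs=200)
print(f"{mean:.2e} +/- {std:.2e} s")
```

Separating the one-time setup from the recurring call, as done in the two panels of Fig. 2.3, amounts to timing the constructor and the evaluation function with two separate calls to such a helper.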
$^7$ The “conventional” method is a fixed-grid 2D integration code. Here $P_{22}$ was computed by putting $\mathbf{k}$ on the $z$-axis and writing $\mathbf{q}$ in cylindrical coordinates. The azimuthal integral is trivial. We sample the integrand logarithmically in the radial direction $q_\perp$, and stretch the vertical direction according to $q_z/k = 1 + \sinh(20\upsilon)/[2\sinh(20)]$, with $\upsilon > -1$. This samples half of space (so the result must be doubled) and, by uniformly sampling in $\upsilon$, it places higher resolution near $\mathbf{q} \approx \mathbf{k}$, which is important to correctly sample the contribution to $P_{22}$ from advection by very long-wavelength modes. The $P_{13}$ integral was log-sampled in $r$.
[Figure 2.2 plot. Top panel: $P_{22}(k) + P_{13}(k)$ in $[{\rm Mpc}/h]^3$ versus $k$ [$h$/Mpc]. Bottom panel: ratio to conventional method versus $k$ [$h$/Mpc].]
Figure 2.2: FAST-PT 1-loop power spectrum results versus those computed using a conventional fixed-grid method. The top panel shows FAST-PT results for P22(k) + P13(k) (the dashed line is for negative values). The bottom panel plots the ratio between FAST-PT and the conventional method.
[Figure 2.3 plot. Left panel: average execution time. Right panel: average time to initialize. Both panels show time [seconds] versus number of grid points.]
Figure 2.3: Estimated scaling of FAST-PT execution time with the number of grid points. The left panel plots the average one-loop evaluation time, after initialization of the FAST-PT class. The right panel plots the average time required for initialization of the FAST-PT class. Averages are taken over 1500 runs; for each grid size, the error is computed by taking the standard deviation of the 1500 runs.
2.3.2 Renormalization Group Flow
The renormalization group (RGPT) method of McDonald (2007, 2014) provides a
more accurate model for the power spectrum than SPT (Carlson et al., 2009; Widrow
et al., 2009; Orban & Weinberg, 2011), providing significant improvement to both the
structure of the BAO feature and the broadband power at smaller scales (higher k).
The RG evolution equation is
$$
\frac{dP(k, \lambda)}{d\lambda} = G_2[P(k, \lambda), P(k, \lambda)] ,
\tag{2.32}
$$
where G2[P,P ] is the standard 1-loop correction to the power spectrum, i.e. P22(k) +
P13(k) with the caveat that the input power spectrum need not be the linear power
spectrum. The parameter λ is a “coupling” strength parameter proportional to the
growth factor squared. One can imagine that Eq. (2.32) represents a time-evolution
of the power spectrum (in an Einstein-de Sitter universe) starting at P (k, λ = 0) =
Plin(k), moving forward in time by a small step, using perturbation theory to up-
date the power spectrum, and then using the updated power spectrum as the initial
condition for the next step, iterating until one reaches λ = 1.
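The stepping procedure just described can be sketched as a small ODE integrator. This is an illustration under stated assumptions rather than the routine used for the results below (that routine is described in Appendix A.4 and is not reproduced here): the fixed-step fourth-order Runge-Kutta scheme is a choice made for this sketch, and `one_loop` is a hypothetical callable standing in for $G_2[P, P]$. To keep the example self-contained, it is exercised on a toy right-hand side, $G_2 \to -P^2$, whose exact solution is $P(\lambda) = P_0/(1 + P_0\lambda)$.

```python
import numpy as np

def rg_flow(P_lin, one_loop, n_steps=100):
    """Integrate dP/dlambda = one_loop(P) from lambda = 0 to 1 with fixed RK4 steps."""
    P = np.array(P_lin, dtype=float)
    d_lam = 1.0 / n_steps
    for _ in range(n_steps):
        k1 = one_loop(P)
        k2 = one_loop(P + 0.5 * d_lam * k1)
        k3 = one_loop(P + 0.5 * d_lam * k2)
        k4 = one_loop(P + d_lam * k3)
        P = P + d_lam * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return P

# Toy right-hand side (NOT the 1-loop SPT kernel): dP/dlambda = -P^2.
P0 = np.array([0.5, 1.0, 2.0])
P1 = rg_flow(P0, lambda P: -P**2)
print(P1, P0 / (1.0 + P0))  # numerical result next to the exact toy solution
```

With the real 1-loop kernel the equation is stiff, so the step count that is adequate for this toy problem should not be taken as guidance for an actual RG run.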
However, despite the potential advantages of the RG approach, it can be quite
numerically intensive. Eq. (2.32) is a stiff equation and becomes unstable when the
integration step size is too large, and it requires an evaluation of the 1-loop SPT
kernel at every step. Conventional computational methods are thus extremely time
consuming. The speed of FAST-PT makes this calculation significantly more feasible.
We have compared our RG flow results with those obtained from the Copter code (Carlson et al., 2009; Carlson, 2013), a publicly available code written in C++. We have found that for RG flow our code can obtain results in substantially less time. For instance, on a 200-point grid, from $k_{\min} = 0.01$ to $k_{\max} = 10$, our FAST-PT RG flow results take $\sim 5$ seconds, while Copter RG flow results take over 5 minutes. In Appendix A.4 we explain our integration routine, as well as document RG-flow run times for various grid sizes. A FAST-PT user must consider the stiff nature of Eq. (2.32) when choosing a step size for the integration; we recommend that they consult Appendix A.4.
The left hand panel of Fig. 2.4 shows our renormalization group and SPT results compared to linear theory. In our analysis we performed two renormalization group runs: one to $k_{\max} = 5\,h\,{\rm Mpc}^{-1}$ and another to $k_{\max} = 50\,h\,{\rm Mpc}^{-1}$. Our results are consistent with the plots found in McDonald (2007) (note that in our runs we include the BAO feature). The right hand panel of Fig. 2.4 plots the effective power law index as a function of $k$, $n_{\rm eff} = d\log P/d\log k$. Here we see two characteristic features of RG evolution: the damping of the BAO and $n_{\rm eff}$ approaching a fixed point value of $\sim -1.4$.

Figure 2.5 shows the effect of a boundary condition within our numerical algorithm. We integrate Eq. (2.32) for some $k_{\min}$ and $k_{\max}$. The $k_{\max}$ boundary does not allow for power to continuously flow from larger to smaller scales, as would occur for infinite boundary conditions. As a result power builds up at high $k$, causing the plateau observed in the left panel of Fig. 2.5. The right panel of Fig. 2.5 shows $n_{\rm eff}$. We do see that before the onset of the plateau $n_{\rm eff}$ does approach $\sim -1.4$, and this designates a region where RG results at finite $k_{\max}$ reproduce the asymptotic behavior as $k_{\max} \to \infty$.

To assess the accuracy of the RG method in the weakly non-linear regime, we also plot results from the FrankenEmu emulator (in Fig. 2.5), which is based on the Coyote Universe simulations (Heitmann et al., 2014). In the vicinity of $k \sim 0.1$, it is observed that RG methods better follow the fully non-linear results of the Coyote Universe.
Figure 2.5 also shows another interesting feature, the removal of noise in the
RG-framework. As mentioned earlier, the linear power spectrum generated by CAMB contains low-level noise. This noise is most easily visualized through a derivative, for instance $n_{\rm eff}$. One can see that $n_{\rm eff}$ for the linear power spectrum in Fig. 2.5 is noisy, particularly at large $k$. Under the RG evolution this noise is washed away, as seen in the RG $n_{\rm eff}$ results. This is a result of the fact that noise in $P_{\rm lin}(k)$ results in “negative” noise features in $P_{13}(k)$. Under the RG flow, this feature causes noise initially present to be smeared away in the nonlinear regime. This is also what happens in the real universe, since features in the power spectrum at small $\Delta k$ correspond to correlations at large real-space scales $\sim 2\pi/\Delta k$, which are smeared out by advection; this effect is responsible for the familiar BAO peak smearing (Seo & Eisenstein, 2007).
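The effective index $n_{\rm eff} = d\log P/d\log k$ used in Figs. 2.4 and 2.5 can be estimated from tabulated data by finite differences in log space. The sketch below uses `numpy.gradient` for this; the function name `n_eff` and the toy power-law input are illustrative assumptions, and the differencing scheme used for the figures is not stated in the text.

```python
import numpy as np

def n_eff(k, P):
    """Finite-difference estimate of d log P / d log k on an arbitrary k grid (P > 0)."""
    return np.gradient(np.log(P), np.log(k))

k = np.logspace(-2, 1, 50)
P = 3.0 * k**-1.4           # pure power law: the index is -1.4 everywhere
print(n_eff(k, P)[:5])
```

Because differentiation amplifies point-to-point scatter, low-level noise in a tabulated spectrum shows up far more clearly in such an estimate than in the spectrum itself, which is why $n_{\rm eff}$ serves as a noise diagnostic above.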
2.4 Summary
In this paper we have introduced FAST-PT, an algorithm (and associated public code) that quickly evaluates convolution integrals in cosmological perturbation theory.
The code is modular and written in a high-level language (Python), and it is extremely fast due to algorithmic improvements. The keys to the method are locality (expressing the Fourier-space mode coupling integrals $J_{\alpha\beta l}$ as a product of correlation functions in configuration space); scale independence of the physics of gravity (hence the utility of a power-law decomposition for the power spectrum); and the FFT (which enables log-spaced data to be converted into a superposition of power laws and vice versa).
The recurring cost of the 1-loop SPT calculations is presented in Fig. 2.3; for a linear power spectrum sampled on a 3000-point grid, one can expect to obtain results in $\sim 0.01$ seconds. The time for RG results is tabulated in Tables A.1 and A.2. For a linear power spectrum sampled on a 500-point grid from $k_{\min} = 0.001$ to $k_{\max} = 10$, RG results are obtained in a few seconds.

Figure 2.4: FAST-PT renormalization group results for $k_{\max} = \{5, 50\}\,h\,{\rm Mpc}^{-1}$. Left panel shows renormalization group results and SPT results compared to the linear power spectrum (see legend in right panel). Right panel shows $n_{\rm eff} = d\log P/d\log k$ for renormalization group, SPT, and linear theory.

[Figure 2.5 plot. Left panel: $P(k)$ versus $k$ [$h$/Mpc]. Right panel: $n_{\rm eff}(k)$ versus $k$ [$h$/Mpc]. Legend: Renormalization Group, Coyote Universe, one-loop, Linear.]

Figure 2.5: Renormalization group results compared to standard 1-loop calculations and those taken from the Coyote Universe. Left panel plots power spectra. A plateau at high $k$ develops due to boundary conditions. Right panel shows $n_{\rm eff}(k) = d\log P/d\log k$.
We have demonstrated FAST-PT in the context of 1-loop SPT and the RG flow.
However, similar convolution integrals appear in numerous contexts, making both the
conceptual improvements behind FAST-PT and the code itself an efficient and flexible
tool for the community. For instance, the structure of the 1-loop SPT calculation
contains all elements necessary for nonlinear biasing in the galaxy-galaxy and galaxy-
matter power spectra. Furthermore, in follow-up work we are extending the technique
to other problems in cosmological perturbation theory. In Fang et al. (2017), we
generalize FAST-PT to “tensor” quantities (broadly defined as those with explicit
dependence on the line-of-sight), including those relevant to the intrinsic alignments of
galaxies (e.g., Hirata & Seljak, 2004; Blazek et al., 2015), redshift space distortions,
and CMB anisotropies. We are additionally exploring further applications of the
FAST-PT framework. For instance, when the evolution of fluctuation modes is given
by a scale-dependent propagator, the time- and scale-dependence of each mode can
no longer be separated (Audren & Lesgourgues, 2011). Such a scenario arises in
the presence of massive neutrinos, where growth of structure is suppressed on small
scales due to free-streaming (Saito et al., 2008, 2009; Blas et al., 2014). Solving
for nonlinear evolution in such a scenario can be done using a time-flow approach
(Pietroni, 2008), requiring many evaluations of mode-coupling integrals. It is similar
to the RG flow described above, but with additional complications, particularly due
to the scale dependence of the propagators (note that our Jαβl integrals include only
power-law dependences on the magnitudes of q1 and q2), and the fact that the Green’s function solution for the bispectrum (needed to reduce the power spectrum solution to a mode-coupling integral) involves products of power spectra at unequal times. We are investigating the extent to which these issues can be treated in FAST-PT. Additionally we are exploring the applicability of the FAST-PT method to 2-loop calculations. Fast methods to compute the power spectrum past 1-loop order already exist (Crocce et al.,
2012; Taruya et al., 2012). These methods rely on multi-point propagator techniques.
We are working to determine whether FAST-PT-like algorithms can be extended to the 2-loop convolution integrals with computation time comparable to that obtained here for the 1-loop case.
The value of FAST-PT lies in its short execution time and the general applicability of these mode coupling integrals to cosmological observables. Additionally the modular structure of FAST-PT makes it easily integrable into cosmological analysis projects, for example those found in Eifler et al. (2014); Zuntz et al. (2015); Krause & Eifler (2017).
Our Python code is publicly available at https://github.com/JoeMcEwen/FAST-PT and includes a user manual. We also provide Python scripts to reproduce 1-loop power spectrum, galaxy bias power spectrum, renormalization group results, and animations for renormalization group results.
Chapter 3: FAST-PT II: an algorithm to calculate convolution integrals of general tensor quantities in cosmological perturbation theory
In this chapter I will present the full content of our second FAST-PT paper, Fang et al. (2017). The original abstract is given below:
• Cosmological perturbation theory is a powerful tool to predict the statistics of large-scale structure in the weakly non-linear regime, but even at 1-loop order it results in computationally expensive mode-coupling integrals. Here we present a fast algorithm for computing 1-loop power spectra of quantities that depend on the observer’s orientation, thereby generalizing the FAST-PT framework (McEwen et al., 2016) that was originally developed for scalars such as the matter density. This algorithm works for an arbitrary input power spectrum and substantially reduces the time required for numerical evaluation. We apply the algorithm to four examples: intrinsic alignments of galaxies in the tidal torque model; the Ostriker-Vishniac effect; the secondary CMB polarization due to baryon flows; and the 1-loop matter power spectrum in redshift space. Code implementing this algorithm and these applications is publicly available at https://github.com/JoeMcEwen/FAST-PT.
The original authors are: X. Fang, J. Blazek, J. McEwen, C. Hirata.
3.1 Introduction
Observational cosmology has entered a new era of precision measurement. Current and upcoming surveys (Levi et al., 2013; Dawson et al., 2013; Laureijs et al., 2011;
Spergel et al., 2013; Dark Energy Survey Collaboration et al., 2016) are enabling us to probe large-scale structure in more detail and over larger volumes, and hence to better constrain the underlying cosmological model. A parallel effort is underway to understand the astrophysical effects that are both signals and contaminants in these measurements. For example, weak gravitational lensing has become a powerful and direct probe of the dark matter distribution (Bartelmann & Schneider, 2001; Mellier, 1999), but it also suffers from systematic uncertainties, such as galaxy intrinsic alignments (IA), which must be mitigated in order to make use of high-precision measurements. Similarly, connecting observable tracers (e.g., in spectroscopic surveys) with the underlying dark matter requires a description of the bias relationship (Seljak et al., 2005; McDonald, 2006; McDonald & Roy, 2009; Baldauf et al., 2011; Seljak, 2012) and the effect of redshift-space distortions (RSDs, Kaiser, 1987; Scoccimarro, 2004; Taruya et al., 2010). Developments in CMB measurements provide another illustration, as the range of observables has expanded from early initial detections of temperature anisotropies by COBE (Strukov et al., 1992; Smoot et al., 1992; Kovac et al., 2002; Readhead et al., 2004; Bennett et al., 2013; Crites et al., 2015; Naess et al., 2014; Ade et al., 2014; BICEP2/Keck Collaboration et al., 2015). Current and future measurements (Planck Collaboration et al., 2016a; Kogut et al., 2011; Bock et al., 2009; Lazear et al., 2014; PRISM Collaboration et al., 2013; André et al., 2014) will be able to investigate more subtle effects, such as the kinetic Sunyaev-Zel’dovich (kSZ, Sunyaev & Zeldovich, 1972; Carlstrom et al., 2002) effect and CMB spectral distortions (Chluba & Sunyaev, 2012; Khatri & Sunyaev, 2012).
While modern cosmology has advanced significantly using our understanding from
linear perturbation theory, nonlinear contributions become significant at late times
and at smaller scales. In the quasi-linear regime, many relevant cosmological observ-
ables are usefully described using perturbation theory at higher order. Significant
effort has been devoted to understanding structure formation via a range of perturbative techniques (e.g., Bernardeau et al., 2002; Sugiyama, 2014; Crocce & Scoccimarro, 2006; McDonald, 2007, 2014; Audren & Lesgourgues, 2011; Baumann et al.,
2012; Carrasco et al., 2012; Pajer & Zaldarriaga, 2013; Hertzberg, 2014; Blas et al.,
2016). In this work, we consider integrals in standard perturbation theory (SPT), although the methods and code we develop have a broader range of applications.
The next-to-leading-order (“1-loop”) corrections in these perturbative expansions are typically expressed as two-dimensional mode-coupling convolution integrals, which are generically time consuming to evaluate numerically. Recent algorithmic developments have dramatically sped up these computations for scalar quantities, i.e. those
with no dependence on the direction of the observer, such as the matter density or
real-space galaxy density. The new algorithms (McEwen et al., 2016; Schmittfull
et al., 2016) take advantage of the locality of evolution in perturbation theory, the
scale invariance of cold dark matter (CDM) structure formation, and the Fast Fourier
Transform (FFT); and work is underway to apply them to 2-loop power spectra as
well (Schmittfull & Vlah, 2016). In a previous paper, we introduced the FAST-PT
implementation of these methods in Python (McEwen et al., 2016).
However, there are many interesting 1-loop convolution integrals for tensor quantities, i.e. those with explicit dependence on the observer line of sight, such as those arising for redshift-space distortions. In this case, we need convolution integrals with “tensor” kernels:$^8$

$$
I(k) = \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} K(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{q}}_2,\, \hat{\mathbf{q}}_1\cdot\hat{\mathbf{k}},\, \hat{\mathbf{q}}_2\cdot\hat{\mathbf{k}},\, q_1, q_2)\, P(q_1) P(q_2) ,
\tag{3.1}
$$

where $K(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{q}}_2, \hat{\mathbf{q}}_1\cdot\hat{\mathbf{k}}, \hat{\mathbf{q}}_2\cdot\hat{\mathbf{k}}, q_1, q_2)$ is a tensor mode-coupling kernel, $\mathbf{k} = \mathbf{q}_1 + \mathbf{q}_2$, $k = |\mathbf{k}|$, and $P(q)$ is the input signal (typically the linear matter power spectrum) logarithmically sampled in $q$. Due to the dependence on the direction of $\mathbf{k}$, the decomposition of these kernels is more complicated than in the scalar case. In this work, we generalize our FAST-PT algorithm to evaluate these tensor convolution integrals, achieving $\mathcal{O}(N \log N)$ performance as in the scalar case.

This paper is organized as follows: in §3.2 we provide the mathematical basis for our method (§3.2.1), introduce our algorithm (§3.2.2), and discuss divergences that may arise and how they are resolved (§3.2.3). In §3.3 we apply our method to several examples: the quadratic intrinsic alignment model (§3.3.1); the Ostriker-Vishniac effect (§3.3.2); the kinetic polarization of CMB (§3.3.3); and the 1-loop redshift-space power spectrum (§3.3.4). Section 3.4 summarizes the results. An appendix contains derivations of the relevant mathematical identities. The Python
code implementing this algorithm and the examples presented in this paper is publicly
available at https://github.com/JoeMcEwen/FAST-PT.
8The kernel K can be expressed as a sum of polynomials in the relevant dot products. “Tensor” refers to the general transformation properties of the cosmological quantities being considered under a symmetry operation – in this case, rotations in SO(3). For instance, the momentum density is a rank 1 tensor (a vector) while the IA field is a rank 2 tensor. The scalar case (rank 0) considered in McEwen et al. (2016) is thus a specific application of this more general framework.
3.2 Method
In this section we extend the FAST-PT framework to include the computation of convolution integrals with tensor kernels in the form of Eq. (3.1).
Our approach is similar to the scalar version of FAST-PT. We first expand the kernel into several Legendre polynomial products – the explicit dependence on the direction kˆ requires an expansion in three angles rather than one (as shown in Eq. 3.2 and
3.3). Second, products of Legendre polynomials are written in spherical harmonics using the addition theorem, where the required combinations of spherical harmonics are constrained by Wigner 3j symbols and preserve angular momentum (as in Eq.
3.4). Third, in configuration space, the integral of each term in the expansion can be further transformed into a product of several one-dimensional integrals (as in Eq.
3.15 and 3.16), which can be quickly performed by assuming a (biased) log-periodic power spectrum and employing FFTs (as in Eq. 3.19 and 3.23).
We will first provide the theory in §3.2.1 and then briefly introduce our algorithm in §3.2.2. Finally, in §3.2.3 we will discuss physical divergence problems that can arise and the way to solve them through the choice of appropriate biasing of the log-periodic power spectrum.
3.2.1 Transformation To 1D Integrals
In general, the kernel function K can be decomposed as a summation of terms
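$$K(\hat{q}_1\cdot\hat{q}_2,\ \hat{q}_1\cdot\hat{k},\ \hat{q}_2\cdot\hat{k},\ q_1, q_2) = \sum_{\ell_1,\ell_2,\ell,\alpha,\beta} A^{\alpha\beta}_{\ell_1\ell_2\ell}\, \mathcal{P}_{\ell}(\hat{q}_1\cdot\hat{q}_2)\, \mathcal{P}_{\ell_1}(\hat{k}\cdot\hat{q}_2)\, \mathcal{P}_{\ell_2}(\hat{k}\cdot\hat{q}_1)\, q_1^{\alpha} q_2^{\beta} , \tag{3.2}$$

where $\mathcal{P}_\ell$ are the Legendre polynomials, and the coefficients $A^{\alpha\beta}_{\ell_1\ell_2\ell}$ specify the components of a particular kernel. For general angular dependences the sum may require an infinite number of terms. However, the kernels that appear in CDM perturbation theory and galaxy biasing theory are composed of a finite number of terms in a polynomial expansion. This decomposition leads us to consider integrals of the form

$$f(k) = \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, \mathcal{P}_{\ell}(\hat{q}_1\cdot\hat{q}_2)\, \mathcal{P}_{\ell_1}(\hat{k}\cdot\hat{q}_2)\, \mathcal{P}_{\ell_2}(\hat{k}\cdot\hat{q}_1)\, q_1^{\alpha} q_2^{\beta}\, P(q_1) P(q_2) . \tag{3.3}$$

The product of Legendre polynomials can be decomposed into spherical harmonics by the addition theorem. Using the result presented in Appendix B.2.1, we can write the product of three Legendre polynomials in terms of spherical harmonics and Wigner 3j symbols:

$$\mathcal{P}_{\ell}(\hat{q}_1\cdot\hat{q}_2)\, \mathcal{P}_{\ell_2}(\hat{q}_1\cdot\hat{k})\, \mathcal{P}_{\ell_1}(\hat{q}_2\cdot\hat{k}) = \sum_{J_1,J_2,J_k} C^{J_1J_2J_k}_{\ell_1\ell_2\ell} \sum_{M_1,M_2,M_k} \begin{pmatrix} J_1 & J_2 & J_k \\ M_1 & M_2 & M_k \end{pmatrix} Y_{J_1M_1}(\hat{q}_1)\, Y_{J_2M_2}(\hat{q}_2)\, Y_{J_kM_k}(\hat{k}) , \tag{3.4}$$

with coefficients given by

$$C^{J_1J_2J_k}_{\ell_1\ell_2\ell} = (4\pi)^{3/2} (-1)^{\ell_1+\ell_2+\ell} \sqrt{(2J_1+1)(2J_2+1)(2J_k+1)} \times \begin{pmatrix} J_1 & \ell_2 & \ell \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \ell_1 & J_2 & \ell \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \ell_1 & \ell_2 & J_k \\ 0 & 0 & 0 \end{pmatrix} \begin{Bmatrix} J_1 & J_2 & J_k \\ \ell_1 & \ell_2 & \ell \end{Bmatrix} , \tag{3.5}$$

where we have used the 3j and 6j symbols, denoted by $(\,)$ and $\{\,\}$, respectively. The integers $M_1, M_2, M_k$ satisfy the selection rule $M_1 + M_2 + M_k = 0$. The coefficients $C^{J_1J_2J_k}_{\ell_1\ell_2\ell}$ map the product of spherical harmonics in Eq. (3.4), written in terms of the $J_1, J_2, J_k$ basis, to the original $\ell_1, \ell_2, \ell$ basis of Legendre polynomials. Upon replacing the product of Legendre polynomials in Eq. (3.3) with Eq. (3.4) (omitting the coefficients $C^{J_1J_2J_k}_{\ell_1\ell_2\ell}$), we arrive at an integral over the product of three spherical harmonics, which we will denote as $I^{\alpha\beta}_{J_1J_2J_k}(k)$. For each combination of $J_1, J_2, J_k$, we have

$$I^{\alpha\beta}_{J_1J_2J_k}(k) = \sum_{M_1M_2M_k} \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, P(q_1) P(q_2)\, Y_{J_1M_1}(\hat{q}_1)\, Y_{J_2M_2}(\hat{q}_2)\, Y_{J_kM_k}(\hat{k})\, q_1^{\alpha} q_2^{\beta} \begin{pmatrix} J_1 & J_2 & J_k \\ M_1 & M_2 & M_k \end{pmatrix} \equiv \sum_{M_k} Y_{J_kM_k}(\hat{k})\, T^{\alpha\beta}_{J_1J_2J_kM_k}(\mathbf{k}) , \tag{3.6}$$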
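where we have defined

$$T^{\alpha\beta}_{J_1J_2J_kM_k}(\mathbf{k}) \equiv \sum_{M_1M_2} \begin{pmatrix} J_1 & J_2 & J_k \\ M_1 & M_2 & M_k \end{pmatrix} H^{\alpha\beta}_{J_1M_1J_2M_2}(\mathbf{k}) \quad \text{and} \tag{3.7}$$

$$H^{\alpha\beta}_{J_1M_1J_2M_2}(\mathbf{k}) \equiv \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, P(q_1) P(q_2)\, Y_{J_1M_1}(\hat{q}_1)\, Y_{J_2M_2}(\hat{q}_2)\, q_1^{\alpha} q_2^{\beta} . \tag{3.8}$$

We can separate $H^{\alpha\beta}_{J_1M_1J_2M_2}(\mathbf{k})$ into a product of two integrals, respectively over $q_1$ and $q_2$, by Fourier transforming to configuration space

$$H^{\alpha\beta}_{J_1M_1J_2M_2}(\mathbf{r}) = \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} \frac{d^3\mathbf{q}_2}{(2\pi)^3}\, e^{i(\mathbf{q}_1+\mathbf{q}_2)\cdot\mathbf{r}}\, q_1^{\alpha} q_2^{\beta}\, P(q_1) P(q_2)\, Y_{J_1M_1}(\hat{q}_1)\, Y_{J_2M_2}(\hat{q}_2) = \bar{H}^{\alpha\beta}_{J_1J_2}(r)\, Y_{J_1M_1}(\hat{r})\, Y_{J_2M_2}(\hat{r}) , \tag{3.9}$$

where we have used the plane wave expansion (Eq. B.5) together with orthogonality relations (Eq. B.3) to arrive at the equality. We have also defined

$$\bar{H}^{\alpha\beta}_{J_1J_2}(r) \equiv \frac{(4\pi)^2 i^{J_1+J_2}}{(2\pi)^6} \int_0^\infty dq_1\, q_1^{2+\alpha} P(q_1)\, j_{J_1}(q_1 r) \int_0^\infty dq_2\, q_2^{2+\beta} P(q_2)\, j_{J_2}(q_2 r) , \tag{3.10}$$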
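where $j_J(qr)$ are the spherical Bessel functions. Substituting Eq. (3.9) into the definition of $T^{\alpha\beta}_{J_1J_2J_kM_k}$ we obtain

$$\begin{aligned} T^{\alpha\beta}_{J_1J_2J_kM_k}(\mathbf{r}) &= \sum_{M_1M_2} \begin{pmatrix} J_1 & J_2 & J_k \\ M_1 & M_2 & M_k \end{pmatrix} H^{\alpha\beta}_{J_1M_1J_2M_2}(\mathbf{r}) \\ &= \bar{H}^{\alpha\beta}_{J_1J_2}(r) \sum_{M_1M_2} \begin{pmatrix} J_1 & J_2 & J_k \\ M_1 & M_2 & M_k \end{pmatrix} Y_{J_1M_1}(\hat{r})\, Y_{J_2M_2}(\hat{r}) \\ &= \bar{H}^{\alpha\beta}_{J_1J_2}(r)\, a_{J_1J_2J_k}\, Y^*_{J_kM_k}(\hat{r}) , \end{aligned} \tag{3.11}$$

where

$$a_{J_1J_2J_k} \equiv \sqrt{\frac{(2J_1+1)(2J_2+1)}{4\pi(2J_k+1)}} \begin{pmatrix} J_1 & J_2 & J_k \\ 0 & 0 & 0 \end{pmatrix} . \tag{3.12}$$

The derivation of Eqs. (3.11) and (3.12) is provided in Appendix B.2.2. Fourier transforming back to k-space, we obtain

$$\begin{aligned} T^{\alpha\beta}_{J_1J_2J_kM_k}(\mathbf{k}) &= \int d^3\mathbf{r}\, T^{\alpha\beta}_{J_1J_2J_kM_k}(\mathbf{r})\, e^{-i\mathbf{k}\cdot\mathbf{r}} \\ &= a_{J_1J_2J_k} \int r^2 dr\, \bar{H}^{\alpha\beta}_{J_1J_2}(r) \int d^2\hat{r}\, Y^*_{J_kM_k}(\hat{r})\, e^{-i\mathbf{k}\cdot\mathbf{r}} \\ &= a_{J_1J_2J_k} \int r^2 dr\, \bar{H}^{\alpha\beta}_{J_1J_2}(r) \int d^2\hat{r}\, Y^*_{J_kM_k}(\hat{r})\, 4\pi \sum_{\ell'm'} (-i)^{\ell'} j_{\ell'}(kr)\, Y^*_{\ell'm'}(\hat{k})\, Y_{\ell'm'}(\hat{r}) \\ &= a_{J_1J_2J_k} \int r^2 dr\, \bar{H}^{\alpha\beta}_{J_1J_2}(r)\, 4\pi \sum_{\ell'm'} (-i)^{\ell'} j_{\ell'}(kr)\, Y^*_{\ell'm'}(\hat{k})\, \delta_{\ell'J_k} \delta_{m'M_k} \\ &= 4\pi (-i)^{J_k} a_{J_1J_2J_k} \int r^2 dr\, \bar{H}^{\alpha\beta}_{J_1J_2}(r)\, j_{J_k}(kr)\, Y^*_{J_kM_k}(\hat{k}) , \end{aligned} \tag{3.13}$$

where in the third equality we have used the plane wave expansion (Eq. B.5), and in the fourth equality used the orthogonality relation between spherical harmonics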
(Eq. B.3). Combining the results from Eq. (3.10), (3.13), (3.12), we arrive at
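$$\begin{aligned} I^{\alpha\beta}_{J_1J_2J_k}(k) &= 4\pi (-i)^{J_k} a_{J_1J_2J_k} \int r^2 dr\, \bar{H}^{\alpha\beta}_{J_1J_2}(r)\, j_{J_k}(kr) \sum_{M_k} Y_{J_kM_k}(\hat{k})\, Y^*_{J_kM_k}(\hat{k}) \\ &= (-i)^{J_k} (2J_k+1)\, a_{J_1J_2J_k} \int r^2 dr\, \bar{H}^{\alpha\beta}_{J_1J_2}(r)\, j_{J_k}(kr) \\ &= (-1)^{J_k+(J_1+J_2+J_k)/2} \sqrt{\frac{(2J_1+1)(2J_2+1)(2J_k+1)}{64\pi^9}} \begin{pmatrix} J_1 & J_2 & J_k \\ 0 & 0 & 0 \end{pmatrix} \times \int r^2 dr\, J^{\alpha\beta}_{J_1J_2}(r)\, j_{J_k}(kr) , \end{aligned} \tag{3.14}$$

where $J_1 + J_2 + J_k$ must be even for the 3j symbol to be non-zero, and $J^{\alpha\beta}_{J_1J_2}(r)$ is defined by

$$J^{\alpha\beta}_{J_1J_2}(r) \equiv \int_0^\infty dq_1\, q_1^{2+\alpha} P(q_1)\, j_{J_1}(q_1 r) \int_0^\infty dq_2\, q_2^{2+\beta} P(q_2)\, j_{J_2}(q_2 r) . \tag{3.15}$$

Combining Eq. (3.14) and (3.4) we can rewrite the integral (3.3) as

$$\int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, \mathcal{P}_{\ell}(\hat{q}_1\cdot\hat{q}_2)\, \mathcal{P}_{\ell_1}(\hat{k}\cdot\hat{q}_2)\, \mathcal{P}_{\ell_2}(\hat{k}\cdot\hat{q}_1)\, q_1^{\alpha} q_2^{\beta}\, P(q_1) P(q_2) = \sum_{J_1,J_2,J_k} C^{J_1J_2J_k}_{\ell_1\ell_2\ell}\, I^{\alpha\beta}_{J_1J_2J_k}(k) = \sum_{J_1,J_2,J_k} B^{J_1J_2J_k}_{\ell_1\ell_2\ell} \int r^2 dr\, J^{\alpha\beta}_{J_1J_2}(r)\, j_{J_k}(kr) , \tag{3.16}$$

where the coefficients $B^{J_1J_2J_k}_{\ell_1\ell_2\ell}$ are given by

$$\begin{aligned} B^{J_1J_2J_k}_{\ell_1\ell_2\ell} &\equiv C^{J_1J_2J_k}_{\ell_1\ell_2\ell}\, (-1)^{J_k+(J_1+J_2+J_k)/2} \sqrt{\frac{(2J_1+1)(2J_2+1)(2J_k+1)}{64\pi^9}} \begin{pmatrix} J_1 & J_2 & J_k \\ 0 & 0 & 0 \end{pmatrix} \\ &= (-1)^{\ell+\frac{J_1+J_2+J_k}{2}}\, \frac{(2J_1+1)(2J_2+1)(2J_k+1)}{\pi^3} \times \begin{pmatrix} J_1 & \ell_2 & \ell \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \ell_1 & J_2 & \ell \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \ell_1 & \ell_2 & J_k \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} J_1 & J_2 & J_k \\ 0 & 0 & 0 \end{pmatrix} \begin{Bmatrix} J_1 & J_2 & J_k \\ \ell_1 & \ell_2 & \ell \end{Bmatrix} . \end{aligned} \tag{3.17}$$

The evaluation of $J^{\alpha\beta}_{J_1J_2}(r)$ is similar to the analogous quantity in scalar FAST-PT. For notational simplicity, we define the last integral in Eq. (3.16) as

$$J^{\alpha\beta}_{J_1J_2J_k}(k) = \int r^2 dr\, J^{\alpha\beta}_{J_1J_2}(r)\, j_{J_k}(kr) . \tag{3.18}$$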
Eq. (3.18) is similar in structure to Eq. (2.19) of McEwen et al. (2016). As such,
we can easily generalize the FAST-PT framework to evaluate integrals in the form of
Eq. (3.18).
Note that some (scalar) 2-loop integrals have similar structure to the tensor 1-loop
integrals considered here. In recent work, Schmittfull & Vlah (2016) employed similar
techniques involving Wigner 6j symbols to deal with these 2-loop integrals, although
the implementations are somewhat different.
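As an illustration of the angular prefactors above, the following minimal sketch evaluates the Wigner 3j symbol with vanishing magnetic quantum numbers from its standard closed form, and from it the coefficient $a_{J_1J_2J_k}$ of Eq. (3.12) and the real prefactor in the last line of Eq. (3.14). The function names are hypothetical and are not taken from the public FAST-PT code; only the Python standard library is assumed.

```python
from math import factorial, sqrt, pi


def wigner_3j_zero_m(j1, j2, j3):
    """Wigner 3j symbol (j1 j2 j3; 0 0 0) for non-negative integers."""
    total = j1 + j2 + j3
    # The symbol vanishes for odd j1+j2+j3 or a violated triangle inequality.
    if total % 2 == 1 or j3 > j1 + j2 or j3 < abs(j1 - j2):
        return 0.0
    g = total // 2
    delta = (factorial(2 * g - 2 * j1) * factorial(2 * g - 2 * j2)
             * factorial(2 * g - 2 * j3) / factorial(2 * g + 1))
    return ((-1) ** g * sqrt(delta) * factorial(g)
            / (factorial(g - j1) * factorial(g - j2) * factorial(g - j3)))


def a_coefficient(J1, J2, Jk):
    """a_{J1 J2 Jk} as defined in Eq. (3.12)."""
    norm = sqrt((2 * J1 + 1) * (2 * J2 + 1) / (4 * pi * (2 * Jk + 1)))
    return norm * wigner_3j_zero_m(J1, J2, Jk)


def prefactor_eq_3_14(J1, J2, Jk):
    """Real prefactor multiplying the radial integral in Eq. (3.14)."""
    three_j = wigner_3j_zero_m(J1, J2, Jk)
    if three_j == 0.0:
        return 0.0
    sign = (-1) ** (Jk + (J1 + J2 + Jk) // 2)
    norm = sqrt((2 * J1 + 1) * (2 * J2 + 1) * (2 * Jk + 1) / (64 * pi ** 9))
    return sign * norm * three_j
```

The prefactor can be cross-checked against the second line of Eq. (3.14) by multiplying $(-i)^{J_k}(2J_k+1)\,a_{J_1J_2J_k}$ with the factor $(4\pi)^2 i^{J_1+J_2}/(2\pi)^6$ from Eq. (3.10); the two agree whenever $J_1+J_2+J_k$ is even.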
3.2.2 Algorithm Implementation For the $J^{\alpha\beta}_{J_1J_2J_k}(k)$ Integral
We adopt the discrete Fourier transformation of the power spectrum as discussed
in the first FAST-PT paper McEwen et al. (2016),
$$c_m = W_m \sum_{q=0}^{N-1} \frac{P(k_q)}{k_q^{\nu_1}}\, e^{-2\pi i m q/N} \quad\rightarrow\quad P_{\rm filtered}(k_q) = \sum_{m=-N/2}^{N/2} c_m\, k_q^{\nu_1+i\eta_m} , \tag{3.19}$$

where $N$ is the size of the input power spectrum, $\eta_m = m \times 2\pi/(N\Delta)$, $m = -N/2, -N/2+1, \ldots, N/2-1, N/2$, $\nu_1$ is the bias index, and $\Delta$ is the linear spacing in $\ln k$, i.e., $k_q = k_0 \exp(q\Delta)$ with $k_0$ being the smallest value in the $k$ array. Similarly, $c'_n$ are the Fourier coefficients of the power spectrum with bias index $\nu_2$. The physics of the bias has been discussed in McEwen et al. (2016)⁹ and the choice of its value will be discussed in §3.2.3. For a real power spectrum the Fourier coefficients obey $c^*_m = c_{-m}$, $c'^*_n = c'_{-n}$. $W_m$ is a window function¹⁰ used to smooth the edges of the Fourier coefficient array of the biased power spectrum (e.g., from the cutoffs in $k$), hence smoothing over the noise and sharp features in the power spectrum, as well as preventing them from propagating non-locally in the “filtered” power spectrum. The “filtered” power spectrum is then treated as the input power spectrum and its $c_m$'s are used for calculations afterwards. Following Eq. (2.17) in McEwen et al. (2016), we can write Eq. (3.15) as¹¹
$$J^{\alpha\beta}_{J_1J_2}(r) = \frac{\pi}{2} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m c'_n\, g_{\alpha m}\, g_{\beta n}\, 2^{Q_{\alpha m}+Q_{\beta n}}\, r^{-6-\nu_1-\nu_2-\alpha-\beta-i\eta_m-i\eta_n} , \tag{3.20}$$

where $g_{\alpha m} \equiv g(J_1+\tfrac{1}{2}, Q_{\alpha m})$, $g_{\beta n} \equiv g(J_2+\tfrac{1}{2}, Q_{\beta n})$, $Q_{\alpha m} \equiv \tfrac{3}{2}+\nu_1+\alpha+i\eta_m$, $Q_{\beta n} \equiv \tfrac{3}{2}+\nu_2+\beta+i\eta_n$, and

$$g(\mu,\kappa) \equiv \frac{\Gamma[(\mu+\kappa+1)/2]}{\Gamma[(\mu-\kappa+1)/2]} . \tag{3.21}$$

9 The bias is introduced to solve the numerical divergences arising from the Fourier transform. By performing the Fourier transform, we assume the input power spectrum to be periodic, so that there are infinite “satellite” power spectra on both low and high k sides. To avoid infinite contribution from the satellites, appropriate bias values are required.

10 The window function we use is a smoothing function described in Appendix C of McEwen et al. (2016).

11 The major step is substituting the expansions of the power spectra into Eq. (3.15), and utilizing the formula: $\int_0^\infty dt\, t^{\kappa} J_\mu(t) = 2^{\kappa} g(\mu,\kappa)$ for $\Re(\kappa) < 1/2$, $\Re(\kappa+\mu) > -1$, where the Bessel function of the first kind $J_\mu$ is related to the spherical Bessel function by $J_\mu(t) = \sqrt{2t/\pi}\, j_{\mu-1/2}(t)$, and $g(\mu,\kappa)$ is defined in Eq. (3.21).
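The Bessel-integral identity quoted in the footnote can be illustrated with a short sketch. It is restricted to real arguments so that the standard-library `math.gamma` suffices; the algorithm itself needs complex $\kappa$, for which a complex-capable gamma function (for example `scipy.special.gamma`) would be required. The function names are illustrative and not part of the public code.

```python
from math import gamma, pi, sqrt


def g(mu, kappa):
    """g(mu, kappa) of Eq. (3.21), restricted to real arguments here."""
    return gamma((mu + kappa + 1) / 2) / gamma((mu - kappa + 1) / 2)


def bessel_power_integral(mu, kappa):
    """Closed form of int_0^inf t^kappa J_mu(t) dt = 2^kappa g(mu, kappa)."""
    if not (kappa < 0.5 and kappa + mu > -1):
        raise ValueError("outside the convergence range of the identity")
    return 2 ** kappa * g(mu, kappa)


# Example: j_0(t) = sqrt(pi / (2 t)) J_{1/2}(t), so
# int_0^inf j_0(t) dt = sqrt(pi / 2) * int_0^inf t^(-1/2) J_{1/2}(t) dt.
j0_integral = sqrt(pi / 2) * bessel_power_integral(0.5, -0.5)
```

Since $j_0(t) = \sin t / t$, the value of `j0_integral` should equal the Dirichlet integral $\pi/2$, which gives a quick sanity check of the normalization used in Eq. (3.20).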
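The integral then becomes

$$\begin{aligned} J^{\alpha\beta}_{J_1J_2J_k}(k_q) &\equiv \int_0^\infty dr\, r^2 J^{\alpha\beta}_{J_1J_2}(r)\, j_{J_k}(k_q r) \\ &= \frac{\pi}{2} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m}\, c'_n g_{\beta n}\, 2^{Q_{\alpha m}+Q_{\beta n}} \int_0^\infty dr\, j_{J_k}(k_q r)\, r^{-4-\nu_1-\nu_2-\alpha-\beta-i\eta_m-i\eta_n} \\ &= \frac{\pi}{2} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m}\, c'_n g_{\beta n}\, 2^{Q_{\alpha m}+Q_{\beta n}}\, k_q^{Q_{\alpha m}+Q_{\beta n}} \int_0^\infty dr\, j_{J_k}(r)\, r^{-4-\nu_1-\nu_2-\alpha-\beta-i\eta_m-i\eta_n} \\ &= \left(\frac{\pi}{2}\right)^{3/2} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m}\, c'_n g_{\beta n}\, 2^{Q_{\alpha m}+Q_{\beta n}}\, k_q^{Q_{\alpha m}+Q_{\beta n}} \times \int_0^\infty dr\, J_{J_k+\frac{1}{2}}(r)\, r^{-\frac{9}{2}-\nu_1-\nu_2-\alpha-\beta-i\eta_m-i\eta_n} \\ &= \left(\frac{\pi}{2}\right)^{3/2} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m}\, c'_n g_{\beta n}\, 2^{Q_{\alpha m}+Q_{\beta n}}\, k_q^{Q_{\alpha m}+Q_{\beta n}} \times 2^{-\frac{9}{2}-\nu_1-\nu_2-\alpha-\beta-i\eta_m-i\eta_n}\, g\!\left(J_k+\tfrac{1}{2},\ -\tfrac{9}{2}-\nu_1-\nu_2-\alpha-\beta-i\eta_m-i\eta_n\right) \\ &= \frac{\pi^{3/2}}{8} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m}\, c'_n g_{\beta n}\, k_q^{Q_{\alpha m}+Q_{\beta n}}\, g\!\left(J_k+\tfrac{1}{2},\ -\tfrac{9}{2}-\nu_1-\nu_2-\alpha-\beta-i\eta_m-i\eta_n\right) . \end{aligned} \tag{3.22}$$

We define $\tau_h \equiv \eta_m + \eta_n$ and $Q_h \equiv Q_{\alpha m} + Q_{\beta n}$, which depend only on the sum $m + n$. We write the double summation over $m$ and $n$ as a discrete convolution, indexed by $h$, such that $h = m + n = \{-N, -N+1, \cdots, N-1, N\}$. This leads to

$$\begin{aligned} J^{\alpha\beta}_{J_1J_2J_k}(k_q) &= \frac{\pi^{3/2}}{8} \sum_{m=-N/2}^{N/2} \sum_{n=-N/2}^{N/2} c_m g_{\alpha m}\, c'_n g_{\beta n}\, k_q^{Q_h}\, g\!\left(J_k+\tfrac{1}{2},\ -Q_h-\tfrac{3}{2}\right) \\ &= \frac{\pi^{3/2}}{8} \sum_h \left[c_m g_{\alpha m} \otimes c'_n g_{\beta n}\right]_h k_q^{Q_h}\, g\!\left(J_k+\tfrac{1}{2},\ -Q_h-\tfrac{3}{2}\right) \\ &= \frac{\pi^{3/2}}{8}\, k_q^{3+\nu_1+\nu_2+\alpha+\beta} \sum_h C_h \exp(i\tau_h \ln k_0) \exp(i\tau_h q\Delta)\, g\!\left(J_k+\tfrac{1}{2},\ -Q_h-\tfrac{3}{2}\right) \\ &= \frac{\pi^{3/2}}{8}\, k_q^{3+\nu_1+\nu_2+\alpha+\beta}\, {\rm IFFT}\!\left[C_h\, g\!\left(J_k+\tfrac{1}{2},\ -Q_h-\tfrac{3}{2}\right)\right] , \end{aligned} \tag{3.23}$$

where $C_h$ is defined as the convolution in the second equality, and IFFT is the discrete inverse Fast Fourier Transform. This derivation is similar to Eq. (2.21) in McEwen et al. (2016).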
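The step from the double sum to the single sum over $h$ in Eq. (3.23) relies only on the remaining factors depending on $m+n$. A minimal pure-Python sketch of this regrouping is given below; the arrays `a` and `b` stand in for $c_m g_{\alpha m}$ and $c'_n g_{\beta n}$, the list index is offset so that index 0 corresponds to $m = -N/2$, and a direct $\mathcal{O}(N^2)$ convolution is used for clarity rather than the FFT-based convolution of the public code. All names are hypothetical.

```python
import cmath


def convolve_full(a, b):
    """Full discrete convolution: out[h] = sum over m+n=h of a[m] * b[n]."""
    out = [0j] * (len(a) + len(b) - 1)
    for m, a_m in enumerate(a):
        for n, b_n in enumerate(b):
            out[m + n] += a_m * b_n
    return out


def double_sum(a, b, f):
    """Direct evaluation of sum_m sum_n a[m] b[n] f(m + n)."""
    return sum(a_m * b_n * f(m + n)
               for m, a_m in enumerate(a)
               for n, b_n in enumerate(b))


def single_sum(a, b, f):
    """Same quantity, regrouped as sum_h C_h f(h) with C = a (*) b."""
    return sum(c_h * f(h) for h, c_h in enumerate(convolve_full(a, b)))


# Small example with arbitrary complex entries and an arbitrary f(h).
a_example = [1 + 2j, 0.5, -1j]
b_example = [2.0, -1 + 1j, 0.25]
f_example = lambda h: cmath.exp(0.3j * h) / (1 + h)
direct = double_sum(a_example, b_example, f_example)
regrouped = single_sum(a_example, b_example, f_example)
```

Both evaluations agree to rounding error, while the regrouped form needs $f$ only at the $2N+1$ distinct values of $h$, which is what allows the final sum to be carried out by a single inverse FFT.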
In the algorithm, for each set of $(J_1, J_2, J_k)$ there are 3 FFT operations and 1 convolution. In our public code, we use the scipy.signal.fftconvolve routine Jones et al. (01 ) to perform the convolution, which uses the convolution theorem, resulting in 3 additional FFT operations. Thus, for each set of $(J_1, J_2, J_k)$ there are 6 FFT operations executed in total.
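The log-periodic decomposition of Eq. (3.19) that feeds these FFTs can be sketched in a few lines. The example below is an illustration under stated assumptions, not the public implementation: it uses a slow direct DFT instead of an FFT, sets the window $W_m = 1$, includes an explicit $1/N$ normalization, halves the two Nyquist terms $m = \pm N/2$ so that the sum over $m = -N/2, \ldots, N/2$ reproduces the input on the grid, and absorbs the phase $k_0^{-i\eta_m}$ into $c_m$. The toy power spectrum and all function names are hypothetical.

```python
import cmath
from math import exp, log, pi


def log_fourier_coefficients(k, P, nu):
    """Return {m: (eta_m, c_m)} so that P(k_q) = sum_m c_m k_q^(nu + i eta_m)."""
    N = len(k)
    assert N % 2 == 0, "this sketch assumes an even number of samples"
    delta = log(k[1] / k[0])  # logarithmic grid spacing
    biased = [P_q / k_q ** nu for k_q, P_q in zip(k, P)]
    coeffs = {}
    for m in range(-N // 2, N // 2 + 1):
        eta_m = 2 * pi * m / (N * delta)
        dft = sum(b * cmath.exp(-2j * pi * m * q / N)
                  for q, b in enumerate(biased)) / N
        if abs(m) == N // 2:
            dft /= 2  # split the Nyquist mode between m = +N/2 and m = -N/2
        coeffs[m] = (eta_m, dft * k[0] ** (-1j * eta_m))
    return coeffs


def reconstruct(k_q, coeffs, nu):
    """Evaluate sum_m c_m k_q^(nu + i eta_m) at a single wavenumber."""
    return sum(c_m * k_q ** (nu + 1j * eta_m) for eta_m, c_m in coeffs.values())


# Toy input: a smooth positive spectrum on a logarithmic grid.
N, k0, delta, nu = 16, 1e-2, 0.4, -2.0
k_grid = [k0 * exp(q * delta) for q in range(N)]
P_grid = [kq / (1 + (kq / 0.1) ** 2) ** 2 for kq in k_grid]
coeffs = log_fourier_coefficients(k_grid, P_grid, nu)
max_rel_error = max(abs(reconstruct(kq, coeffs, nu) - Pq) / Pq
                    for kq, Pq in zip(k_grid, P_grid))
```

On the input grid the reconstruction is exact up to rounding, and the coefficients satisfy $c^*_m = c_{-m}$ for a real input, as stated in the text.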
Summary of the Algorithm
From Eq. (3.16), the tensor convolution integral (3.1) can be decomposed as

$$I(k) = \sum_{\ell_1,\ell_2,\ell,\alpha,\beta} A^{\alpha\beta}_{\ell_1\ell_2\ell} \sum_{J_1,J_2,J_k} B^{J_1J_2J_k}_{\ell_1\ell_2\ell} \int r^2 dr\, J^{\alpha\beta}_{J_1J_2}(r)\, j_{J_k}(kr) . \tag{3.24}$$

Our algorithm is thus as follows:

1. Given an integral in the form of Eq. (3.1), expand it in terms of Eq. (3.3) to obtain all the non-zero coefficients $A^{\alpha\beta}_{\ell_1\ell_2\ell}$;

2. For each combination of $\ell_1, \ell_2, \ell$, use Eq. (3.17) to calculate all the possible combinations of $J_1, J_2, J_k$ and their corresponding (non-zero) coefficients $B^{J_1J_2J_k}_{\ell_1\ell_2\ell}$;

3. For all the possible combinations of $J_1, J_2, J_k$, calculate $J^{\alpha\beta}_{J_1J_2}(r)$ and perform the Hankel transform integration (see §3.2.2 for the detailed implementation);

4. Sum up all the terms to obtain the result.

The criteria for non-zero $B^{J_1J_2J_k}_{\ell_1\ell_2\ell}$ can be obtained from the properties of the Wigner 3j symbols. From Eq. (3.17) we have

$$|\ell_1-\ell_2| \le J_k \le \ell_1+\ell_2 , \quad |\ell-\ell_2| \le J_1 \le \ell+\ell_2 , \quad |\ell-\ell_1| \le J_2 \le \ell+\ell_1 , \tag{3.25}$$

$$|J_1-J_2| \le J_k \le J_1+J_2 , \tag{3.26}$$

and

$$J_1+\ell_2+\ell = \text{even} , \quad \ell_1+J_2+\ell = \text{even} , \quad \ell_1+\ell_2+J_k = \text{even} . \tag{3.27}$$

The condition that “$J_1 + J_2 + J_k$ = even” is redundant since it can be inferred from the conditions (Eq. 3.27).¹²
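The selection rules (3.25)-(3.27) translate directly into an enumeration of candidate $(J_1, J_2, J_k)$ triplets for step 2 of the algorithm. The sketch below is one possible way to write it; the function name is hypothetical, and the rules are necessary conditions only, so a listed triplet may still have a vanishing $B^{J_1J_2J_k}_{\ell_1\ell_2\ell}$ if the 6j symbol happens to be zero.

```python
def allowed_J_triplets(l1, l2, l):
    """List (J1, J2, Jk) satisfying the selection rules (3.25)-(3.27)."""
    triplets = []
    for J1 in range(abs(l - l2), l + l2 + 1):
        if (J1 + l2 + l) % 2:          # J1 + l2 + l must be even
            continue
        for J2 in range(abs(l - l1), l + l1 + 1):
            if (l1 + J2 + l) % 2:      # l1 + J2 + l must be even
                continue
            for Jk in range(abs(l1 - l2), l1 + l2 + 1):
                if (l1 + l2 + Jk) % 2:  # l1 + l2 + Jk must be even
                    continue
                if abs(J1 - J2) <= Jk <= J1 + J2:  # triangle rule (3.26)
                    triplets.append((J1, J2, Jk))
    return triplets


# Example: the term with l1 = l2 = l = 1.
example_triplets = allowed_J_triplets(1, 1, 1)
```

For $\ell_1 = \ell_2 = \ell = 1$ this yields five triplets, each with even $J_1 + J_2 + J_k$, consistent with the redundancy noted above.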
3.2.3 Removing Possible Divergences
Note that the algorithm we have presented in this section is only for the “$P_{22}(k)$”-type integrals, i.e., containing two power spectra $P(q_1)P(q_2)$ in the integrand as in Eq. (3.1). In §3.3.4 we will encounter integrals containing $P(q_1)P(k)$ or $P(q_2)P(k)$, which can be reduced to one-dimensional integrals, analogous to $P_{13}(k)$ in 1-loop SPT (for details on our algorithm of $P_{22}$ and $P_{13}$, see McEwen et al., 2016). We first focus on the $P_{22}(k)$-type integrals, where two potential types of divergence may emerge in this algorithm.
Divergence From Kernel Expansions
When we expand the kernel into the Legendre polynomial form, the integral (3.3) can be divergent for some combinations of $\ell, \ell_1, \ell_2, \alpha, \beta$, even though the sum of all terms will be convergent for physical observables. If the input power spectrum is the linear matter power spectrum $P_{\rm lin}(k)$, for $q_1 \gg k$ we have $\mathbf{q}_2 \approx -\mathbf{q}_1$, and the power spectra, $P_{\rm lin}(q_1)$ and $P_{\rm lin}(q_2)$, both scale as $q_1^{-3}$. Thus the integral (3.3) is proportional to $\int dq_1\, q_1^{\alpha+\beta-4}$ for $\ell_1 = \ell_2$. Convergence requires that $\alpha + \beta < 3$. For $\ell_1 \neq \ell_2$, this constraint is relaxed due to suppression from the angular integral.
12 Summing up the three equations in Eq. (3.27) we have J1 + J2 + Jk + 2(`1 + `2 + `) = even, which leads to J1 + J2 + Jk = even.
For $q_1 \ll k$, we have $q_2 \approx k$, so that $P_{\rm lin}(q_1) \propto q_1^{n_s}$ and $P_{\rm lin}(q_2) \propto k^{n_{\rm eff}(k)}$, where $n_s \sim 1$ is the primordial spectral index of the matter power spectrum, and $n_{\rm eff}(k)$ is the effective spectral index at $k$. The integral is then proportional to $\int dq_1\, q_1^{\alpha+n_s+2}$, leading to the requirement: $\alpha > -3 - n_s$ for $\ell = \ell_2$. Similarly, for $q_2$ small, we get $\beta > -3 - n_s$ for $\ell = \ell_1$. As before, these constraints are relaxed if $\ell \neq \ell_2$ or $\ell \neq \ell_1$.

Violations of these criteria have to be removed by regularization, specifically canceling the divergent parts. None of the examples in the next section have such a divergence (although see §3.3.4 for a discussion of a separate numerical divergence which is treated analytically).
Divergence From Periodic Power Spectrum and Choice of Bias Indices
As discussed in McEwen et al. (2016), the use of FFTs enforces a periodic power spectrum which can lead to unphysical divergences for certain choices of the power- law bias. This generalized implementation of FAST-PT has more freedom in the choice of bias indices ν1, ν2, compared with the original “scalar” version. First, it allows the use of two different bias indices ν1, ν2 for the two input power spectra, instead of one
fixed ν. Second, it allows the bias indices to change for different Legendre integrals
(3.3). We now discuss our choice of ν1, ν2.
In FAST-PT, we expand the input power spectra $P_{\rm lin}(q_1), P_{\rm lin}(q_2)$ into sums over power-law spectra $q_1^{\nu_1+i\eta_m}$ and $q_2^{\nu_2+i\eta_n}$. The real parts of the exponents, i.e., the bias indices $\nu_1, \nu_2$, will affect the convergence of the integrals.

Using a similar argument as in the previous subsection, for large $q_1$, we will have $P_{\rm lin}(q_1) \propto q_1^{\nu_1}$, $P_{\rm lin}(q_2) \propto q_1^{\nu_2}$. Working out the integral, we end up with the criterion: $\nu_1 + \alpha + \nu_2 + \beta < -3$ for $\ell_1 = \ell_2$. For small $q_1$, we get $\alpha + \nu_1 > -3$ for $\ell = \ell_2$; similarly for small $q_2$, we get $\beta + \nu_2 > -3$ for $\ell = \ell_1$. These constraints are relaxed if $\ell \neq \ell_2$ or $\ell \neq \ell_1$. We plot the convergence region in Figure 3.1.

In our code, we take $\nu_1 = -2 - \alpha$ and $\nu_2 = -2 - \beta$ for all cases to satisfy the above conditions. Note that the choice of different bias values for different components of a given observable is technically non-physical since the choice of bias specifies the properties of the “universe” in which the calculation is done. However, if the input k-range (or zero-padding) is sufficient, this effect is negligible on scales of interest.¹³ The fixed biasing scheme ($\nu = -2$) employed for scalar quantities in McEwen et al. (2016) avoids this issue. However, because one component of $P_{22}$ violates $\alpha + \nu > -3$ (for $\ell = 0$) under this fixed biasing, we required analytic regularization to enforce Galilean invariance and remove the formally infinite contribution to displacements from $k \to 0$ modes. Those integrals can be performed using the new scheme without the analytic regularization, although in this case a larger input range in $k$ (or additional zero-padding) is required for numerical convergence.
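The convergence criteria on the bias indices can be written as a small check. The sketch below encodes the strictest form of the three conditions (the case in which none of them is relaxed by the angular integral) together with the default choice $\nu_1 = -2-\alpha$, $\nu_2 = -2-\beta$ quoted above. The function names are hypothetical and do not correspond to the public code.

```python
def bias_indices_converge(nu1, nu2, alpha, beta):
    """Strictest form of the convergence conditions on (nu1, nu2)."""
    large_q1 = nu1 + alpha + nu2 + beta < -3   # ultraviolet condition
    small_q1 = alpha + nu1 > -3                # infrared condition in q1
    small_q2 = beta + nu2 > -3                 # infrared condition in q2
    return large_q1 and small_q1 and small_q2


def default_bias_indices(alpha, beta):
    """Choice quoted in the text: nu1 = -2 - alpha, nu2 = -2 - beta."""
    return -2 - alpha, -2 - beta


# The default choice always lands at (alpha + nu1, beta + nu2) = (-2, -2),
# which lies inside the convergence region for any (alpha, beta).
checks = [bias_indices_converge(*default_bias_indices(a, b), a, b)
          for a, b in [(0, 0), (-2, 0), (0, -2), (1, -1)]]
```

With the fixed scalar choice $\nu_1 = \nu_2 = -2$ and a term with $\alpha = -2$, the same check fails the infrared condition, which is the situation described above that required analytic regularization.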
3.3 Applications
In this section we apply the FAST-PT tensor algorithm to several cosmological applications: the quadratic intrinsic alignment model (§3.3.1); the Ostriker-Vishniac effect (§3.3.2); the kinetic polarization of the CMB (§3.3.3); and the 1-loop redshift-space distortion power spectrum (§3.3.4). In each subsection we first briefly review the theory behind the application before expanding the relevant integral(s) into the form of Eq. (3.3) and comparing the output for each case with the results from conventional (and significantly slower) two-dimensional cubature integration. To demonstrate the performance of the code, we provide this comparison out to high wavenumbers ($k = 10\ h/{\rm Mpc}$). We caution that the underlying perturbative models are not applicable to the real Universe beyond the mildly nonlinear regime ($k \sim {\rm few} \times 10^{-1}\ h/{\rm Mpc}$), even though FAST-PT can still accurately compute the perturbation theory integrals. We envision these examples both as results in and of themselves, and, more importantly, as reference material for other cosmologists who may want to compute 1-loop power spectra with their own kernels and convert them to FAST-PT format.

13 In principle, different bias indices could lead to slightly different integral results due to contributions from the periodic “satellite” power spectra. However, when the input k-range or zero-padding is sufficient, these artificial contributions become negligible. When the bias indices are chosen inside the convergence region in Fig. 3.1, we can always find a sufficient k-range, while outside the region, there may be no sufficient range. To test the stability of the results, we compared the OV power spectrum (Eq. 3.36) obtained using the bias indices $\nu_1 = -2 - \alpha$, $\nu_2 = -2 - \beta$ to the result obtained with the indices $\nu_1 = -2.5 - \alpha$, $\nu_2 = -2.5 - \beta$, and found that the maximum fractional difference over the range 0.003-10 $h$/Mpc is less than $3 \times 10^{-7}$.

[Figure 3.1 (plot not reproduced): axes $\alpha + \nu_1$ and $\beta + \nu_2$ with origin O; the value $-3$ is marked on each axis.]

Figure 3.1: The convergence region of the bias indices $\nu_1, \nu_2$ is indicated by the shaded region.
Our input linear power spectrum was generated by CAMB (Lewis et al., 2000), assuming a flat ΛCDM cosmology corresponding to the Planck 2015 results (Planck
Collaboration et al., 2016b). We used Python version 3.5.1, numpy 1.10.4, and scipy
0.17.0. The public code is also compatible with Python 2.
3.3.1 Quadratic Intrinsic Alignments Model

Theory
Weak gravitational lensing has become one of the most promising probes of the
dark matter distribution (Dark Energy Survey Collaboration et al., 2016; Hildebrandt
et al., 2017). The observed shapes of galaxies are weakly distorted (“sheared”) by the
gravitational potential of the large-scale structure along the line of sight. Correlations
in observed shapes tell us about the projected matter distribution. However, weak
lensing suffers from several systematic effects, one of which is intrinsic correlations be-
tween galaxy ellipticities, known as “intrinsic alignments” (IA, Troxel & Ishak, 2015;
Joachimi et al., 2015). In the weak lensing regime, the intrinsic shapes of galaxies dominate the observed shapes (i.e., are much larger than the lensing shear contribution). While the dominant uncorrelated component of intrinsic ellipticities does not affect the correlation of shapes beyond adding noise, the component correlating the ellipticity with the underlying tidal field can bias cosmological inference from weak lensing measurements (Krause et al., 2016). On the other hand, IA can also serve as a probe of the cosmological density field as well as the astrophysics of galaxies and halos (Chisari & Dvorkin, 2013).
On large scales, there are two types of physically-motivated intrinsic galaxy alignment models, the tidal (linear) and quadratic alignment models (Hirata & Seljak, 2004; Catelan et al., 2001). The tidal alignment model is based on the assumption that large-scale correlations in the intrinsic ellipticity field of triaxial elliptical galaxies are linearly related to fluctuations in the primordial gravitational tidal field in which the galaxy formed.¹⁴ In quadratic models (often referred to as “tidal torquing”), the observed ellipticity of spiral galaxies comes from the inclination of the disk with respect to the line of sight, and hence from the direction of its angular momentum.
In this scenario, the tidal field from the large-scale structure will both “spin-up” the galaxy as well as provide a torque, contributing to the mean intrinsic ellipticity at second order. In general, once nonlinear effects are included, both tidal alignment and tidal torquing models have contributions from mode coupling integrals of the form of Eq. 3.1 (Blazek et al., 2015). More generally, these models can be viewed as components in an “effective expansion” of IA (Blazek et al., 2017), analogous to treatments of galaxy biasing (McDonald & Roy, 2009).
14Similar results are obtained from assuming that intrinsic shapes are “instantaneously” set by the tidal field at the time of observation (see Blazek et al., 2015, for further discussion).
In the quadratic alignment model (Hirata & Seljak, 2004), the intrinsic alignment E/B-mode power spectrum $P^{(EE,BB)}_{\tilde{\gamma}^I}(k)$ contains a convolution integral in the form of

$$P^{(EE,BB)}_{\rm IA,quad}(k) = 2 \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, h^2_{(E,B)}(\hat{q}_1, \hat{q}_2)\, P_{\rm lin}(q_1) P_{\rm lin}(q_2) , \tag{3.28}$$

where $\mathbf{k} = \mathbf{q}_1 + \mathbf{q}_2$ and $h_E$ and $h_B$ are tensor kernels. If we choose the coordinate system such that $\hat{k} = \hat{z}$ and $\hat{x}$ points to the observer, $h_{(E,B)}$ can be expressed as

$$\begin{aligned} h_E(\hat{q}_1, \hat{q}_2) &= h_{zz}(\hat{q}_1, \hat{q}_2) - h_{yy}(\hat{q}_1, \hat{q}_2) \\ &= (\hat{q}_1\cdot\hat{q}_2)(\hat{q}_1\cdot\hat{k})(\hat{q}_2\cdot\hat{k}) - \frac{1}{3}(\hat{q}_1\cdot\hat{k})^2 - \frac{1}{3}(\hat{q}_2\cdot\hat{k})^2 \\ &\quad - (\hat{q}_1\cdot\hat{q}_2)(\hat{q}_1\cdot\hat{y})(\hat{q}_2\cdot\hat{y}) + \frac{1}{3}(\hat{q}_1\cdot\hat{y})^2 + \frac{1}{3}(\hat{q}_2\cdot\hat{y})^2 , \end{aligned} \tag{3.29}$$

$$\begin{aligned} h_B(\hat{q}_1, \hat{q}_2) &= 2 h_{zy}(\hat{q}_1, \hat{q}_2) \\ &= \left[(\hat{q}_1\cdot\hat{q}_2)(\hat{q}_2\cdot\hat{k}) - \frac{2}{3}(\hat{q}_1\cdot\hat{k})\right](\hat{q}_1\cdot\hat{y}) + \left[(\hat{q}_1\cdot\hat{q}_2)(\hat{q}_1\cdot\hat{k}) - \frac{2}{3}(\hat{q}_2\cdot\hat{k})\right](\hat{q}_2\cdot\hat{y}) , \end{aligned} \tag{3.30}$$

where we can see that $h_{(E,B)}$ have $\hat{k}$ dependence. We have made the Limber approximation in assuming that only modes transverse to the line of sight will contribute to observed correlations, hence $\hat{n} = \hat{x}$. Note that our choice of the coordinate system is different from the conventions in some previous work where $\hat{z}$ is chosen to be along the line of sight. Because the integrand has an azimuthal symmetry around $\mathbf{k}$, independent of the line-of-sight direction, it is more convenient to work in our coordinate system, although the final results do not depend on this choice.
Conversion to FAST-PT Format
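In spherical coordinates, we have $\hat{q}_i = (\sin\theta_i \cos\phi_i, \sin\theta_i \sin\phi_i, \cos\theta_i)$ for $i = 1, 2$. Note that $\phi_1 = \phi_2 - \pi \equiv \phi$ because $\mathbf{q}_1$ and $\mathbf{q}_2$ add up to $\mathbf{k}$ which is on the z axis. We obtain

$$\begin{aligned} \hat{q}_1\cdot\hat{y} &= \sin\theta_1 \sin\phi , \qquad \hat{q}_2\cdot\hat{y} = -\sin\theta_2 \sin\phi , \\ \hat{q}_1\cdot\hat{q}_2 &= \cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2 , \\ (\hat{q}_1\cdot\hat{y})(\hat{q}_2\cdot\hat{y}) &= \left[\hat{q}_1\cdot\hat{q}_2 - (\hat{q}_1\cdot\hat{k})(\hat{q}_2\cdot\hat{k})\right] \sin^2\phi , \\ (\hat{q}_1\cdot\hat{y})^2 + (\hat{q}_2\cdot\hat{y})^2 &= \left[2 - (\hat{q}_1\cdot\hat{k})^2 - (\hat{q}_2\cdot\hat{k})^2\right] \sin^2\phi . \end{aligned} \tag{3.31}$$

Now we can rewrite $h_E$ as

$$\begin{aligned} h_E(\hat{q}_1, \hat{q}_2) &= (\hat{q}_1\cdot\hat{q}_2)(\hat{q}_1\cdot\hat{k})(\hat{q}_2\cdot\hat{k})(1+\sin^2\phi) - (\hat{q}_1\cdot\hat{q}_2)^2 \sin^2\phi \\ &\quad - \frac{1}{3}(1+\sin^2\phi)\left[(\hat{q}_1\cdot\hat{k})^2 + (\hat{q}_2\cdot\hat{k})^2\right] + \frac{2}{3}\sin^2\phi \\ &= \mu\mu_1\mu_2(1+\sin^2\phi) - \mu^2 \sin^2\phi - \frac{1}{3}(1+\sin^2\phi)(\mu_1^2+\mu_2^2) + \frac{2}{3}\sin^2\phi , \end{aligned} \tag{3.32}$$

where we define $\mu \equiv \hat{q}_1\cdot\hat{q}_2$, $\mu_1 \equiv \hat{q}_2\cdot\hat{k}$, $\mu_2 \equiv \hat{q}_1\cdot\hat{k}$ (following the convention where each angle is labeled by the subscript for the opposite side in the triangle).

Taking the square of $h_E$ and then averaging over $\phi$,¹⁵ we obtain

$$\begin{aligned} h_E^2 &= \frac{1}{6} - \frac{1}{2}\mu^2 + \frac{3}{8}\mu^4 - \frac{7}{18}(\mu_1^2+\mu_2^2) + \frac{7}{12}\mu^2(\mu_1^2+\mu_2^2) + \frac{19}{72}(\mu_1^4+\mu_2^4) \\ &\quad + \frac{7}{6}\mu\mu_1\mu_2 - \frac{7}{4}\mu^3\mu_1\mu_2 - \frac{19}{12}\mu(\mu_1^3\mu_2+\mu_1\mu_2^3) + \frac{19}{36}\mu_1^2\mu_2^2 + \frac{19}{8}\mu^2\mu_1^2\mu_2^2 \\ &= \sum_{\substack{\ell_1,\ell_2,\ell \\ \ell_1 \ge \ell_2}} A^{00(E)}_{\ell_1\ell_2\ell}\, \mathcal{P}_{\ell}(\mu)\, \mathcal{P}_{\ell_1}(\mu_1)\, \mathcal{P}_{\ell_2}(\mu_2) , \end{aligned} \tag{3.33}$$

where we apply the symmetry between $\mathbf{q}_1$ and $\mathbf{q}_2$ and only keep terms with $\ell_1 \ge \ell_2$. Similarly, we can write the $h_B^2$ kernel in the same form with coefficients $A^{00(B)}_{\ell_1\ell_2\ell}$. The coefficient of each term is listed in Table 3.1. Now each term has been expressed in the required form of $q_1^{\alpha} q_2^{\beta}\, \mathcal{P}_{\ell}(\mu)\, \mathcal{P}_{\ell_1}(\mu_1)\, \mathcal{P}_{\ell_2}(\mu_2)$, with $\alpha = \beta = 0$.

15 Averaging over the azimuthal angle, we have $\langle\cos^2\phi\rangle = \frac{1}{2\pi}\int_0^{2\pi} d\phi\, \cos^2\phi = 1/2$, $\langle\cos^4\phi\rangle = \frac{1}{2\pi}\int_0^{2\pi} d\phi\, \cos^4\phi = 3/8$. More generally, $\langle\cos^{2n}\phi\rangle = \pi^{-1/2}\, \Gamma(n+\tfrac{1}{2})/\Gamma(n+1)$ for any non-negative integer $n$, known as the Wallis formula.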
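The azimuthal averages used in passing from Eq. (3.32) to Eq. (3.33) can be checked numerically. The following sketch compares the closed-form Wallis expression with a direct average over a uniform grid in $\phi$; the function names and the number of samples are arbitrary choices made for this illustration.

```python
from math import cos, gamma, pi, sqrt


def wallis_average(n):
    """Closed form of <cos^(2n) phi> over a full period."""
    return gamma(n + 0.5) / (sqrt(pi) * gamma(n + 1))


def numerical_average(n, samples=2000):
    """Uniform-grid average of cos^(2n) phi over [0, 2 pi)."""
    dphi = 2 * pi / samples
    return sum(cos(i * dphi) ** (2 * n) for i in range(samples)) / samples


# The two averages needed for the square of h_E: n = 1 and n = 2.
avg_cos2 = wallis_average(1)
avg_cos4 = wallis_average(2)
```

The same values apply to $\langle\sin^2\phi\rangle$ and $\langle\sin^4\phi\rangle$, since shifting $\phi$ by $\pi/2$ does not change an average over the full period.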
ℓ    ℓ1   ℓ2   A^{00(E)}_{ℓ1ℓ2ℓ}   A^{00(B)}_{ℓ1ℓ2ℓ}
0    0    0    16/81               −41/405
0    2    0    713/1134            −298/567
0    2    2    95/162              −40/81
0    4    0    38/315              −32/315
1    1    1    −107/60             59/45
1    3    1    −19/15              16/15
2    0    0    239/756             −2/9
2    2    0    11/9                −20/27
2    2    2    19/27               −16/27
3    1    1    −7/10               2/5
4    0    0    3/35                (none)

Table 3.1: The coefficient of each term in the Legendre polynomial expansion of the $h_E^2$ and $h_B^2$ kernels (without the factor of 2 in front of the integral Eq. 3.28). Due to symmetry, we need only keep terms with $\ell_1 \ge \ell_2$ (multiplying the value by two where relevant).
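The $h_E^2$ coefficients of Table 3.1 can be rederived from the monomial form of Eq. (3.33) with exact rational arithmetic. The sketch below is an independent check rather than part of the public code: it expands each power of $\mu$, $\mu_1$, $\mu_2$ in Legendre polynomials and folds terms onto $\ell_1 \ge \ell_2$, which automatically doubles the mixed terms. All names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction as F

# x^p written as a sum of Legendre polynomials: {order: coefficient}.
POWER_TO_LEGENDRE = {
    0: {0: F(1)},
    1: {1: F(1)},
    2: {0: F(1, 3), 2: F(2, 3)},
    3: {1: F(3, 5), 3: F(2, 5)},
    4: {0: F(1, 5), 2: F(4, 7), 4: F(8, 35)},
}

# Monomials of Eq. (3.33): (coefficient, (power of mu, of mu1, of mu2)).
HE2_TERMS = [
    (F(1, 6), (0, 0, 0)), (F(-1, 2), (2, 0, 0)), (F(3, 8), (4, 0, 0)),
    (F(-7, 18), (0, 2, 0)), (F(-7, 18), (0, 0, 2)),
    (F(7, 12), (2, 2, 0)), (F(7, 12), (2, 0, 2)),
    (F(19, 72), (0, 4, 0)), (F(19, 72), (0, 0, 4)),
    (F(7, 6), (1, 1, 1)), (F(-7, 4), (3, 1, 1)),
    (F(-19, 12), (1, 3, 1)), (F(-19, 12), (1, 1, 3)),
    (F(19, 36), (0, 2, 2)), (F(19, 8), (2, 2, 2)),
]


def legendre_table(terms):
    """Return {(l, l1, l2): coefficient} with (l1, l2) folded to l1 >= l2."""
    table = defaultdict(F)
    for coeff, (p, p1, p2) in terms:
        for l, c in POWER_TO_LEGENDRE[p].items():
            for l1, c1 in POWER_TO_LEGENDRE[p1].items():
                for l2, c2 in POWER_TO_LEGENDRE[p2].items():
                    key = (l, max(l1, l2), min(l1, l2))
                    table[key] += coeff * c * c1 * c2
    return {key: value for key, value in table.items() if value != 0}


# Keys are ordered as (order in mu, larger of l1/l2, smaller of l1/l2).
A_E = legendre_table(HE2_TERMS)
```

The resulting dictionary has eleven non-zero entries, in which for example the pure $\mu^4$ term gives the $\mathcal{P}_4(\mu)$ coefficient 3/35 and the odd terms combine into a negative $\mathcal{P}_1\mathcal{P}_1\mathcal{P}_1$ coefficient.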
In Figure 3.2, we show the FAST-PT result of $P^{(EE,BB)}_{\rm IA,quad}(k)$ (Eq. 3.28) and the fractional difference compared to the results from conventional methods. The plot shows excellent agreement between the two methods, with fractional accuracy better than $3 \times 10^{-5}$ up to $k = 10\ h/{\rm Mpc}$.

3.3.2 Ostriker-Vishniac Effect

Theory
After CMB photons leave the surface of last scattering, they can experience fur- ther interactions, leading to secondary anisotropies. One of the most important is re-scattering off of free electrons after reionization in which photons can be shifted to higher or lower frequencies due to motions of the electrons. The thermal Sunyaev-
Zel’dovich effect (tSZ) results from thermal motion of the electrons, usually in galaxy clusters as these are the hottest regions. Bulk hydrodynamic motions produce the
kinetic Sunyaev-Zel’dovich (kSZ) effect (in clusters) or the Ostriker-Vishniac (OV) effect (in large-scale structure). In this section, we consider the second-order perturbation theory analysis of the Ostriker-Vishniac effect.

[Figure 3.2 (plot not reproduced): upper panel shows $P^{EE}_{\rm IA,quad}(k)$ and $P^{BB}_{\rm IA,quad}(k)$ in $[{\rm Mpc}/h]^3$ versus $k$ $[h/{\rm Mpc}]$ from 0.01 to 10; lower panel shows the fractional difference, within $\pm 3 \times 10^{-5}$.]

Figure 3.2: The FAST-PT result for the intrinsic alignment integrals $P^{(EE,BB)}_{\rm IA,quad}(k)$ in Eq. (3.28) (upper panel) and the fractional difference compared to the conventional method (lower panel).
The fractional temperature perturbation in the direction nˆ on the sky is given by
(Ostriker & Vishniac, 1986; Vishniac, 1987; Jaffe & Kamionkowski, 1998)
$$\Theta(\hat{n}) = -\int_0^{\eta_0} dw\, g(w)\, \hat{n}\cdot\mathbf{q}(\mathbf{w}) , \tag{3.34}$$

where $\mathbf{q}(\mathbf{w}) \equiv [1+\delta(\mathbf{w})]\,\mathbf{v}(\mathbf{w})$, $\mathbf{v}(\mathbf{w})$ is the bulk velocity at position $\mathbf{w} \equiv w\hat{n}$ at a comoving distance $w$ (or a conformal time $\eta_0 - w$), $g(w)$ is the visibility function specifying the probability distribution for scattering from reionized electrons, given by $g(w) = (d\tau/dw)\,e^{-\tau}$, and $\tau$ is the optical depth.

At 1-loop, the angular power spectrum of $\Theta$ produced by the OV effect, $C_\ell^{\Theta\Theta}$ (equivalent to $P_p(\kappa)$ in Jaffe & Kamionkowski, 1998), requires the calculation of the Vishniac power spectrum, which is a tensor convolution integral. In a flat Universe,

$$C_\ell^{\Theta\Theta} = \frac{1}{16\pi^2} \int_0^{\eta_0} \frac{(a(w)g(w))^2}{w^2} \left(\frac{\dot{D}D}{D_0^2}\right)^2 S(\ell/w)\, dw , \tag{3.35}$$
where D and D0 are the growth factors at w and at present, respectively. Choosing
the same coordinate system as in the IA calculation above, i.e., zˆ = kˆ and xˆ pointing
to the observer, the integral is given by
$$S(k) = 4\pi^2 \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} \left(\frac{q_{1x}}{q_1^2} + \frac{q_{2x}}{q_2^2}\right)^2 P_{\rm lin}(q_1) P_{\rm lin}(q_2) , \tag{3.36}$$
which is consistent with Eq. (21) in Jaffe & Kamionkowski (1998). Our interest here
is in fast computation of S(k).
Conversion to FAST-PT Format

First noting that the integral $S(k)$ is symmetric under the exchange $\mathbf{q}_1 \leftrightarrow \mathbf{q}_2$ and that $q_{2x} = -q_{1x}$, we can expand Eq. (3.36) as

$$S(k) = 4\pi^2 \int \frac{d^3\mathbf{q}_1}{(2\pi)^3} \left(\frac{2q_{1x}^2}{q_1^4} - \frac{2q_{1x}^2}{q_1^2 q_2^2}\right) P_{\rm lin}(q_1) P_{\rm lin}(q_2) . \tag{3.37}$$

In the spherical coordinate system, $q_{1x}^2 = q_1^2 \sin^2\theta \cos^2\phi$, which becomes $\frac{1}{2} q_1^2 \sin^2\theta$ after averaging over $\phi$. The kernel is thus

$$\left[1 - (\hat{k}\cdot\hat{q}_1)^2\right] \left(\frac{1}{q_1^2} - \frac{1}{q_2^2}\right) = \frac{2}{3}\left[\mathcal{P}_0(\mu_2) - \mathcal{P}_2(\mu_2)\right] \left(\frac{1}{q_1^2} - \frac{1}{q_2^2}\right) , \tag{3.38}$$

where $\mu_2 \equiv \hat{k}\cdot\hat{q}_1$. There are 4 terms in this case: $A^{-2,0}_{000} = 2/3$, $A^{-2,0}_{020} = -2/3$, $A^{0,-2}_{000} = -2/3$, $A^{0,-2}_{020} = 2/3$.¹⁶

In Figure 3.3, we show the FAST-PT result of the $S(k)$ integral (Eq. 3.36) and the fractional difference from a conventional method. The plot shows excellent agreement between the two methods with accuracy better than $6 \times 10^{-5}$ up to $k = 10\ h/{\rm Mpc}$.

3.3.3 Kinetic polarization of the CMB

Theory
The kSZ effect can induce a secondary linear polarization in the CMB via the
quadratic Doppler effect and Thomson scattering (Sunyaev & Zeldovich, 1980; Hu,
2000). Due to the motion of baryons, an isotropic CMB appears to have a quadrupole
anisotropy component in the rest frame of the scattering baryons, as seen from the
expansion

$$\Theta = \frac{\sqrt{1 - v_b^2}}{1 - \hat{n}\cdot\mathbf{v}_b} - 1 \simeq \hat{n}\cdot\mathbf{v}_b + (\hat{n}\cdot\mathbf{v}_b)^2 - \frac{1}{2}v_b^2 , \tag{3.39}$$

where $\Theta$ is the fractional temperature fluctuation of the CMB in the direction of $\hat{n}$ as seen by the scattering electron.

16 It is possible to write the integral $S(k)$ in other forms without breaking the $\mathbf{q}_1 \leftrightarrow \mathbf{q}_2$ symmetry, e.g., to write the kernel as $(1 - \mu_2^2)\left(\frac{1}{2q_1^2} - \frac{1}{q_2^2} + \frac{q_1^2}{2q_2^4}\right)$. However, the $q_2^{-4}$ terms suffer from divergence at small $q_2$ (see §3.2.3). The divergence is artificial because $1 - \mu_2^2 \to 0$ when $q_2 \to 0$, which makes physical sense, but it can cause instability in the FAST-PT code.

[Figure 3.3 (plot not reproduced): upper panel shows $S(k)$ in $[{\rm Mpc}/h]^5$ versus $k$ $[h/{\rm Mpc}]$ from 0.01 to 10; lower panel shows the fractional difference, within $\pm 6 \times 10^{-5}$.]

Figure 3.3: The FAST-PT result for the Ostriker-Vishniac effect integral $S(k)$ in Eq. (3.36) (upper panel) and the fractional difference compared to the conventional method (lower panel).

The relation between the quadrupole anisotropy at position $\mathbf{x}$ and the CMB temperature angular distribution seen by the scatterer is given by

$$Q^{(m)}(\mathbf{x}) = -\int d\Omega\, \frac{Y^*_{2m}(\hat{n})}{\sqrt{4\pi}}\, \Theta(\mathbf{x}, \hat{n}) , \tag{3.40}$$

where $m = 0, \pm 1, \pm 2$. In the Rayleigh-Jeans limit,¹⁷ the observed power spectra of E- and B-mode polarizations are related to the power spectra of $Q^{(0,\pm 2)}$ and $Q^{(\pm 1)}$, respectively, by
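$$\begin{aligned} C_\ell^{EE} &= \frac{3\pi^2}{10\ell^3} \int dw\, g^2 D_A \left(\frac{3}{4}\Delta^2_{Q^{(0)}}(k) + \frac{1}{8}\sum_{m=\pm 2} \Delta^2_{Q^{(m)}}(k)\right) \quad \text{and} \\ C_\ell^{BB} &= \frac{3\pi^2}{10\ell^3} \int dw\, g^2 D_A \left(\frac{1}{2}\sum_{m=\pm 1} \Delta^2_{Q^{(m)}}(k)\right) , \end{aligned} \tag{3.41}$$

where $\Delta^2_{Q^{(m)}}(k) = k^3 P_{Q^{(m)}}(k)/(2\pi^2)$ is the variance of $Q^{(m)}$ per unit range in $\ln k$, the spherical harmonics in Eq. (3.40) are evaluated with $\mathbf{k}$ on the z-axis, $g$ is the visibility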
function, and the comoving angular distance DA = w (the comoving distance) in a
flat Universe. Since the quadrupole anisotropy arises from the quadratic Doppler
effect, in Fourier space with kˆ = zˆ, we have
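$$Q^{(m)}(\mathbf{k}) = -\int d\Omega\, \frac{Y^*_{2m}(\hat{n})}{\sqrt{4\pi}} \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, \hat{n}\cdot\mathbf{v}_b(\mathbf{q}_1)\, \hat{n}\cdot\mathbf{v}_b(\mathbf{q}_2) , \tag{3.42}$$

where $\mathbf{v}_b$ is the baryon bulk velocity. In linear theory

$$\dot{\delta} = -\frac{\nabla\cdot\mathbf{v}_b}{a} = fH\delta , \tag{3.43}$$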
17This limit is necessary to justify saying that temperature is scattered – really it is the intensity, but at low frequencies the two are proportional. As noted in Sunyaev & Zeldovich (1980), the kinetic polarization has a specific non-blackbody spectral shape, which can be used to scale from the Rayleigh-Jeans limit to any frequency of interest.
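where $f \equiv d\ln G/d\ln a$ for growth factor $G$ and scale factor $a$. Taking the Fourier transform and assuming no vorticity, we obtain

$$\mathbf{v}_b(\mathbf{k}) = iafH\, \frac{\delta(\mathbf{k})}{k}\, \hat{k} \equiv iT\, \frac{\delta(\mathbf{k})}{k}\, \hat{k} . \tag{3.44}$$

Substituting Eq. (3.44) into Eq. (3.42) and applying identities (B.1, B.11), we have

$$\begin{aligned} Q^{(m)}(\mathbf{k}) &= \frac{T^2}{\sqrt{4\pi}} \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, \frac{\delta(\mathbf{q}_1)\delta(\mathbf{q}_2)}{q_1 q_2} \int d\Omega\, Y^*_{2m}(\hat{n})\, (\hat{n}\cdot\hat{q}_1)(\hat{n}\cdot\hat{q}_2) \\ &= \frac{T^2}{\sqrt{4\pi}} \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, \frac{\delta(\mathbf{q}_1)\delta(\mathbf{q}_2)}{q_1 q_2} \int d\Omega\, Y^*_{2m}(\hat{n}) \left(\frac{4\pi}{3}\right)^2 \sum_{m_1 m_2} Y_{1m_1}(\hat{q}_1)\, Y^*_{1m_1}(\hat{n})\, Y_{1m_2}(\hat{q}_2)\, Y^*_{1m_2}(\hat{n}) \\ &= \frac{T^2}{\sqrt{4\pi}} \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, \frac{\delta(\mathbf{q}_1)\delta(\mathbf{q}_2)}{q_1 q_2} \left(\frac{4\pi}{3}\right)^2 \times \sum_{m_1 m_2} \sqrt{\frac{45}{4\pi}} \begin{pmatrix} 2 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 2 & 1 & 1 \\ m & m_1 & m_2 \end{pmatrix} Y_{1m_1}(\hat{q}_1)\, Y_{1m_2}(\hat{q}_2) \\ &= \frac{T^2}{\sqrt{4\pi}} \left(\frac{4\pi}{3}\right)^2 \sqrt{\frac{3}{2\pi}} \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, \frac{\delta(\mathbf{q}_1)\delta(\mathbf{q}_2)}{q_1 q_2} \sum_{m_1 m_2} \begin{pmatrix} 2 & 1 & 1 \\ m & m_1 & m_2 \end{pmatrix} Y_{1m_1}(\hat{q}_1)\, Y_{1m_2}(\hat{q}_2) . \end{aligned} \tag{3.45}$$

Following the definition that $\langle Q^{(m)}(\mathbf{k})\, Q^{(m)}(\mathbf{k}')\rangle = (2\pi)^3 P_{Q^{(m)}}(k)\, \delta^3_D(\mathbf{k}+\mathbf{k}')$, we have

$$P^{(m)}(k) \equiv \frac{27 P_{Q^{(m)}}(k)}{4(4\pi)^2 T^4} = \int \frac{d^3\mathbf{q}_1}{(2\pi)^3}\, \frac{P_{\rm lin}(q_1) P_{\rm lin}(q_2)}{q_1^2 q_2^2} \left[\sum_{m_1 m_2} \begin{pmatrix} 2 & 1 & 1 \\ m & m_1 & m_2 \end{pmatrix} Y_{1m_1}(\hat{q}_1)\, Y_{1m_2}(\hat{q}_2)\right]^2 , \tag{3.46}$$

which is a tensor convolution integral in the form of Eq. (3.3).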
Conversion to FAST-PT Format
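Since $\mathbf{k} \parallel \hat{\mathbf{z}}$, the kernels for each $m$ can be written in terms of $\mu$, $\mu_1$, $\mu_2$. Note that $m_1, m_2$ can only be $0$ or $\pm 1$, so we can explicitly write down all the spherical harmonics and Wigner 3j symbols in the summation and transform to Legendre polynomial products as before:
\begin{align}
m = 0 &: \quad \frac{q_1^{-2} q_2^{-2}}{80\pi^2} \left[ 2 + 6\mathcal{P}_2(\mu_1) + \mathcal{P}_2(\mu) + 6\mathcal{P}_2(\mu_1)\mathcal{P}_2(\mu_2) - 9\mathcal{P}_1(\mu_1)\mathcal{P}_1(\mu_2)\mathcal{P}_1(\mu) \right] , \nonumber\\
m = \pm 1 &: \quad \frac{q_1^{-2} q_2^{-2}}{160\pi^2} \left[ 1 - 2\mathcal{P}_2(\mu_1) + 9\mathcal{P}_1(\mu_1)\mathcal{P}_1(\mu_2)\mathcal{P}_1(\mu) - 8\mathcal{P}_2(\mu_1)\mathcal{P}_2(\mu_2) \right] , \nonumber\\
m = \pm 2 &: \quad \frac{q_1^{-2} q_2^{-2}}{80\pi^2} \left[ 1 - 2\mathcal{P}_2(\mu_1) + \mathcal{P}_2(\mu_1)\mathcal{P}_2(\mu_2) \right] . \tag{3.47}
\end{align}
Note that the symmetry between $\mu_1$ and $\mu_2$ has been used to simplify the kernels. The coefficients $A^{\alpha\beta}_{\ell_1\ell_2\ell}$ can now be read off directly.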
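As a cross-check of the conversion above, the following sketch compares the squared sum over spherical harmonics in Eq. (3.46) with the Legendre-product kernels of Eq. (3.47) at random directions. It assumes the convention $\mu_1 = \hat{\mathbf{q}}_2\cdot\hat{\mathbf{k}}$, $\mu_2 = \hat{\mathbf{q}}_1\cdot\hat{\mathbf{k}}$, $\mu = \hat{\mathbf{q}}_1\cdot\hat{\mathbf{q}}_2$, omits the common factor $q_1^{-2}q_2^{-2}$, and undoes the $\mu_1 \leftrightarrow \mu_2$ simplification by symmetrizing (e.g. $6\mathcal{P}_2(\mu_1) \to 3\mathcal{P}_2(\mu_1) + 3\mathcal{P}_2(\mu_2)$) so that the two forms can be compared point by point. The function names and the hard-coded 3j table are illustrative only and are not part of the FAST-PT package.

```python
import numpy as np

def y1(m, theta, phi):
    """Spherical harmonic Y_{1m} with the Condon-Shortley phase, written out explicitly."""
    if m == 0:
        return np.sqrt(3.0 / (4.0 * np.pi)) * np.cos(theta) + 0j
    return -m * np.sqrt(3.0 / (8.0 * np.pi)) * np.sin(theta) * np.exp(1j * m * phi)

# Non-zero Wigner 3j symbols (2 1 1; m m1 m2), keyed by (m, m1, m2).
W3J = {
    (0, 0, 0): np.sqrt(2.0 / 15.0),
    (0, 1, -1): np.sqrt(1.0 / 30.0), (0, -1, 1): np.sqrt(1.0 / 30.0),
    (1, 0, -1): -np.sqrt(0.1), (1, -1, 0): -np.sqrt(0.1),
    (-1, 0, 1): -np.sqrt(0.1), (-1, 1, 0): -np.sqrt(0.1),
    (2, -1, -1): np.sqrt(0.2), (-2, 1, 1): np.sqrt(0.2),
}

def p2(x):
    """Legendre polynomial P_2."""
    return 0.5 * (3.0 * x**2 - 1.0)

def kernel_direct(m, th1, ph1, th2, ph2):
    """|sum_{m1 m2} (2 1 1; m m1 m2) Y_{1 m1}(q1hat) Y_{1 m2}(q2hat)|^2 with khat = zhat."""
    total = 0j
    for (mm, m1, m2), w in W3J.items():
        if mm == m:
            total += w * y1(m1, th1, ph1) * y1(m2, th2, ph2)
    return abs(total) ** 2

def kernel_legendre(m, th1, ph1, th2, ph2):
    """Symmetrized Legendre form of Eq. (3.47), without the q1^-2 q2^-2 prefactor."""
    a = np.cos(th2)  # mu_1 = q2hat . khat (assumed convention)
    b = np.cos(th1)  # mu_2 = q1hat . khat
    mu = np.sin(th1) * np.sin(th2) * np.cos(ph1 - ph2) + np.cos(th1) * np.cos(th2)
    if m == 0:
        return (2 + 3 * p2(a) + 3 * p2(b) + p2(mu)
                + 6 * p2(a) * p2(b) - 9 * a * b * mu) / (80 * np.pi**2)
    if abs(m) == 1:
        return (1 - p2(a) - p2(b) + 9 * a * b * mu
                - 8 * p2(a) * p2(b)) / (160 * np.pi**2)
    return (1 - p2(a) - p2(b) + p2(a) * p2(b)) / (80 * np.pi**2)

def max_kernel_mismatch(n_samples=200, seed=0):
    """Largest absolute difference between the two forms over random directions."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(n_samples):
        th1, th2 = np.arccos(rng.uniform(-1.0, 1.0, 2))
        ph1, ph2 = rng.uniform(0.0, 2.0 * np.pi, 2)
        for m in (-2, -1, 0, 1, 2):
            diff = abs(kernel_direct(m, th1, ph1, th2, ph2)
                       - kernel_legendre(m, th1, ph1, th2, ph2))
            worst = max(worst, diff)
    return worst
```

Calling `max_kernel_mismatch()` should return a value at the level of floating-point round-off if the two representations agree.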
In Figure 3.4, we show the FAST-PT result of the $P^{(m)}(k)$ integrals (Eq. 3.46) for $m = 0, \pm 1, \pm 2$, respectively, and the fractional difference from a conventional method. The plots show excellent agreement between the two methods, with accuracy better than $6 \times 10^{-5}$ in the k range from 0.01 to 10 h/Mpc.
3.3.4 Redshift Space Distortions
Theory
Cosmological surveys map large-scale structure in three dimensions, using galaxies or other luminous tracers of the total matter distribution (e.g., Levi et al., 2013;
Dawson et al., 2013; Laureijs et al., 2011; Spergel et al., 2013; Dark Energy Survey
Collaboration et al., 2016). To determine distance along the line-of-sight, surveys typically use redshift information and are thus actually making a map in “redshift space.” In order to compare theory to galaxy redshift survey data, models must be translated into redshift space.
Tracers tend to infall towards overdense regions and, due to the Doppler effect, will thus have observed redshifts that deviate from those predicted by pure cosmological expansion. These deviations cause “redshift-space distortions” (RSDs) in the observed tracer distribution. Although at highly nonlinear scales RSDs are no longer well-described by perturbation theory, e.g., the “Fingers of God” (FoG) effect (Jackson, 1972), we can still explore the mildly nonlinear regime via perturbation theory, avoiding time-consuming numerical simulations.
[Figure 3.4 panels: $P^{(0)}(k)$, $P^{(\pm 1)}(k)$, $P^{(\pm 2)}(k)$ versus k [h/Mpc] (upper); fractional difference versus k [h/Mpc] (lower).]
Figure 3.4: The FAST-PT results for the kinetic CMB polarization integrals $P^{(m)}(k)$ in Eq. (3.46) (upper panels) and the fractional difference compared to the conventional method (lower panels).
The “textbook” model for linear RSDs, the Kaiser effect (Kaiser, 1987), relates the matter power spectrum in redshift space to that in real space through an angle-dependent factor set by the growth rate of structure. Subsequently,
Scoccimarro (2004) improved the Kaiser model by distinguishing Pδθ and Pθθ from
Pδδ, where θ is the divergence of velocity field. In the linear regime of standard
perturbation theory, these three power spectra are equal to each other.
The TNS model (Taruya et al., 2010) accounts for the nonlinear mode coupling
between density and velocity fields, improving the modeling of the matter power
spectrum in redshift space across a range of scales (including the BAO scale). Fixing
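k along the $\hat{\mathbf{z}}$ direction, and defining $\theta_n$ as the angle between $\hat{\mathbf{n}}$ (the line-of-sight direction) and $\mathbf{k}$, with $\mu_n \equiv \cos\theta_n$, the density power spectrum in redshift space can be written:
\begin{equation}
P^{(S)}(k,\mu_n) = D_{\rm FoG}[k\mu_n f\sigma_v] \left\{ P_{\delta\delta}(k) + 2f\mu_n^2 P_{\delta\theta}(k) + f^2\mu_n^4 P_{\theta\theta}(k) + A(k,\mu_n) + B(k,\mu_n) \right\} , \tag{3.48}
\end{equation}
where $D_{\rm FoG}[k\mu_n f\sigma_v]$ encapsulates the contribution from the FoG effect. The $A$, $B$ terms are tensor convolution integrals given by
\begin{align}
\bar{A}(k,\mu_n) &\equiv \frac{A(k,\mu_n)}{k\mu_n f} = \int \frac{d^3q_1}{(2\pi)^3} \frac{q_{1n}}{q_1^2} \left[ B_\sigma(\mathbf{q}_1,\mathbf{q}_2,-\mathbf{k}) - B_\sigma(\mathbf{q}_1,\mathbf{k},-\mathbf{k}-\mathbf{q}_1) \right] , \tag{3.49}\\
\bar{B}(k,\mu_n) &\equiv \frac{B(k,\mu_n)}{(k\mu_n f)^2} = \int \frac{d^3q_1}{(2\pi)^3} F(\mathbf{q}_1) F(\mathbf{q}_2) , \tag{3.50}
\end{align}
where $\mathbf{k} = \mathbf{q}_1 + \mathbf{q}_2$, and the subscript “n” denotes the projection onto $\hat{\mathbf{n}}$, e.g., $q_{1n} \equiv \mathbf{q}_1\cdot\hat{\mathbf{n}}$, and
\begin{equation}
F(\mathbf{q}) = \frac{q_n}{q^2} \left[ P_{\delta\theta}(q) + f\frac{q_n^2}{q^2} P_{\theta\theta}(q) \right] . \tag{3.51}
\end{equation}
The cross bispectrum $B_\sigma$ is defined by
\begin{equation}
\left\langle \theta(\mathbf{k}_1) \left[ \delta(\mathbf{k}_2) + f\frac{k_{2n}^2}{k_2^2}\theta(\mathbf{k}_2) \right] \left[ \delta(\mathbf{k}_3) + f\frac{k_{3n}^2}{k_3^2}\theta(\mathbf{k}_3) \right] \right\rangle = (2\pi)^3 \delta_D(\mathbf{k}_1+\mathbf{k}_2+\mathbf{k}_3)\, B_\sigma(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3) . \tag{3.52}
\end{equation}
The convolution integrals $\bar{A}(k,\mu_n)$ and $\bar{B}(k,\mu_n)$ are particularly time-consuming (e.g., Bose & Koyama, 2016) and are ideal applications for our algorithm.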
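To make the structure of Eq. (3.48) concrete, the sketch below assembles the redshift-space spectrum from its ingredients at a single wavenumber. The damping function is left unspecified in the text above; the Gaussian form used here is an assumption for illustration only, and the function and argument names are hypothetical rather than part of FAST-PT.

```python
import numpy as np

def tns_power(k, mu_n, f, sigma_v, p_dd, p_dt, p_tt, a_term, b_term):
    """Assemble Eq. (3.48) from P_dd, P_dt, P_tt and the A, B corrections at wavenumber k."""
    x = k * mu_n * f * sigma_v
    d_fog = np.exp(-x**2)  # ASSUMED Gaussian damping; other forms (e.g. Lorentzian) are also used
    kaiser_like = p_dd + 2.0 * f * mu_n**2 * p_dt + f**2 * mu_n**4 * p_tt
    return d_fog * (kaiser_like + a_term + b_term)
```

With `sigma_v = 0`, vanishing `a_term` and `b_term`, and all three spectra set to the same linear value, the expression reduces to the Kaiser form $(1 + f\mu_n^2)^2 P_{\rm lin}(k)$, which is a convenient sanity check.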
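$\bar{B}$ Term
Substituting the $F(\mathbf{q})$ kernel into the $\bar{B}(k,\mu_n)$ integral, we obtain
\begin{align}
\bar{B}(k,\mu_n) &= \int \frac{d^3q_1}{(2\pi)^3} \frac{q_{1n}q_{2n}}{q_1^2 q_2^2} \left[ P_{\delta\theta}(q_1)P_{\delta\theta}(q_2) + f^2 \frac{q_{1n}^2 q_{2n}^2}{q_1^2 q_2^2} P_{\theta\theta}(q_1)P_{\theta\theta}(q_2) + 2f \frac{q_{2n}^2}{q_2^2} P_{\delta\theta}(q_1)P_{\theta\theta}(q_2) \right] \nonumber\\
&= \int \frac{d^3q_1}{(2\pi)^3} \frac{(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{n}})(\hat{\mathbf{q}}_2\cdot\hat{\mathbf{n}})}{q_1 q_2} P_{\rm lin}(q_1)P_{\rm lin}(q_2) \nonumber\\
&\quad + f^2 \int \frac{d^3q_1}{(2\pi)^3} \frac{(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{n}})^3(\hat{\mathbf{q}}_2\cdot\hat{\mathbf{n}})^3}{q_1 q_2} P_{\rm lin}(q_1)P_{\rm lin}(q_2) \nonumber\\
&\quad + 2f \int \frac{d^3q_1}{(2\pi)^3} \frac{(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{n}})(\hat{\mathbf{q}}_2\cdot\hat{\mathbf{n}})^3}{q_1 q_2} P_{\rm lin}(q_1)P_{\rm lin}(q_2) . \tag{3.53}
\end{align}
As previously mentioned, $P_{\delta\theta}$, $P_{\theta\theta}$, $P_{\delta\delta}$ are all equal to $P_{\rm lin}$ at leading order. Since terms of the form $(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{n}})^{p_1}(\hat{\mathbf{q}}_2\cdot\hat{\mathbf{n}})^{p_2}$ with non-negative integers $p_1, p_2$ can always be decomposed as a polynomial in $\hat{\mathbf{k}}\cdot\hat{\mathbf{n}}$ after longitude angle averaging (see Appendix B.3 for a proof), it is natural to write $\bar{B}$ as
\begin{equation}
\bar{B}(k,\mu_n) = \sum_{i=0} B_i(k)\,\mu_n^i , \tag{3.54}
\end{equation}
where each $B_i(k)$ is a tensor convolution integral that can be written in terms of products of Legendre polynomials.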
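The simplest instance of this decomposition, $p_1 = p_2 = 1$, can be checked numerically. Averaging over the azimuth of $\hat{\mathbf{n}}$ about $\hat{\mathbf{k}}$ gives, by elementary trigonometry, $\langle(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{n}})(\hat{\mathbf{q}}_2\cdot\hat{\mathbf{n}})\rangle = \tfrac{1}{2}(\mu - x_1 x_2) + \tfrac{1}{2}(3x_1x_2 - \mu)\,\mu_n^2$ with $x_i = \hat{\mathbf{q}}_i\cdot\hat{\mathbf{k}}$ and $\mu = \hat{\mathbf{q}}_1\cdot\hat{\mathbf{q}}_2$. This closed form is a worked illustration and is not taken from Appendix B.3; the helper names below are hypothetical.

```python
import numpy as np

def unit_vector(cos_theta, phi):
    """Unit vector with polar cosine cos_theta (relative to khat = zhat) and azimuth phi."""
    s = np.sqrt(1.0 - cos_theta**2)
    return np.array([s * np.cos(phi), s * np.sin(phi), cos_theta])

def averaged_product(q1_hat, q2_hat, mu_n, n_phi=64):
    """Average (q1hat . nhat)(q2hat . nhat) over the azimuth of nhat at fixed mu_n."""
    phis = 2.0 * np.pi * np.arange(n_phi) / n_phi
    vals = [np.dot(q1_hat, unit_vector(mu_n, p)) * np.dot(q2_hat, unit_vector(mu_n, p))
            for p in phis]
    return np.mean(vals)

def polynomial_form(q1_hat, q2_hat, mu_n):
    """Closed-form polynomial in mu_n for the p1 = p2 = 1 case."""
    x1, x2 = q1_hat[2], q2_hat[2]
    mu = np.dot(q1_hat, q2_hat)
    return 0.5 * (mu - x1 * x2) + 0.5 * (3.0 * x1 * x2 - mu) * mu_n**2
```

The equally spaced azimuthal average is exact here because the integrand is a trigonometric polynomial of low degree, so the two functions should agree to round-off for any pair of directions.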
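$\bar{A}$ Term
The cross bispectrum satisfies $B_\sigma(\mathbf{k}_1,\mathbf{k}_2,\mathbf{k}_3) = B_\sigma(-\mathbf{k}_1,-\mathbf{k}_2,-\mathbf{k}_3) = B_\sigma(\mathbf{k}_1,\mathbf{k}_3,\mathbf{k}_2)$, so we can write the $\bar{A}$ integral as
\begin{equation}
\bar{A}(k,\mu_n) = \int \frac{d^3q_1}{(2\pi)^3} \frac{q_{1n}}{q_1^2} \left[ B_\sigma(\mathbf{q}_1,\mathbf{q}_2,-\mathbf{k}) - B_\sigma(-\mathbf{q}_1,\mathbf{k}+\mathbf{q}_1,-\mathbf{k}) \right] . \tag{3.55}
\end{equation}
Changing the dummy variable $\mathbf{q}_1$ to $-\mathbf{q}_1$ in the second term, we have
\begin{equation}
\bar{A}(k,\mu_n) = \int \frac{d^3q_1}{(2\pi)^3} \frac{q_{1n}}{q_1^2} \left[ B_\sigma(\mathbf{q}_1,\mathbf{q}_2,-\mathbf{k}) + B_\sigma(\mathbf{q}_1,\mathbf{k}-\mathbf{q}_1,-\mathbf{k}) \right] = 2\int \frac{d^3q_1}{(2\pi)^3} \frac{q_{1n}}{q_1^2} B_\sigma(\mathbf{q}_1,\mathbf{q}_2,-\mathbf{k}) . \tag{3.56}
\end{equation}
Expanding the left-hand side of Eq. (3.52) to the leading order, we have
\begin{align}
B_\sigma(\mathbf{q}_1,\mathbf{q}_2,-\mathbf{k}) &= 2\left(1 + \frac{q_{2n}^2}{q_2^2}f\right)\left(1 + \frac{k_n^2}{k^2}f\right) G_2(\mathbf{q}_2,-\mathbf{k})\, P_{\rm lin}(q_2)P_{\rm lin}(k) \nonumber\\
&\quad + 2\left(1 + \frac{k_n^2}{k^2}f\right)\left[ F_2(\mathbf{q}_1,-\mathbf{k}) + \frac{q_{2n}^2}{q_2^2} f\, G_2(\mathbf{q}_1,-\mathbf{k}) \right] P_{\rm lin}(q_1)P_{\rm lin}(k) \nonumber\\
&\quad + 2\left(1 + \frac{q_{2n}^2}{q_2^2}f\right)\left[ F_2(\mathbf{q}_1,\mathbf{q}_2) + \frac{k_n^2}{k^2} f\, G_2(\mathbf{q}_1,\mathbf{q}_2) \right] P_{\rm lin}(q_1)P_{\rm lin}(q_2) . \tag{3.57}
\end{align}
Similarly, we can expand the integral $\bar{A}$ as a polynomial in terms of $\mu_n$:
\begin{equation}
\bar{A}(k,\mu_n) = \sum_{i=0} A_i(k)\,\mu_n^i . \tag{3.58}
\end{equation}
Each $A_i(k)$ can be separated into two parts:
\begin{equation}
A_i(k) = A_i^{\rm I}(k) + A_i^{\rm II}(k) , \tag{3.59}
\end{equation}
where $A_i^{\rm I}(k)$ is a convolution integral with $P_{\rm lin}(q_1)P_{\rm lin}(q_2)$ in the integrand, while $A_i^{\rm II}(k)$ has an integrand with $P_{\rm lin}(q_1)P_{\rm lin}(k)$ or $P_{\rm lin}(q_2)P_{\rm lin}(k)$, which is similar to the $P_{13}$ integral and can be treated in a similar fashion.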
87 Aαβ ` ` ` `1`1` 1 2 i = 0 i = 2 i = 4 i = 6 1 3f f 2 3 3f 2 3f 21f 2 131f 2 Bi 0 1 1 /2 /10 /20 /2 + /20 /2 /20 /100 − − 2− 2 − 2 2 3 1 3f/10 + f /10 3f 6f /5 7f/2 3f /10 47f /25 2 − −2 − 2 2 3 3 f /20 21f /20 63f /20 231f /100 − 2 2 − 2 2 1 0 0 (1+f)/2 + 5f /36 1/2 + f /12 f/2 f /12 5f /36 2 − 2 − − 2 − 2 2 0 f/2 5f /18 3f + 4f /3 5f/2 + f /6 11f /9 − −2 2 − 2 − 2 2 2 5f /36 17f /12 53f /12 113f /36 − −
Table 3.2: The coefficient of each term in the Legendre polynomial expansion of kernels of $B_i(k)$. $\alpha = \beta = -1$ for all the terms. Due to symmetry, we need only keep terms with $\ell_1 \geq \ell_2$ (multiplying the value by two where relevant). Empty entries are equal to the previous row.
Conversion to FAST-PT Format
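The $B_i(k)$ and $A_i^{\rm I}(k)$ integrals are standard convolution integrals, which can be decomposed into the form of Eq. (3.3). The associated coefficients $A^{\alpha\beta}_{\ell_1\ell_2\ell}$ are listed in Tables 3.2 and 3.3.
The $A_i^{\rm II}(k)$ integrals are first decomposed into the form of
\begin{equation}
P^{\alpha\beta\gamma}_{\ell_1\ell_2\ell}(k) = \int \frac{d^3q_1}{(2\pi)^3}\, q_1^\alpha q_2^\beta k^\gamma\, \mathcal{P}_{\ell_1}(\hat{\mathbf{q}}_2\cdot\hat{\mathbf{k}})\, \mathcal{P}_{\ell_2}(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{k}})\, \mathcal{P}_{\ell}(\hat{\mathbf{q}}_1\cdot\hat{\mathbf{q}}_2)\, P_{\rm lin}(q_1)P_{\rm lin}(k) , \tag{3.60}
\end{equation}
with coefficients $A^{\alpha\beta\gamma}_{\ell_1\ell_2\ell}$ given by Table 3.4, so that for each $A_i^{\rm II}$ integral,
\begin{equation}
A_i^{\rm II}(k) = \sum A^{\alpha\beta\gamma}_{\ell_1\ell_2\ell}\, P^{\alpha\beta\gamma}_{\ell_1\ell_2\ell}(k) . \tag{3.61}
\end{equation}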
Note that for $P_{\rm lin}(q_2)P_{\rm lin}(k)$ terms one can always exchange the indices ($1 \leftrightarrow 2$) of $q$ and $\ell$ in the integrand to recover the form above. For the special case that $\beta = \ell_1 = \ell = 0$ and $\ell_2 \neq 0$, the integral vanishes. These $P_{13}$-like integrals can be further reduced to one-dimensional integrals and quickly calculated using discrete convolutions as done for $P_{13}$ in McEwen et al. (2016).
Aαβ α β ` ` ` `1`1` 1 2 i = 1 i = 3 i = 5 I 68 2f 26f 2f 2 10f 2 Ai 1 0 0 0 1 /21 + /3 /9 + /3 /63 − 2 2 2 1 68f/21 340f/63 52f /21 260f /63 − − 2 2 1 1 0 2 + 124f/35 92f/105 + 108f /35 254f /105 − 2 − 2 2 2f 10f/3 2f 10f /3 − − 2 2 2 0 1 16/21 + 4f/3 4f/9 + 4f /3 52f /63 2 − 2 2 1 16f/21 80f/63 32f /21 160f /63 − − 2 2 3 1 0 16f/35 16f/35 + 32f /35 32f /35 − 2 − 2 2 1 0 1 0 2f/3 2f/3 + 2f /3 2f /3 − − − 2 1 0 1 2 8f/3 2f /3 2 2 2 1 2f 10f/3 2f 10f /3 − − 2 2 2 1 0 4f/3 4f/3 + 4f /3 4f /3 − −
Table 3.3: The coefficient of each term in the Legendre polynomial expansion of kernels of $A_i^{\rm I}(k)$. Empty entries are equal to the previous row.
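\begin{align}
P^{\alpha\beta\gamma}_{\ell_1\ell_2\ell}(k) &= k^\gamma P_{\rm lin}(k) \int \frac{d^3q_1}{(2\pi)^3}\, q_1^\alpha q_2^\beta\, \mathcal{P}_{\ell_1}\!\left(\frac{k - q_1\mu_2}{q_2}\right) \mathcal{P}_{\ell_2}(\mu_2)\, \mathcal{P}_{\ell}\!\left(\frac{k\mu_2 - q_1}{q_2}\right) P_{\rm lin}(q_1) \nonumber\\
&= \frac{k^\gamma P_{\rm lin}(k)}{(2\pi)^2} \int_0^\infty dq_1\, q_1^{2+\alpha} P_{\rm lin}(q_1) \int_{-1}^{1} d\mu_2\, q_2^\beta\, \mathcal{P}_{\ell_1}\!\left(\frac{k - q_1\mu_2}{q_2}\right) \mathcal{P}_{\ell_2}(\mu_2)\, \mathcal{P}_{\ell}\!\left(\frac{k\mu_2 - q_1}{q_2}\right) , \tag{3.62}
\end{align}
where $q_2 = \sqrt{k^2 + q_1^2 - 2kq_1\mu_2}$. The angular ($\mu_2$) integration can be performed analytically.¹⁸ Summing the components, we find:
\begin{equation}
A_i^{\rm II}(k) = \frac{k^2 P_{\rm lin}(k)}{672\pi^2} \int_0^\infty dr\, P_{\rm lin}(kr)\, Z_i(r) , \quad i = 1, 3, 5, \tag{3.63}
\end{equation}
where
\begin{align}
Z_1(r) &= \frac{18f}{r^2} - 152 - 66f + (192 - 66f)r^2 - (72 - 18f)r^4 \nonumber\\
&\quad + \left[ \frac{9f}{r^3} + \frac{36(1-f)}{r} - 54(2-f)r + 36(3-f)r^3 - 9(4-f)r^5 \right] \ln\left|\frac{1-r}{1+r}\right| , \tag{3.64}\\
Z_3(r) &= \frac{18f(1+f)}{r^2} - 370f - 66f^2 + (318f - 66f^2)r^2 - (126f - 18f^2)r^4 \nonumber\\
&\quad + \left[ \frac{9f(1+f)}{r^3} + \frac{36f(1-f)}{r} - 54f(3-f)r + 36f(5-f)r^3 - 9f(7-f)r^5 \right] \ln\left|\frac{1-r}{1+r}\right| , \nonumber\\
Z_5(r) &= \frac{18f^2}{r^2} - 218f^2 + 126f^2r^2 - 54f^2r^4 + \left[ \frac{9f^2}{r^3} - 54f^2r + 72f^2r^3 - 27f^2r^5 \right] \ln\left|\frac{1-r}{1+r}\right| . \nonumber
\end{align}
¹⁸ There are several ways to do this; a brute-force approach is to write $\mu_2$ in terms of $q_2$ (at fixed $k$ and $q_1$), which turns the integral into a linear combination of power laws in $q_2$.
The integral (3.63) is a convolution. Upon making the substitution $r = e^{-s}$, Eq. (3.63) becomes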
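\begin{align}
A_i^{\rm II}(k) &= \frac{k^2 P_{\rm lin}(k)}{672\pi^2} \int_{-\infty}^{\infty} ds\, e^{-s} P_{\rm lin}(e^{\log k - s})\, Z_i(e^{-s}) \nonumber\\
&= \frac{k^2 P_{\rm lin}(k)}{672\pi^2} \int_{-\infty}^{\infty} ds\, G_i(s)\, F(\log k - s) , \tag{3.65}
\end{align}
where $G_i(s) \equiv e^{-s} Z_i(e^{-s})$ and $F(s) \equiv P_{\rm lin}(e^s)$. We can convert to a discrete convolution with the substitutions $ds \to \Delta$, $\log k_n = \log k_0 + n\Delta$, and $s_m = \log k_0 + m\Delta$ (where $k_0$ is the smallest value in the k array):
\begin{equation}
\int_{-\infty}^{\infty} ds\, G_i(s)\, F(\log k - s) \to \Delta \sum_{m=0}^{N-1} G_i^D(m)\, F^D(n-m) , \tag{3.66}
\end{equation}
where in the final line we define the discrete functions $G_i^D(m) \equiv G_i(s_m)$ and $F^D(m) \equiv F(m\Delta)$. We then have
\begin{equation}
A_i^{\rm II}(k_n) = \frac{k_n^2 P_{\rm lin}(k_n)\,\Delta}{672\pi^2} \left[ G_i^D \otimes F^D \right][n] , \quad i = 1, 3, 5. \tag{3.67}
\end{equation}
Thus $A_i^{\rm II}(k)$, which at first appears to involve order $N^2$ steps (an integral over $N$ samples at each of $N$ output values $k_n$), can in fact be computed for all output $k_n$ in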
$\mathcal{O}(N \log N)$ steps.¹⁹
¹⁹ In principle, $N$ is the size of the input k array. However, to suppress possible ringing and aliasing effects, we need to apply appropriate window functions, zero-padding, or extend the input power spectrum into a larger range. The true value of $N$ is usually a few times larger than the original value, depending on the user’s inputs and options.
Note that some integrals $P^{\alpha\beta\gamma}_{\ell_1\ell_2\ell}(k)$ suffer from a divergence due to contributions from small scales. When summing to get $A_i^{\rm II}$, the divergent parts cancel each other precisely. Taking $q_1$ to be large, so that $\mathbf{q}_2 \to -\mathbf{q}_1$ and $P_{\rm lin}(q_1) \propto q_1^{-3}$, we have
\begin{equation}
P^{\alpha\beta\gamma}_{\ell_1\ell_2\ell}(k) \to \frac{(-1)^{\ell+\ell_1}\, \delta_{\ell_1\ell_2}\, k^\gamma P_{\rm lin}(k)}{(2\ell_1+1)\, 2\pi^2} \int dq_1\, q_1^{2+\alpha+\beta} P_{\rm lin}(q_1) \propto \int dq_1\, q_1^{\alpha+\beta-1} , \tag{3.68}
\end{equation}
so that the divergence appears when $\ell_1 = \ell_2$ and $\alpha + \beta \geq 0$. In Table 3.4 there are 5 terms that suffer from this divergence problem. However, these divergences cancel in $A_i^{\rm II}$; in our case, the cancellation occurs when doing the sum over $(\alpha, \beta, \gamma, \ell_1, \ell_2, \ell)$ to derive $Z_i(r)$.
In Figure 3.5, we show the FAST-PT results of the $A + B$ terms in the TNS model (Eq. 3.48) for $f = 1$ and $\mu_n = 0.05, 0.5, 0.9$, respectively, as well as the fractional difference compared to our conventional method. The plots show excellent agreement between the two methods, with accuracy at the $10^{-4}$ level for most of the k range from 0.01 to 10 h/Mpc. Note that the individual $A$ and $B$ terms agree to significantly higher precision ($\sim 10^{-5}$). Cancellations among terms in the total $A + B$ amplify the fractional difference, especially at high k and near the zero-crossing.
Aαβγ γ α β ` ` ` `1`1` 1 2 i = 1 i = 3 i = 5 II 108f 36f 108f 2 36f 2 Ai 0 1 0 0 2 1 /35 /7 /35 /7 − − − 2 2 3 32f/35 32f/21 32f /35 32f /21 − − 2 2 1 1 0 52f/21 52f/21 + 52f /21 52f /21 − 2 − 2 2 32f/21 32f/21 + 32f /21 32f /21 − 2 − 2 0 1 0 1 0 52/21 32f/105 80f/21 32f /105 4f /3 − − − 2 2 2 32/21 428f/147 1012f/147 428f /147 788f /147 − − 2 2 4 192f/245 64f/49 192f /245 64f /49 − − 2 2 1 0 1 108f/35 108f/35 + 108f /35 108f /35 − 2 − 2 3 32f/35 32f/35 + 32f /35 32f /35 − − 2 1 0 0 0 0 0 2/3 8f/9 2f /9 − − − 2 − 2 2 0 2f/3 10f/9 + 2f /3 10f /9 − 2 − 2 2 4f/3 20f/9 + 4f /3 20f /9 1 1 1 2f −2f 2f 2 −2f 2 − − 2 2 1 1 0 1 1 2 + 4f/5 4f + 4f /5 2f − − − 2 − 2 3 6f/5 2f + 6f /5 2f − 2 − 2 1 0 0 2f/3 2f/3 2f /3 2f /3 − − 2 2 2 4f/3 4f/3 4f /3 4f /3 − − 2 1 2 0 0 0 0 2/3 8f/9 2f /9 − − − 2 − 2 2 0 2f/3 10f/9 + 2f /3 10f /9 − 2 − 2 2 4f/3 20f/9 + 4f /3 20f /9 1 1 1 2f −2f 2f 2 −2f 2 − − 2 2 1 1 0 1 1 2 + 4f/5 4f + 4f /5 2f − − − − 2 − 2 3 6f/5 2f + 6f /5 2f − 2 − 2 1 0 0 2f/3 2f/3 2f /3 2f /3 − − 2 2 2 4f/3 4f/3 4f /3 4f /3 − −
Table 3.4: The coefficient of each term in the Legendre polynomial expansion of kernels of $A_i^{\rm II}(k)$.
[Figure 3.5 panels: f = 1.0 with $\mu_n$ = 0.05, 0.50, 0.90; curves $A + B$ and $-(A + B)$ versus k [h/Mpc] (upper); fractional difference versus k [h/Mpc] (lower).]
Figure 3.5: The FAST-PT result for the redshift space distortion nonlinear corrections $A(k,\mu_n) + B(k,\mu_n)$ in the TNS model, Eq. (3.48) (upper panels) and the fractional difference compared to the conventional method result (lower panels).
3.4 Summary
In this paper we have extended the FAST-PT algorithm to treat 1-loop convolution integrals with tensor kernels (explicitly dependent on the direction of the observed mode). The generalized algorithm has many applications: we have presented quadratic intrinsic alignments, the Ostriker-Vishniac effect, kinetic CMB polarizations, and a sophisticated model for redshift space distortions. Our algorithm and code achieve high precision for all of these applications. We have tested the output of the code to high wavenumber (k = 10 h/Mpc), although we reiterate that the smaller scales considered are beyond the range of validity of the underlying perturbative models. The reduction in evaluation time is similar to that for the scalar FAST-PT. For instance, execution time is $\sim 0.1$ seconds for 600 k values in all our examples. In the results shown here, the input power spectrum was sampled at 100 points per log10 interval. We find that much of the noise (in comparisons with the conventional method) is driven by the exact process by which the CAMB power spectrum is interpolated before it is used in FAST-PT.
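The speed-up quoted above rests on turning log-spaced integrals into discrete convolutions, as in Eqs. (3.65)-(3.67). The sketch below illustrates this for the $i = 5$ kernel $Z_5(r)$ of Section 3.3.4: it evaluates the same Riemann sum once by a direct double loop and once with FFTs. It is a minimal illustration, not the FAST-PT implementation: the power spectrum is a made-up smooth function, the offset-based indexing is one possible choice, the integral is truncated to the grid, and none of the windowing or padding mentioned in the footnote is applied.

```python
import numpy as np

def z5(r, f):
    """Kernel Z_5(r); the removable r = 1 point is handled explicitly."""
    r = np.asarray(r, dtype=float)
    poly = 18.0 / r**2 - 218.0 + 126.0 * r**2 - 54.0 * r**4
    bracket = 9.0 / r**3 - 54.0 * r + 72.0 * r**3 - 27.0 * r**5
    with np.errstate(divide="ignore", invalid="ignore"):
        log_term = bracket * np.log(np.abs((1.0 - r) / (1.0 + r)))
    # The bracket vanishes at r = 1, so bracket * log -> 0 there.
    log_term = np.where(r == 1.0, 0.0, log_term)
    return f**2 * (poly + log_term)

def toy_power(k):
    """Hypothetical smooth spectrum (NOT a real P_lin): ~k at low k, ~k^-3 at high k."""
    return k / (1.0 + (k / 0.02) ** 2) ** 2

def kernel_samples(offsets, delta, f):
    """G(s) = e^{-s} Z_5(e^{-s}) sampled at s = offsets * delta, i.e. r = k_j / k_n."""
    r = np.exp(-offsets * delta)
    return r * z5(r, f)

def a2_direct(k, f):
    """O(N^2) evaluation of the discretized Eq. (3.63) for i = 5."""
    n = k.size
    delta = np.log(k[-1] / k[0]) / (n - 1)
    p = toy_power(k)
    j = np.arange(n)
    integral = np.array([delta * np.sum(kernel_samples(m - j, delta, f) * p)
                         for m in range(n)])
    return k**2 * p * integral / (672.0 * np.pi**2)

def a2_fft(k, f):
    """Same sum as a2_direct, computed as one linear convolution via FFT."""
    n = k.size
    delta = np.log(k[-1] / k[0]) / (n - 1)
    p = toy_power(k)
    g = kernel_samples(np.arange(-(n - 1), n), delta, f)  # offsets -(N-1) ... N-1
    size = 3 * n - 2                                       # full convolution length
    conv = np.fft.irfft(np.fft.rfft(p, size) * np.fft.rfft(g, size), size)
    integral = delta * conv[n - 1:2 * n - 1]               # entries for output index 0 ... N-1
    return k**2 * p * integral / (672.0 * np.pi**2)

k_grid = np.logspace(-2.0, 0.0, 201)   # 100 points per decade
slow = a2_direct(k_grid, f=1.0)
fast = a2_fft(k_grid, f=1.0)
```

The two arrays `slow` and `fast` should agree to round-off, while the FFT version scales as $N \log N$. For grids spanning many decades the polynomial and logarithmic pieces of $Z_i(r)$ cancel strongly at extreme $r$, so a careful implementation would probably switch to series expansions there.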
There are underlying physical concepts and symmetries that make the efficiency
of this algorithm possible. For example, the locality of the gravitational interactions
allows us to separate different modes in configuration space. Since the structure
evolution under gravity only depends on the local density and velocity divergence
fields, in Fourier space the 1-loop power spectra of the matter density as well as
its tracers (assuming local biasing theories) must be in the form of Eq. (3.1), where the
kernels can always be written in terms of dot products of different mode vectors.
Without this locality, it may not be possible to write the desired power spectrum
as a sum of terms that can be calculated with this algorithm. The scale invariance
of the problem also indicates that we should decompose the input power spectrum
into a set of power-law spectra and make full use of the FFT algorithm. There
are also rotational symmetries that allow us to reduce the 3-dimensional integrals to
1-dimension.
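The power-law decomposition mentioned above can be sketched in a few lines: on a logarithmically spaced grid, a discrete Fourier transform of the "biased" spectrum $P(k)k^{-\nu}$ yields coefficients $c_m$ such that $P(k) = \sum_m c_m k^{\nu}(k/k_0)^{i\eta_m}$, i.e. a sum of power laws with complex exponents. The bias index $\nu = -2$, the toy spectrum, and the function names below are illustrative assumptions rather than the choices made inside FAST-PT.

```python
import numpy as np

def power_law_decomposition(k, p, nu):
    """Return (c_m, eta_m) with p(k_n) = sum_m c_m k_n^nu (k_n/k_0)^(i eta_m) on a log grid."""
    n = k.size
    delta = np.log(k[-1] / k[0]) / (n - 1)   # log-spacing of the grid
    biased = p * k ** (-nu)
    c = np.fft.fft(biased) / n
    m = np.fft.fftfreq(n, d=1.0 / n)         # integer frequencies 0, 1, ..., -1
    eta = 2.0 * np.pi * m / (n * delta)
    return c, eta

def reconstruct(k, k0, nu, c, eta):
    """Evaluate the sum of complex power laws at wavenumbers k."""
    phases = np.exp(1j * np.outer(np.log(k / k0), eta))
    return np.real(phases @ c) * k ** nu

k_grid = np.logspace(-3.0, 1.0, 128)
p_toy = k_grid / (1.0 + (k_grid / 0.02) ** 2) ** 2   # hypothetical smooth spectrum
c_m, eta_m = power_law_decomposition(k_grid, p_toy, nu=-2.0)
p_back = reconstruct(k_grid, k_grid[0], -2.0, c_m, eta_m)
```

On the grid points the reconstruction `p_back` reproduces `p_toy` to round-off; because each term is a pure power law, the convolution integrals over it can be done analytically, which is where the scale invariance of the problem is exploited.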
This algorithm, and implementations of the examples presented here, are publicly
available as a Python code package at https://github.com/JoeMcEwen/FAST-PT.
Chapter 4: Dynamics of Quadruple Systems Composed of Two Binaries: Stars, White Dwarfs, and Implications for Ia Supernovae
In this chapter, I will present the full content of our paper Fang et al. (2018). The original abstract is given below:
• We investigate the long-term secular dynamics and Lidov-Kozai (LK) eccentricity oscillations of quadruple systems composed of two binaries at quadrupole
and octupole order in the perturbing Hamiltonian. We show that the fraction
of systems reaching high eccentricities is enhanced relative to triple systems,
over a broader range of parameter space. We show that this fraction grows with
time, unlike triple systems evolved at quadrupole order. This is fundamentally
because with their additional degrees of freedom, quadruple systems do not have
a maximal set of commuting constants of the motion, even in secular theory at
quadrupole order. We discuss these results in the context of star-star and white
dwarf-white dwarf (WD) binaries, with emphasis on WD-WD mergers and collisions
relevant to the Type Ia supernova problem. For star-star systems, we find
that more than 30% of systems reach high eccentricity within a Hubble time,
potentially forming triple systems via stellar mergers or close binaries. For WD-
WD systems, taking into account general relativistic and tidal precession and
dissipation, we show that the merger rate is enhanced in quadruple systems relative to triple systems by a factor of 3.5−10, and that the long-term evolution of quadruple systems leads to a delay-time distribution ∼1/t for mergers and collisions. In gravitational wave (GW)-driven mergers of compact objects, we
classify the mergers by their evolutionary patterns in phase space and identify
a regime in about 8% of orbital shrinking mergers, where eccentricity oscillations
occur on the general relativistic precession timescale, rather than the much
longer LK timescale. Finally, we generalize previous treatments of oscillations
in the inner binary eccentricity (evection) to eccentric mutual orbits. We assess
the merger rate in quadruple and triple systems and the implications for their
viability as progenitors of stellar mergers and Type Ia supernovae.
The original authors are: X. Fang, T. Thompson, C. Hirata.
4.1 Introduction
The dynamics of hierarchical triple systems has long been investigated. This is a special case of triple systems whose tertiary is at a large distance, serving as a perturber of the inner binary. At quadrupole order in the perturbing Hamiltonian, the eccentricity of the inner binary and the mutual inclination between the inner and outer orbit exhibit periodic oscillations, the Lidov-Kozai (LK) oscillations, on a timescale much longer than both of the orbital periods (e.g., Lidov, 1962; Kozai,
1962). Starting with a high tertiary inclination orbit, the initially low eccentricity of the inner binary can reach a very high value. Due to the high sensitivity of tidal interactions to the orbital eccentricity, this phenomenon has many potentially interesting astrophysical implications, such as inducing migration of planets and producing hot Jupiters (e.g., Wu & Murray, 2003; Fabrycky & Tremaine, 2007; Hamers et al., 2017), tight binaries, stellar mergers, and even blue stragglers (e.g., Mazeh & Shaham, 1979;
Eggleton & Kiseleva-Eggleton, 2001; Tokovinin et al., 2006; Perets & Fabrycky, 2009;
Shappee & Thompson, 2013; Antognini et al., 2014; Naoz & Fabrycky, 2014; Antonini
et al., 2016; Petrovich & Antonini, 2017). The LK mechanism and its higher-order
effects have also found applications in systems with compact objects due to the strong
eccentricity dependence of the general relativistic precession and gravitational wave
(GW) dissipation. It has been proposed that this mechanism could be relevant to
the evolution of intermediate-mass black holes in globular clusters (e.g., Miller &
Hamilton, 2002; Wen, 2003) and super-massive black holes in the centres of galaxies
(e.g., Blaes et al., 2002; Antonini & Perets, 2012; Naoz et al., 2013b; Stephan et al.,
2016; Hoang et al., 2017; Antonini et al., 2016).
There have also been many discussions of white dwarf (WD) mergers as candidate
progenitors of Type Ia supernovae (SNe Ia): in this “double-degenerate scenario”
(DDS), two WDs gradually lose their orbital energy and angular momentum via
GWs before merging with each other (e.g., Webbink, 1984; Iben & Tutukov, 1984).
However, the GW energy dissipation rate suggests that the WD binaries have to start
with a compact orbit (semi-major axis a < 0.01 AU) to merge within the Hubble time,
calling for a mechanism to produce compact WD binaries. One of the proposals is
that the orbit rapidly shrinks during the common envelope phase, as the removal of
the common envelope takes away lots of orbital energy. Although some results from
binary population synthesis models have shown the possibility of explaining the SN
Ia rate with the DDS, they depend on the modelling of the common envelope physics,
and recent calculations under-predict the short-time-delay SN Ia rate (e.g., Ruiter
et al., 2009).
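The requirement of a compact initial orbit quoted above can be illustrated with the standard circular-orbit inspiral time of Peters (1964), $t_{\rm merge} = \frac{5}{256}\frac{c^5 a^4}{G^3 m_1 m_2 (m_1+m_2)}$. The sketch below is only an order-of-magnitude illustration: the WD masses of 0.6 solar masses are an assumed example, eccentricity is ignored, and the function name is hypothetical.

```python
G_SI = 6.67430e-11        # m^3 kg^-1 s^-2
C_SI = 2.99792458e8       # m s^-1
M_SUN = 1.98847e30        # kg
AU = 1.495978707e11       # m
YEAR = 3.15576e7          # s (Julian year)

def gw_merger_time_circular(m1_msun, m2_msun, a_au):
    """Peters (1964) gravitational-wave inspiral time of a circular binary, in years."""
    m1 = m1_msun * M_SUN
    m2 = m2_msun * M_SUN
    a = a_au * AU
    t_seconds = 5.0 / 256.0 * C_SI**5 * a**4 / (G_SI**3 * m1 * m2 * (m1 + m2))
    return t_seconds / YEAR

t_example = gw_merger_time_circular(0.6, 0.6, 0.01)   # assumed 0.6 + 0.6 Msun pair at 0.01 AU
```

For this assumed pair the merger time comes out at several Gyr, and the steep $a^4$ scaling means that doubling the separation lengthens it sixteen-fold, well beyond a Hubble time.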
Given the uncertainties in the derived SN Ia rate from WD-WD binaries, it is
interesting to consider the role of triple systems, where gravitational dynamics could
lead to rapid gravitational wave driven mergers. Following work by Blaes et al. (2002)
and Miller & Hamilton (2002) in other contexts, Thompson (2011) showed that the
merger time for WD-WD binaries can be decreased by orders of magnitude by the
Lidov-Kozai (LK) eccentricity oscillations caused by the tertiary, and argued that
most SNe Ia may occur in hierarchical triple systems.
However, there are two potential problems with this hypothesis. The first is the
“inclination problem”: the LK oscillations that lead to a rapid WD-WD merger would
have already led to close encounters of the inner binary stars before they evolved into
WDs. Thus, although triple systems may play a role in forming the tight binaries that
eventually lead to WD-WD binaries through traditional common envelope evolution,
the tertiary would not participate in driving the merger of the two WDs per se.
The “eccentric LK mechanism” — octupole-order oscillations that can produce much
higher eccentricities than quadrupole-order LK oscillations when the components of
the inner binary have unequal masses (e.g., Ford et al., 2000; Lithwick & Naoz, 2011;
Katz et al., 2011; Naoz et al., 2013a) — could potentially exacerbate this issue by
driving more binaries to contact during stellar evolution (e.g., Perets & Fabrycky,
2009; Naoz & Fabrycky, 2014). In addition, Shappee & Thompson (2013) found
that mass loss can instigate the eccentric LK mechanism after the first WD forms,
potentially increasing the formation rate of tight WD-star systems. In an effort to
mitigate this issue, Antognini & Thompson (2016) investigated dynamical scattering
and flyby encounters as a way to generate high-inclination triple systems after WD binary formation, but significant uncertainty remains about the evolution of triple systems as their components evolve.
The second “rate problem” with the triple scenario is the same as that for the normal stellar binary channel for tight WD-WD binaries: it is unclear if the observed
SN Ia rate can be accommodated. In the triple scenario, in order to reach very high eccentricities, the initial inclinations are limited to a very narrow range in secular theory, potentially making it hard to explain the observed SN Ia rate. This issue was partially addressed by Katz & Dong (2012) who showed with N-body simulations that non-secular dynamics can produce “clean”, head-on collisions of WD-WD binaries in about 5% of moderately hierarchical triple systems. The possibility of such head- on collisions producing SNe was supported by Kushnir et al. (2013), who computed explosion models for colliding WDs. Piro et al. (2014) compared the expected Ia luminosity function in the collision scenario, finding that low-luminosity supernovae are preferred because of the observed strong peak in the WD mass function. In contrast with Katz & Dong (2012), Toonen et al. (2017) recently estimated the clean
WD-WD collision rate in triples and found it to be ∼2−3 orders of magnitude lower than the observed SN Ia rate, with an almost uniform delay-time distribution that is inconsistent with observations. The role of mergers (rather than collisions) in producing the observed rate and delay-time distribution has not yet been explored.
In this paper, we calculate the secular dynamics of hierarchical triple systems and quadruple systems composed of two binaries with an eye towards addressing these “inclination” and “rate” problems. In particular, Pejcha et al. (2013) showed that the fraction of quadruple systems reaching high-eccentricity (high-e) is greatly
enhanced compared to otherwise identical triple systems, potentially suggesting an increased rate for quadruples. However, because Pejcha et al. (2013) used full few-body dynamics, their investigation of the parameter space was necessarily limited.
Here, we derive the equations for the secular evolution of quadruple systems including a treatment of general relativity and tides. We show that quadruple systems exhibit irregular behaviour even at quadrupole order, in contrast to the regular LK oscillations in triples at the same order. We further show that the high-e fraction produced by quadruple systems is large and that it grows steadily in time, producing mergers or collisions of WD-WD binaries. We find that the WD-WD merger rate is ∼3.5−10 times larger than for triples, and that the majority of the mergers are highly eccentric inspirals or potentially collisions. The delay-time distribution for both quadruples and triples follows ∼$t^{-1}$. Given the relative fraction of observed triples and quadruples, these findings lead us to propose that quadruples may dominate WD-WD mergers.
We explore how the rate of mergers in quadruple systems depends on the WD masses, separation, and relative inclinations. We classify the mergers by their evolution patterns in phase space right before their orbits rapidly shrink, and identify ∼8% of mergers that experience a previously unidentified “precession oscillation” phase at the beginning of their orbital shrinking.
An important component of the problem for both triples and quadruples is the role of non-secular dynamics. In particular, rapid eccentricity oscillations occurring on the timescale of the mutual or outer orbit — “evection” — can cause large perturbations to the angular momentum of the inner binary while it is at high eccentricity (e.g.,
Ivanov et al., 2005; Katz & Dong, 2012; Antonini & Perets, 2012; Bode & Wegg, 2014;
Antognini et al., 2014). Previous treatments have either relied on fully dynamical
calculations or analytic expressions derived in limiting cases. We generalize these previous analytic investigations to arbitrary mutual eccentricity, and assess evection for the merger and collision rate of both triple and quadruple systems over the range of semi-major axis ratios we explore. We find that evection slightly enhances the merger rates, but may have a more substantial effect on the nature of the merger
(e.g., whether head-on collisions or gravitational wave-driven mergers).
The remainder of this paper is organized as follows. In §4.2, we discuss the secular effects we have considered in this work. In §4.3, we compare the secular results from quadruple systems to the triple systems and discuss the new features we find in quadruple systems that could lead to important astrophysical implications. Then, in §4.4, we apply our calculations to the “quadruple scenario” of WD mergers and show how it can shed light on the SN Ia rate puzzle. We discuss the role of evection in §4.5. Finally we summarize our results and discuss the caveats and limitations of our work in §4.6. We tabulate the coefficients in the octupole perturbation formula in Appendix C.1. Descriptions and tests of our secular code are presented in Appendix C.2. The physics of a “precession oscillation” phenomenon in some WD mergers is explained in Appendix C.3. Detailed calculations pertaining to evection are described in Appendix C.4.
4.2 Secular Theory
The “2+2” hierarchical quadruple system is composed of two binary systems, with their centres of mass $C_1$ and $C_2$ separated by $r$, as shown in Fig. 4.1. The mutual orbit has parameters: $a$, $i$, $e$, $g$, and $h$, representing the semi-major axis, the inclination between the orbit and the reference plane in the rest frame, the eccentricity, the argument of the periastron, and the longitude of the ascending node, as illustrated in Figure 4.2. Each of the (inner) binary systems has a much smaller orbit. The first one (we call it inner orbit A) is composed of masses $m_0$ and $m_1$ and separation $r_1$, and has orbital parameters: $a_1$, $i_1$, $e_1$, $g_1$, and $h_1$; while the second one (i.e., inner orbit B) is composed of masses $m_2$ and $m_3$ and separation $r_2$, and has parameters: $a_2$, $i_2$, $e_2$, $g_2$, and $h_2$. Note that $a_1, a_2 \ll a$, as defined by the “hierarchical” assumption. We also define the inclinations between the inner orbits and the mutual orbit as $i_A$ and $i_B$, respectively.
Figure 4.1: Illustration of a “2+2” hierarchical quadruple star system. Masses $m_0$ and $m_1$ form “inner binary A” with separation $r_1$, $m_2$ and $m_3$ form “inner binary B” with separation $r_2$, and their centres of mass orbit each other in the “mutual” orbit with separation $r$. We focus on systems where $r_1, r_2 \ll r$.
In this section, we introduce the secular effects we have considered, including up to octupole order in the expansion of the Hamiltonian (§§4.2.1, 4.2.2), general relativistic precession (§4.2.3) and dissipation (§4.2.5), and tidal precession (§4.2.4) and dissipation (§4.2.6). At the end (§4.2.7), we will discuss other effects that we have ignored and justify why they will not jeopardize our results. The role of the non-secular effect, evection, will be considered in §4.5.
4.2.1 Newtonian gravity and quadrupole order interactions
In the point-mass limit, the Hamiltonian of the quadruple system is given by
\begin{equation}
\mathcal{H} = -\frac{\mathcal{G} m_0 m_1}{2a_1} - \frac{\mathcal{G} m_2 m_3}{2a_2} + T_{\rm out} + V_{02} + V_{03} + V_{12} + V_{13} , \tag{4.1}
\end{equation}
where $T_{\rm out}$ is the kinetic energy of the mutual orbital motion and $V_{ij} = -\mathcal{G} m_i m_j / r_{ij}$ is the gravitational potential energy between two objects in different inner orbits. The first two terms on the right-hand side are equal to the total energy of the inner Keplerian orbits, where $\mathcal{G} = 4\pi^2\ {\rm AU}^3\, {\rm yr}^{-2}\, M_\odot^{-1}$ is Newton’s constant.
We can write the Hamiltonian of the system as an expansion in terms of $\alpha_1 \equiv a_1/a$ and $\alpha_2 \equiv a_2/a$. Keeping terms up to quadrupole order $\mathcal{O}(\alpha_1^2)$, $\mathcal{O}(\alpha_2^2)$, the Hamiltonian reduces to
\begin{align}
\mathcal{H} &= -\frac{\mathcal{G} m_0 m_1}{2a_1} - \frac{\mathcal{G} m_2 m_3}{2a_2} - \frac{\mathcal{G} m_A m_B}{2a} \nonumber\\
&\quad - \frac{\mathcal{G}}{2a} \left(\frac{a}{r}\right)^3 \left[ \alpha_1^2 S_1 \left(\frac{r_1}{a_1}\right)^2 (3\cos^2\Phi_1 - 1) + \alpha_2^2 S_2 \left(\frac{r_2}{a_2}\right)^2 (3\cos^2\Phi_2 - 1) \right] , \tag{4.2}
\end{align}
where $m_A = m_0 + m_1$ and $m_B = m_2 + m_3$ are the masses of the inner binaries; $S_1 = m_0 m_1 m_B / m_A$ and $S_2 = m_2 m_3 m_A / m_B$ are coefficients; and $\cos\Phi_1 \equiv \hat{\mathbf{r}}_1\cdot\hat{\mathbf{r}}$ and $\cos\Phi_2 \equiv \hat{\mathbf{r}}_2\cdot\hat{\mathbf{r}}$ represent angles between the separation vectors, with $\hat{\mathbf{r}}_1$ being the unit vector pointing from $m_0$ to $m_1$, $\hat{\mathbf{r}}_2$ pointing from $m_2$ to $m_3$, and $\hat{\mathbf{r}}$ pointing from $C_1$ to $C_2$. (See e.g., Harrington 1968 and Ford et al. 2000 for similar derivations.)
Figure 4.2: Illustration of the orbital elements. The orbital plane intersects the reference plane $\hat{x}$-$\hat{y}$ along the line of nodes, with the direction of the ascending node denoted by $\hat{\Omega}$. $h$ defines the longitude of the ascending node with respect to the reference plane, and $g$ defines the argument of periastron in the orbital plane. The angle between the orbital angular momentum $\mathbf{G}$ and the z-axis defines the inclination $i$, which is also the angle between the reference plane and the orbital plane.
We adopt Delaunay’s canonical angle variables: l1, l2, l (mean anomalies); g1, g2, g
(arguments of periastron); and h1, h2, h (longitudes of ascending node) for both of
the inner orbits (with subscripts “1” and “2” respectively) as well as the mutual orbit
(without a subscript). Their conjugate actions are related to the orbital elements via
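$$\begin{aligned}
L_1 &= m_0 m_1 \sqrt{\mathcal{G} a_1/m_A}, & G_1 &= L_1\sqrt{1-e_1^2}, & H_1 &= G_1 \cos i_1 , \\
L_2 &= m_2 m_3 \sqrt{\mathcal{G} a_2/m_B}, & G_2 &= L_2\sqrt{1-e_2^2}, & H_2 &= G_2 \cos i_2 , \\
L &= m_A m_B \sqrt{\mathcal{G} a/M}, & G &= L\sqrt{1-e^2}, & H &= G \cos i ,
\end{aligned} \tag{4.3}$$
where $M \equiv m_A + m_B$ is the total mass of the system (Murray & Dermott, 2000). The Hamiltonian can then be written as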
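$$\mathcal{H} = -\frac{\beta}{2L^2} - \frac{\beta_1}{2L_1^2} - \frac{\beta_2}{2L_2^2} - 8B_1 \frac{L_1^4}{L^6}\left(\frac{r_1}{a_1}\right)^2\left(\frac{a}{r}\right)^3 (3\cos^2\Phi_1 - 1) - 8B_2 \frac{L_2^4}{L^6}\left(\frac{r_2}{a_2}\right)^2\left(\frac{a}{r}\right)^3 (3\cos^2\Phi_2 - 1) , \tag{4.4}$$
where the coefficients are
$$\beta_1 = \mathcal{G}^2\frac{(m_0 m_1)^3}{m_A}, \quad \beta_2 = \mathcal{G}^2\frac{(m_2 m_3)^3}{m_B}, \quad \beta = \mathcal{G}^2\frac{(m_A m_B)^3}{M}, \quad B_1 = \mathcal{G}^2\frac{(m_A m_B)^7}{16 (m_0 m_1 M)^3}, \quad \text{and} \quad B_2 = \mathcal{G}^2\frac{(m_A m_B)^7}{16 (m_2 m_3 M)^3} . \tag{4.5}$$
Only the last two terms in Eq. (4.4) represent quadrupole order corrections due to the “monopole-quadrupole” interaction between the two inner orbits²⁰, i.e.,
$$\mathcal{H} = \mathcal{H}^{(\rm Kepler)}_{\rm mutual} + \mathcal{H}^{(\rm Kepler)}_1 + \mathcal{H}^{(\rm Kepler)}_2 + \mathcal{H}^{(\rm quad)}_1 + \mathcal{H}^{(\rm quad)}_2 , \tag{4.6}$$
where the Kepler terms do not contribute to the secular equations of motion since the short-period motions will be averaged out and the angles $l_1$, $l_2$, $l$ will become cyclic, leaving their conjugate momenta invariant.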
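As a quick consistency check on Eqs. (4.3)-(4.5), the following minimal sketch evaluates the Delaunay actions numerically and confirms that the Kepler terms of Eq. (4.4) reproduce the usual two-body energies, and that the quadrupole prefactor $8B_1L_1^4/L^6$ equals $\mathcal{G}S_1\alpha_1^2/(2a)$. The masses and semi-major axes are arbitrary illustrative values, not taken from the text, and the helper name `delaunay_L` is hypothetical.

```python
import math

G_NEWTON = 4.0 * math.pi**2  # Newton's constant in AU^3 yr^-2 Msun^-1


def delaunay_L(m_a, m_b, a):
    """Delaunay action L = m_a m_b sqrt(G a / (m_a + m_b)) of a two-body orbit (Eq. 4.3)."""
    return m_a * m_b * math.sqrt(G_NEWTON * a / (m_a + m_b))


# Illustrative (assumed) masses in Msun and semi-major axes in AU.
m0, m1, m2, m3 = 1.0, 0.8, 1.2, 0.5
a1, a2, a = 10.0, 15.0, 1000.0
mA, mB = m0 + m1, m2 + m3
M = mA + mB

L1 = delaunay_L(m0, m1, a1)   # inner orbit A
L = delaunay_L(mA, mB, a)     # mutual orbit (binaries treated as point masses)

# Coefficients of Eq. (4.5).
beta1 = G_NEWTON**2 * (m0 * m1)**3 / mA
beta = G_NEWTON**2 * (mA * mB)**3 / M
B1 = G_NEWTON**2 * (mA * mB)**7 / (16.0 * (m0 * m1 * M)**3)

# Kepler terms of Eq. (4.4) and the quadrupole prefactor of orbit A.
kepler_inner = -beta1 / (2.0 * L1**2)
kepler_mutual = -beta / (2.0 * L**2)
quad_prefactor = 8.0 * B1 * L1**4 / L**6

S1 = m0 * m1 * mB / mA
alpha1 = a1 / a

print(kepler_inner / (-G_NEWTON * m0 * m1 / (2.0 * a1)))          # ratio should be 1
print(kepler_mutual / (-G_NEWTON * mA * mB / (2.0 * a)))          # ratio should be 1
print(quad_prefactor / (G_NEWTON * S1 * alpha1**2 / (2.0 * a)))   # ratio should be 1
```

The same check applies to orbit B after swapping the labels, since Eqs. (4.3)-(4.5) are symmetric under the exchange of the two inner binaries.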
After averaging over the inner binary orbits and the mutual orbit, the quadrupole part $\mathcal{H}^{(\rm quad)}_1$ becomes
$$\begin{aligned}
\mathcal{H}^{(\rm quad)}_1 = -\frac{B_1 L_1^4}{8 G^3 L^3} \times \Big\{ & 3\sin^2 i \left[ 10 e_1^2 (3 + \cos 2i_1)\cos 2g_1 + 4(2 + 3e_1^2)\sin^2 i_1 \right] \cos 2\Delta h_1 \\
& + (1 + 3\cos 2i)\left[ (2 + 3e_1^2)(1 + 3\cos 2i_1) + 30 e_1^2 \cos 2g_1 \sin^2 i_1 \right] \\
& + 12 (2 + 3e_1^2 - 5e_1^2\cos 2g_1) \sin 2i \sin 2i_1 \cos\Delta h_1 \\
& + 120 e_1^2 \sin 2g_1 \sin 2i \sin i_1 \sin\Delta h_1 \\
& - 120 e_1^2 \sin 2g_1 \sin^2 i \cos i_1 \sin 2\Delta h_1 \Big\} ,
\end{aligned} \tag{4.7}$$
where $\Delta h_1 \equiv h_1 - h$, and $\mathcal{H}^{(\rm quad)}_2$ has the same form with the subscript “1” substituted with “2”.
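Because Eq. (4.7) is written in an arbitrary reference frame, a useful sanity check is that it depends only on the relative orientation of the two orbits. The sketch below, under stated assumptions, compares the curly-bracket factor of Eq. (4.7) with the frame-independent vector form $16\,[3(1-e_1^2)(\hat{j}\cdot\hat{n})^2 - 1 + 6e_1^2 - 15e_1^2(\hat{e}\cdot\hat{n})^2]$, where $\hat{j}$ and $\hat{e}$ are the angular momentum and periastron unit vectors of orbit A and $\hat{n}$ is the normal of the mutual orbit. The vector form is a standard way of writing the double-averaged quadrupole term and is used here only as an independent cross-check; it and all function names are not taken from the text.

```python
import math
import random


def quad_bracket(e1, g1, i1, i, dh1):
    """Curly-bracket factor of Eq. (4.7); angles in radians, dh1 = h1 - h."""
    e_sq = e1 * e1
    s_i, s_i1 = math.sin(i), math.sin(i1)
    term_a = 3.0 * s_i**2 * (10.0 * e_sq * (3.0 + math.cos(2 * i1)) * math.cos(2 * g1)
                             + 4.0 * (2.0 + 3.0 * e_sq) * s_i1**2) * math.cos(2 * dh1)
    term_b = (1.0 + 3.0 * math.cos(2 * i)) * ((2.0 + 3.0 * e_sq) * (1.0 + 3.0 * math.cos(2 * i1))
                                              + 30.0 * e_sq * math.cos(2 * g1) * s_i1**2)
    term_c = (12.0 * (2.0 + 3.0 * e_sq - 5.0 * e_sq * math.cos(2 * g1))
              * math.sin(2 * i) * math.sin(2 * i1) * math.cos(dh1))
    term_d = 120.0 * e_sq * math.sin(2 * g1) * math.sin(2 * i) * s_i1 * math.sin(dh1)
    term_e = -120.0 * e_sq * math.sin(2 * g1) * s_i**2 * math.cos(i1) * math.sin(2 * dh1)
    return term_a + term_b + term_c + term_d + term_e


def unit_normal(inc, node):
    """Unit vector along the orbital angular momentum."""
    return (math.sin(inc) * math.sin(node), -math.sin(inc) * math.cos(node), math.cos(inc))


def unit_periapsis(arg_peri, inc, node):
    """Unit vector pointing towards periastron."""
    cg, sg = math.cos(arg_peri), math.sin(arg_peri)
    ch, sh = math.cos(node), math.sin(node)
    return (cg * ch - sg * math.cos(inc) * sh,
            cg * sh + sg * math.cos(inc) * ch,
            sg * math.sin(inc))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def quad_bracket_vector(e1, g1, i1, h1, i, h):
    """Frame-independent form of the same factor (assumed cross-check, see lead-in)."""
    n_hat = unit_normal(i, h)
    jn = dot(unit_normal(i1, h1), n_hat)
    en = dot(unit_periapsis(g1, i1, h1), n_hat)
    return 16.0 * (3.0 * (1.0 - e1**2) * jn**2 - 1.0 + 6.0 * e1**2 - 15.0 * e1**2 * en**2)


# Compare the two forms for a few random configurations.
rng = random.Random(0)
max_diff = 0.0
for _ in range(1000):
    e1 = rng.uniform(0.0, 0.99)
    g1, h1, h = (rng.uniform(0.0, 2.0 * math.pi) for _ in range(3))
    i1, i = (rng.uniform(0.0, math.pi) for _ in range(2))
    diff = abs(quad_bracket(e1, g1, i1, i, h1 - h) - quad_bracket_vector(e1, g1, i1, h1, i, h))
    max_diff = max(max_diff, diff)
print(max_diff)  # expected to be at round-off level
```

Setting $i = 0$ in `quad_bracket` removes every term except the second line of Eq. (4.7), which is the familiar triple-system expression in the frame aligned with the outer orbit.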
²⁰ The Hamiltonian of each inner orbit part is expanded into a series of terms, where the monopole moment is the Kepler term, and the dipole moment vanishes since it is taken around the centre of mass. Thus, the lowest order perturbation terms are the two “monopole-quadrupole” terms and the next order are “monopole-octupole” terms, which will be discussed in §4.2.2. At higher order there would be two “monopole-hexadecapole” terms and one “quadrupole-quadrupole” term, the latter of which could produce orbital resonances, but we will leave these terms to a future work.
The equations of motion can be obtained by expressing the averaged Hamiltonian in terms of the canonical variables listed in Eq. (4.3), and applying Hamilton’s equations. Note that the averaged quadrupole order Hamiltonian²¹ depends on the longitudes of the ascending nodes only through $\Delta h_1$ and $\Delta h_2$, due to the conservation of the projected total angular momentum $H_{\rm tot} \equiv H_1 + H_2 + H$, which leads to a simplification of the equations of motion, i.e.,
$$\dot{H}^{(\rm quad)} = -\dot{H}^{(\rm quad)}_1 - \dot{H}^{(\rm quad)}_2 . \tag{4.8}$$

4.2.2 Octupole order interactions

The octupole-monopole interaction between inner orbit A and B vanishes when
stars in the inner binary A have the same mass, due to the parity symmetry of the
gravitational potential of the inner binary A. (This holds for any odd-$\ell$ moment of
binary A.)
Similar to the treatment of the quadrupole in §4.2.1, the octupole order gives two additional terms in the Hamiltonian, $\mathcal{H}^{(\rm oct)}_1$ and $\mathcal{H}^{(\rm oct)}_2$. We focus on the first octupole term, which corresponds to the inner orbit A interacting with the mutual orbit ($\mathcal{H}^{(\rm oct)}_2$ is similar). We have that
$$\mathcal{H}^{(\rm oct)}_1 = -2C_1 \frac{L_1^6}{L^8}\left(\frac{r_1}{a_1}\right)^3\left(\frac{a}{r}\right)^4 \left(5\cos^3\Phi_1 - 3\cos\Phi_1\right) , \tag{4.9}$$
where²²
$$C_1 \equiv \mathcal{G}^2 \frac{(m_A m_B)^9 (m_0 - m_1)}{4 (m_0 m_1)^5 M^4} . \tag{4.10}$$

²¹ This is true for any order.
²² Note that $C_1$ here corresponds to $\beta_3$ in, e.g., Ford et al. (2000) or Naoz et al. (2013a), for triple cases.
After double-averaging, we can write the Hamiltonian in the form of
$$\mathcal{H}^{(\rm oct)}_1 = \sum_{m=-3}^{3} \lambda_1 f(G) \left[ \mathcal{A}^{(m)} \cos mh + \mathcal{B}^{(m)} \sin mh \right] \times \left[ \mathcal{A}^{(m)}_1 \cos mh_1 + \mathcal{B}^{(m)}_1 \sin mh_1 \right] , \tag{4.11}$$
where the prefactors are defined as $\lambda_1 \equiv 15 C_1 L_1^6 / 2048 L^4$ and $f(G) \equiv \sqrt{L^2 - G^2}/G^5$. In the Fourier series, the coefficients $\mathcal{A}^{(m)}$ and $\mathcal{B}^{(m)}$ are functions of $g$ and $i$ only, and $\mathcal{A}^{(m)}_1$, $\mathcal{B}^{(m)}_1$ are functions of $e_1$, $g_1$ and $i_1$ only. Their explicit expressions are listed in Appendix C.1. Note that for any $m$, we have the relation
$$\mathcal{A}^{(m)} = \mathcal{B}^{(-m)} , \tag{4.12}$$
which is guaranteed by the fact that the potential is real and rotationally invariant.²³
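The corresponding contribution to the equations of motion is easier to evaluate in this “separated” form. It is not necessary to rewrite the Hamiltonian in Eq. (4.11) solely in the canonical variables. Instead, we can use the Jacobian to show
$$\begin{aligned}
\dot{g}^{(\rm oct)}_1 &= -\frac{G_1}{e_1 L_1^2}\frac{\partial \mathcal{H}^{(\rm oct)}_1}{\partial e_1} + \frac{1}{G_1 \tan i_1}\frac{\partial \mathcal{H}^{(\rm oct)}_1}{\partial i_1} , \\
\dot{h}^{(\rm oct)}_1 &= -\frac{1}{G_1 \sin i_1}\frac{\partial \mathcal{H}^{(\rm oct)}_1}{\partial i_1} , \quad \text{and} \\
\dot{h}^{(\rm oct)} &= -\frac{1}{G \sin i}\frac{\partial \mathcal{H}^{(\rm oct)}_1}{\partial i} - \frac{1}{G \sin i}\frac{\partial \mathcal{H}^{(\rm oct)}_2}{\partial i} ,
\end{aligned} \tag{4.13}$$
while the other equations keep the canonical form. The additional equation is
$$\dot{G}^{(\rm oct)} = -\frac{\partial \mathcal{H}^{(\rm oct)}_1}{\partial g} - \frac{\partial \mathcal{H}^{(\rm oct)}_2}{\partial g} . \tag{4.14}$$
Note that this is non-zero, so whereas the magnitude of the angular momentum of the mutual orbit is conserved at quadrupole order, it is not conserved at octupole order.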
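The Jacobian factors in Eq. (4.13) follow directly from the definitions in Eq. (4.3). As a brief sketch of the chain rule, written for the inner orbit A with $L_1$ held fixed (the mutual-orbit equation is analogous, with $G$, $H$ and $i$ in place of $G_1$, $H_1$ and $i_1$):

```latex
\begin{aligned}
e_1 = \sqrt{1 - G_1^2/L_1^2} \quad &\Rightarrow \quad
  \frac{\partial e_1}{\partial G_1} = -\frac{G_1}{e_1 L_1^2} , \\
\cos i_1 = \frac{H_1}{G_1} \quad &\Rightarrow \quad
  \frac{\partial i_1}{\partial G_1} = \frac{1}{G_1 \tan i_1} , \qquad
  \frac{\partial i_1}{\partial H_1} = -\frac{1}{G_1 \sin i_1} , \\
\dot{g}_1 = \frac{\partial \mathcal{H}}{\partial G_1}
  &= \frac{\partial \mathcal{H}}{\partial e_1}\frac{\partial e_1}{\partial G_1}
   + \frac{\partial \mathcal{H}}{\partial i_1}\frac{\partial i_1}{\partial G_1} , \qquad
\dot{h}_1 = \frac{\partial \mathcal{H}}{\partial H_1}
   = \frac{\partial \mathcal{H}}{\partial i_1}\frac{\partial i_1}{\partial H_1} .
\end{aligned}
```

Substituting the first two lines into the third reproduces the first two relations of Eq. (4.13).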
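²³ One can rewrite $\mathcal{H}^{(\rm oct)}_1$ as $\sum_{m=-\ell}^{\ell} \langle \mathcal{O}_{\ell m}^{*} \rangle \langle \mathcal{O}_{1,\ell m} \rangle e^{im\Delta h_1}$, where $\mathcal{O}_{\ell m}$, $\mathcal{O}_{1,\ell m}$ are moments from the mutual orbit and inner orbit A, respectively, and $\ell = 3$.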
4.2.3 First-order Post-Newtonian (1PN) corrections

The general relativistic (GR) corrections to a binary star orbit can be expanded in inverse powers of $c$. Expanding the Hamiltonian of the binary system in the corresponding metric up to order $1/c^2$ gives the so-called 1PN correction, which sources the leading part of the GR precession.
In the centre-of-mass frame, the 1PN Hamiltonian correction of the orbit A is given by (e.g., Damour, 2014)
$$c^2 \mathcal{H}^{(\rm 1PN)}_1 = \frac{\mu_1}{8}(3\nu_1 - 1)\frac{p_0^4}{\mu_1^4} - \left[ (3 + \nu_1)\frac{p_0^2}{\mu_1^2} + \frac{\nu_1}{\mu_1^2} p_{r0}^2 \right] \frac{\mathcal{G}\mu_1 m_A}{2r_1} + \frac{\mathcal{G}^2 \mu_1 m_A^2}{2r_1^2} , \tag{4.15}$$
where $p_0$ is the momentum of star “0” relative to the centre of mass of the binary, and the radial component is defined as $p_{r0} \equiv -\mathbf{p}_0 \cdot \hat{r}_1$. The reduced mass is $\mu_1 \equiv m_0 m_1/m_A$ and we use the mass parameter
$$\nu_1 \equiv \frac{\mu_1}{m_A} = \frac{m_0 m_1}{m_A^2} \le \frac{1}{4} . \tag{4.16}$$
After averaging over the orbit and dropping the constant terms (since they do not affect the equations of motion), we obtain the effective averaged 1PN Hamiltonian
$$\mathcal{H}^{(\rm 1PN,eff)}_1 = -\frac{3\mathcal{G}^2 \mu_1 m_A^2 L_1}{c^2 a_1^2 G_1} , \tag{4.17}$$
which leads to an additional orbital precession
$$\dot{g}^{(\rm 1PN)}_1 = \frac{\partial \mathcal{H}^{(\rm 1PN,eff)}_1}{\partial G_1} = \frac{3(\mathcal{G} m_A)^{3/2}}{c^2 (1 - e_1^2) a_1^{5/2}} . \tag{4.18}$$
The expression for the inner orbit B is similar.
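To give a feeling for the size of Eq. (4.18), the sketch below evaluates the 1PN precession rate in the units used here (AU, yr, $M_\odot$). The value of $c$ in AU/yr is an approximate conversion, and the function name is hypothetical. Mercury's orbit is included only as a familiar sanity check (about 43 arcsec per century); the second call uses an inner binary of two solar-mass stars with $a_1 = 10$ AU at $e_1 = 0.9972$, values that appear in the examples later in this chapter.

```python
import math

G_NEWTON = 4.0 * math.pi**2   # AU^3 yr^-2 Msun^-1
C_LIGHT = 63241.077           # speed of light in AU/yr (approximate conversion)


def gr_precession_rate(m_binary, a, e):
    """1PN periastron precession rate of Eq. (4.18), in rad/yr.

    m_binary: total mass of the binary [Msun]; a: semi-major axis [AU]; e: eccentricity.
    """
    return 3.0 * (G_NEWTON * m_binary)**1.5 / (C_LIGHT**2 * (1.0 - e * e) * a**2.5)


# Sanity check: Mercury around the Sun, converted to arcsec per century.
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
mercury = gr_precession_rate(1.0, 0.3871, 0.2056) * 100.0 * ARCSEC_PER_RAD

# Inner binary of two solar-mass stars at high eccentricity.
rate_inner = gr_precession_rate(2.0, 10.0, 0.9972)

print(f"Mercury: {mercury:.1f} arcsec/century")
print(f"Inner binary: {rate_inner:.2e} rad/yr")
```

Because the rate scales as $(1-e_1^2)^{-1}$, GR precession only becomes significant for such wide binaries near the eccentricity maxima.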
4.2.4 Tidal precession

In the case of stars approaching each other during a close periastron passage, the point-mass assumption is no longer a good approximation, and the initially spherical stars are deformed by the tidal forces exerted by their companions. This leads to a correction in their gravitational potential, and hence in their Hamiltonian.

Let $R_i$ be the radius of star $m_i$. Star $m_0$ develops a quasi-static quadrupole moment $\sim k_0 m_1 R_0^5/r_1^3$ (Blanchet, 2014), where $r_1$ is the distance between the two stars and $k_0$ is the dimensionless Love number of the star²⁴, which depends on its internal structure (and similarly for star $m_1$ with $k_1$). The resulting Hamiltonian correction for the orbit A is given by
$$\mathcal{H}^{(\rm tide)}_1 = -\frac{\mathcal{G}}{r_1^6}\left(m_0^2 k_1 R_1^5 + m_1^2 k_0 R_0^5\right) . \tag{4.19}$$
The orbit average is
$$\mathcal{H}^{(\rm tide)}_1 = -\frac{\mathcal{G}}{a_1^6 (1 - e_1^2)^{9/2}}\left(m_0^2 k_1 R_1^5 + m_1^2 k_0 R_0^5\right)\left(1 + 3e_1^2 + \frac{3}{8}e_1^4\right) , \tag{4.20}$$
which leads to an additional precession rate (Wu & Murray, 2003; Fabrycky & Tremaine, 2007):
$$\dot{g}^{(\rm tide)}_1 = \frac{15(\mathcal{G} m_A)^{1/2}}{a_1^{13/2}(1 - e_1^2)^5}\left(\frac{m_0}{m_1}k_1 R_1^5 + \frac{m_1}{m_0}k_0 R_0^5\right)\left(1 + \frac{3}{2}e_1^2 + \frac{1}{8}e_1^4\right) . \tag{4.21}$$

²⁴ The tidal Love number, associated with the quadrupole moment, is usually denoted as $k_2$. Here we drop the subscript “2” for simplicity. We take $k = 0.01$ for WDs (Prodan & Murray, 2012) and $k = 0.0138$ for main-sequence stars (Claret, 1995; Lanza et al., 2011).

4.2.5 Gravitational wave dissipation

Due to gravitational wave emission, the orbits gradually lose energy and angular momentum. The orbit-averaged dissipation rates are given by Peters (1964). Converted into our notation, the relevant equations of motion for the inner orbit A are
$$\begin{aligned}
\frac{\dot{L}^{(\rm GW)}_1}{L_1} &= -\frac{32}{5}\frac{\mathcal{G}^3 m_0 m_1 m_A}{c^5 a_1^4 (1 - e_1^2)^{7/2}}\left(1 + \frac{73}{24}e_1^2 + \frac{37}{96}e_1^4\right) , \\
\dot{G}^{(\rm GW)}_1 &= -\frac{32}{5}\frac{\mathcal{G}^{7/2} (m_0 m_1)^2 m_A^{1/2}}{c^5 a_1^{7/2} (1 - e_1^2)^2}\left(1 + \frac{7}{8}e_1^2\right) , \quad \text{and} \\
\dot{H}^{(\rm GW)}_1 &= \dot{G}^{(\rm GW)}_1 \cos i_1 .
\end{aligned} \tag{4.22}$$
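The rates in Eq. (4.22) can be evaluated directly. The sketch below is a minimal illustration in the same units (AU, yr, $M_\odot$); the function names are hypothetical and the value of $c$ is an approximate conversion. It also uses the fact that for a circular orbit $L_1 \propto a_1^{1/2}$, so that $a_1^4$ decays linearly in time and the merger time is $1/(8|\dot{L}_1/L_1|)$, which reduces to the familiar circular-orbit result $5c^5a_1^4/(256\,\mathcal{G}^3 m_0 m_1 m_A)$.

```python
import math

G_NEWTON = 4.0 * math.pi**2   # AU^3 yr^-2 Msun^-1
C_LIGHT = 63241.077           # speed of light in AU/yr (approximate conversion)


def gw_rates(m0, m1, a1, e1):
    """Orbit-averaged GW rates of Eq. (4.22).

    Returns (Ldot_1 / L_1 in 1/yr, Gdot_1 in Msun AU^2 / yr^2).
    """
    m_a = m0 + m1
    x = 1.0 - e1 * e1
    ldot_over_l = (-32.0 / 5.0 * G_NEWTON**3 * m0 * m1 * m_a
                   / (C_LIGHT**5 * a1**4 * x**3.5)
                   * (1.0 + 73.0 / 24.0 * e1**2 + 37.0 / 96.0 * e1**4))
    gdot = (-32.0 / 5.0 * G_NEWTON**3.5 * (m0 * m1)**2 * math.sqrt(m_a)
            / (C_LIGHT**5 * a1**3.5 * x**2)
            * (1.0 + 7.0 / 8.0 * e1**2))
    return ldot_over_l, gdot


def circular_merger_time(m0, m1, a1):
    """Merger time [yr] of a circular binary: a^4 decays linearly, so t = 1 / (8 |Ldot/L|)."""
    ldot_over_l, _ = gw_rates(m0, m1, a1, 0.0)
    return 1.0 / (8.0 * abs(ldot_over_l))


# Illustrative (assumed) example: two 0.6 Msun WDs on a circular orbit with a1 = 0.01 AU.
t_merge = circular_merger_time(0.6, 0.6, 0.01)
print(f"merger time: {t_merge:.2e} yr")
```

At $e_1 = 0$ the two rates are consistent with $G_1 = L_1$, i.e. $\dot{G}_1 = L_1 (\dot{L}_1/L_1)$, which provides a simple check of the exponents.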
4.2.6 Tidal dissipation

Tidal dissipation is much more complicated due to the existence of various types of tidal interaction mechanisms. Broadly, tides are categorized into “equilibrium tides” and “dynamical tides.”

In the “equilibrium tide” models, the star is deformed to be roughly in equilibrium with the time-dependent potential of the system, and the viscosity of the star dissipates the energy in the motion of the tides (e.g., Darwin, 1880; Alexander, 1973; Hut, 1980, 1981, 1982; Eggleton et al., 1998). Eventually, the orbit is circularized and the spins of the stars are aligned with the orbital axis. However, the tidal dissipation rate via this channel is very small in well-separated binaries. The circularization timescale can be estimated by (e.g., Hut, 1981)
$$\tau_e \equiv -\frac{e_1}{\dot{e}_1} \sim \frac{R_0^3}{\mathcal{G} m_A \tau}\left(\frac{a_1}{R_0}\right)^8 \sim \left(\frac{a_1}{R_0}\right)^5 \frac{P_1}{\tau} P_1 \sim \frac{Q}{n_1}\left(\frac{a_1}{R_0}\right)^5 > \frac{10^{17}}{n_1} , \tag{4.23}$$
where $\tau$ is the time lag introduced by tidal dissipation and $P_1$ and $n_1$ are the period and mean motion of inner orbit A. The tidal $Q \sim P_1/\tau$ is of order $10^7$ for C/O WDs (e.g., Piro, 2011; Burkart et al., 2013). The ratio of the orbital size to the stellar radius, $a_1/R_0$, is assumed to be larger than 100, since the binaries are assumed to be well separated initially. The result shows that the circularization timescale due to equilibrium tides is longer than $\sim 10^{17}$ inner orbital periods, i.e., much longer than the Hubble time for $a_1 \sim$ AU. Thus, we neglect dissipation via equilibrium tides.
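The order-of-magnitude estimate of Eq. (4.23) is easy to reproduce. In the sketch below, $Q = 10^7$ and $a_1/R_0 = 100$ are the values quoted above, while the binary mass ($1.2\,M_\odot$, e.g. two $0.6\,M_\odot$ WDs), the separation $a_1 = 1$ AU, and the approximate Hubble time of $1.4\times10^{10}$ yr are assumptions made only for illustration.

```python
import math

G_NEWTON = 4.0 * math.pi**2   # AU^3 yr^-2 Msun^-1
HUBBLE_TIME_YR = 1.4e10       # approximate value, assumed for the comparison


def circularization_time(tidal_q, a_over_r, a1, m_binary):
    """Equilibrium-tide circularization timescale, tau_e ~ (Q / n1) (a1 / R0)^5, in yr."""
    n1 = math.sqrt(G_NEWTON * m_binary / a1**3)   # mean motion in rad/yr
    return tidal_q / n1 * a_over_r**5


tau_e = circularization_time(tidal_q=1e7, a_over_r=100.0, a1=1.0, m_binary=1.2)
print(f"tau_e ~ {tau_e:.1e} yr, i.e. {tau_e / HUBBLE_TIME_YR:.1e} Hubble times")
```

Since $a_1/R_0 = 100$ is a lower bound and the timescale scales as the fifth power of this ratio, the estimate is a conservative lower limit.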
When one of the inner orbits is at high eccentricity, the tidal dissipation via dynamical tides (e.g., Zahn, 1975; Fabian et al., 1975; McMillan et al., 1987; Goldreich & Nicholson, 1989; Kochanek, 1992) may become dominant, especially when the tidal capture mechanism proposed by Fabian et al. (1975) occurs. During the close encounter near periastron, the time-dependent tidal forces excite non-radial oscillation modes in the stars and transfer energy from the orbit into stellar oscillations. As a consequence, the semi-major axis gradually decays while the orbit is circularized. Press & Teukolsky (1977) derived a general formula for this energy transfer rate during a close periastron passage in the parabolic limit, and numerically computed the results for a polytropic stellar model with index n = 3, which is appropriate for massive stars (approximately constant entropy, radiation pressure dominated) or WDs near the Chandrasekhar limit (i.e., relativistic electron gas). For low mass main sequence stars (approximately fully convective monatomic gas) or normal WDs (i.e., non-relativistic degenerate electron gas), n = 3/2 is a better approximation (e.g., Gingold & Monaghan, 1980; Giersz, 1986). In the work presented here, we implemented the fitting formula provided in Appendix B of Giersz (1986).

Our approach to tidal excitation at periastron assumes that the excited modes of the stars decay via either linear or non-linear damping before the next periastron passage, so that they do not then feed energy back into the orbit. If this turns out not to be the case for a given system, the next step would be an analysis of coupling the orbit to the dominant modes of the star (see e.g., Vick & Lai 2017 for a recent exploration of the possible dynamics²⁵).

4.2.7 Spin
We neglect the spins of the stars because tidal effects dominate at high eccentricities. To show this, we compare the precession rate due to the rotational (oblate spheroid) deformation of the stars to the precession rate caused by tidal deformation, which contains many of the same factors. Wu & Murray (2003) provide the rotational precession rate for equatorial orbits in the $m_1 \ll m_0$ case:
$$\dot{g}^{(\rm rot)}_1 = n_1 \frac{1}{2}\frac{k_1}{(1 - e_1^2)^2}\left(\frac{\Omega_1}{n_1}\right)^2 \frac{m_0}{m_1}\left(\frac{R_1}{a_1}\right)^5 , \tag{4.24}$$
where $n_1$ is the mean motion of inner orbit A and $\Omega_1$ is the rotation angular frequency of the star $m_1$. In general there is a similar term corresponding to the distortion of the star $m_0$ due to its companion. At high $e$, the ratio between the rotational precession rate (Eq. 4.24) and the tidal precession rate (Eq. 4.21) is estimated by
$$\frac{\dot{g}^{(\rm rot)}_1}{\dot{g}^{(\rm tide)}_1} \sim \frac{\Omega_1^2}{n_1^2}(1 - e_1^2)^3 \sim \frac{\Omega_1^2}{\dot{f}_p^2} , \tag{4.25}$$
where $\dot{f}_p$ is the orbital frequency at periastron. Since the tidal effects are only important at very high $e$, $\Omega_1$ can be orders of magnitude smaller than $\dot{f}_p$. For tidal effects to be important in stellar binaries, we take $e_1 = 0.997$ and an orbital period $P_1 = 10$ yr; then $\dot{f}_p^{-1} \sim 0.6$ day, which is much smaller than the rotation period of most solar-like stars ($\sim 24$ days). For WDs, whose radii are about $10^{-2}$ times that of the Sun, we take $1 - e_1 = 0.003 \times 10^{-2}$ and the same orbital period; then $\dot{f}_p^{-1} \sim 50$ s, still smaller than the typical WD rotation period, i.e., $\sim 10^2 - 10^3$ s (e.g., Kawaler, 2004). For these reasons, we can neglect the stellar spins.
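The two periastron timescales quoted above can be reproduced with a one-line estimate. The sketch below assumes $\dot{f}_p^{-1} \approx P_1 (1 - e_1)^{3/2}$, i.e. the orbital period rescaled to the periastron distance with order-unity factors such as $2\pi$ dropped; this normalization is an assumption chosen because it matches the quoted numbers, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86400.0
DAYS_PER_YEAR = 365.25


def periastron_timescale(period, one_minus_e):
    """Rough periastron passage timescale, P (1 - e)^{3/2}, in the units of `period`."""
    return period * one_minus_e**1.5


period_days = 10.0 * DAYS_PER_YEAR

# Main-sequence binary: e1 = 0.997.
star_days = periastron_timescale(period_days, 0.003)

# WD binary: 1 - e1 = 0.003 x 10^-2, same orbital period.
wd_seconds = periastron_timescale(period_days * SECONDS_PER_DAY, 0.003e-2)

print(f"main-sequence stars: {star_days:.2f} day")
print(f"WDs: {wd_seconds:.0f} s")
```

The ratio of the two timescales is simply $(10^{-2})^{3/2} = 10^{-3}$, reflecting the hundredfold smaller periastron distance needed for tides to matter in WDs.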
²⁵ Our assumption is equivalent to Eq. (26) of Vick & Lai (2017).
During the close passage, WDs can be spun up by dynamical tides due to the angular momentum transfer associated with energy injection. Since we used a non-spinning calculation of the tidal excitation during the encounter, we have to check that the WD rotation velocity remains small compared to the pattern speed of the excited modes (mainly the f-modes). During each passage, the energy injected into some oscillation mode (with frequency $\omega$, azimuthal number $m$, pattern frequency $\Omega_p = \omega/m$) is of order $\Delta E_{\rm mode} \sim \mathcal{G} m_0 m_1 \Delta(1/a_1)$, corresponding to an angular momentum change $\Delta G_{\rm mode} = \Delta E_{\rm mode}/\Omega_p$, hence a spin angular velocity change of star “1” by $\Delta\Omega_1 \sim \Delta G_{\rm mode}/(m_1 R_1^2)$. Thus, we have
$$\frac{\Delta\Omega_1}{\Omega_p} \sim \frac{\mathcal{G} m_0 \Delta(1/a_1)}{(R_1 \Omega_p)^2} \sim \frac{R_1}{a_1}\frac{\Delta a_1}{a_1} \ll \frac{\Delta a_1}{a_1} , \tag{4.26}$$
where we have used in the second step that the pattern speed for the f-mode is of order the Keplerian speed $[\mathcal{G} m_0/R_1^3]^{1/2}$. We conclude that in the time it takes to dissipate the orbital energy, the WD is spun up to a speed $\ll \Omega_p$, and it is indeed safe to neglect its spin.
4.2.8 Non-secular effects
Recent work has shown that the “double-averaging” of the Hamiltonian can fail to describe the long-term evolution of LK cycles in moderately hierarchical systems (e.g., Katz & Dong, 2012; Bode & Wegg, 2014; Antognini et al., 2014; Luo et al., 2016b). The failure of double averaging has been historically important in the problem of lunar motion, where the ratio of outer to inner periods is $P_{\rm Sun-Earth}/P_{\rm Earth-Moon} \sim 12$. For example, the rate of precession of the Moon’s perigee due to solar perturbations is roughly twice that predicted by double-averaging (e.g., Bodenmann, 2010). We will discuss non-secular effects in §4.5, with a particular emphasis on evection (a short-term variation of the inner binary’s eccentricity) due to its potential impact on close encounters.

4.3 Secular Evolution of Quadruple versus Triple Systems
The evolution of quadruple systems is generally much more irregular than that of triple systems (e.g., Hamers et al., 2015; Vokrouhlický, 2016; Hamers, 2017; Hamers & Lai, 2017). To explore the secular evolution of these systems, we wrote our own secular code, which is described and tested in Appendix C.2. In this section, we first use three special systems with increasing complexity to show the qualitatively different evolution patterns of quadruple systems from that of triple systems (§4.3.1). Then, we run systems with random orientations and highlight some important features of the evolution in quadruple systems (§4.3.2-4.3.6), which illuminate our explorations of the astrophysical implications presented in the next section. Finally, we run systems with orbital sizes and shapes sampled from given distributions, confirming our results over a large range of parameter space (§4.3.7).

Even when taking only the quadrupole order terms in the Hamiltonian and ignoring any other effects such as GR and tides, the additional degrees of freedom introduced by the second inner binary make the evolution of the whole system irregular. It is well-known that in the secular + quadrupole approximation, the triple problem is integrable (see e.g., Harrington 1968, but also the discussion in §4.3 of Naoz et al. 2013a). This is because in the orbit-averaged problem, where each orbit has 2 non-trivial degrees of freedom²⁶, the triple system has 4 degrees of freedom and 4 commuting constants of the motion: the perturbation Hamiltonian $\mathcal{H}^{(\rm quad)}_1$; the z-angular momentum $H + H_1$; the squared total angular momentum²⁷ $G_{\rm tot}^2$; and the outer angular momentum $G$. The first three of these are conserved due to time and rotational symmetry, and the last is due to the accidental axisymmetry of the quadrupolar tidal field of a Keplerian orbit. The fourth star adds two degrees of freedom but no new commuting constants of the motion. Since the additional precession and dissipation effects are only important at high $e$, we can safely ignore them at first and get a general understanding of how quadruple systems could behave differently from triple systems before the high eccentricities are reached. In summary, we find a much enhanced high-e fraction in quadruple systems compared to their triple limit. This result holds for different mass ratios, orbital sizes and initial shapes.

4.3.1 Examples
In this subsection, we explore three types of systems: (1) triple systems (§4.3.1); (2) [Star-Planet]-[Star-Star] systems (§4.3.1); (3) “4-Star” systems (§4.3.1). In each case, we will only include their secular effects from the Hamiltonian expansion up to octupole order and ignore the GR and tidal effects.

²⁶ We count an angle and its conjugate action as a single degree of freedom, as usual in Hamiltonian mechanics.
²⁷ The explicit expression in terms of actions and angles can be built from the law of cosines: $G_{\rm tot}^2 = G^2 + G_1^2 + 2\left[HH_1 + \sqrt{(G^2 - H^2)(G_1^2 - H_1^2)}\cos(h_1 - h)\right]$.

Elements    Inner Orbit    Outer Orbit
e           0.1            0.3
a           10 AU          1000 AU
i           50°            10°
g           0              0
h           0              180°

Table 4.1: The initial orbital elements of a hierarchical triple system consisting of three 1 M⊙ stars, discussed in §4.3.1. The initial inclination between the inner and outer orbit is 60°.

Triple systems
Hierarchical triple systems have a rather regular secular evolution. Here we assume the system consists of three 1 M⊙ stars with initial orbital elements listed in Table 4.1. The eccentricities of the inner and outer orbit and their mutual inclination are shown in Figure 4.3. The periodic oscillations of the inner eccentricity and the inclination are due to the quadrupole order Hamiltonian, which leads to LK oscillations. The oscillations are in antiphase due to the conservation of the total angular momentum. The octupole order effect vanishes since the inner binary stars have equal masses. As a result, the outer orbit eccentricity is unchanged.

[Star-Planet]-[Star-Star] systems
The simplest non-trivial quadruple system is the [Star-Planet]-[Star-Star] system, i.e., adding a planet (nearly a test particle) to the triple stellar system. We take the [Star-Planet] pair as the inner orbit A and the stellar pair as B. The stars are solar-mass and the planet has one Jupiter mass, i.e., 0.001 M⊙. The initial orbital elements are listed in Table 4.2. The eccentricities of the two inner orbits and outer orbit and their mutual inclinations are shown in Figure 4.4. Since the planet mass is negligible compared to the stars, the orbital evolution of the stellar binary is expected to behave like that in a triple stellar system, i.e., exhibiting the regular LK oscillation as in §4.3.1. However, the planet evolves rather irregularly, due to the fact that its “Kozai action” is not constant even at the test particle limit (TPL) and quadrupole order (e.g., Hamers, 2017; Hamers & Lai, 2017) (see §C.2.3 for the conservation of Kozai action in triple systems).

[Plot: upper panel, eccentricity e of the inner and outer orbits; lower panel, inclination [deg]; horizontal axis, t [Myr].]
Figure 4.3: The evolution of the triple system discussed in §4.3.1. The upper panel shows the eccentricities of the inner and outer orbits, while the lower panel shows the inclination between the inner and outer orbits. The system exhibits the regular LK oscillation. The initial orbital elements of this example system are listed in Table 4.1.

Elements    Inner Orbit A    Inner Orbit B    Mutual Orbit
e           0.1              0.1              0.3
a           10 AU            15 AU            1000 AU
i           50°              50°              10°
g           0                0                0
h           0                0                180°

Table 4.2: The initial orbital elements of a hierarchical quadruple system, discussed in §4.3.1. The inner orbit A consists of a solar-mass star and a Jupiter-mass planet and the orbit B consists of a pair of solar-mass stars.

“4-Star” systems
The general “4-Star” systems behave rather chaotically. We assume the 4 stars
are all solar-mass, and their initial elements are listed in Table 4.3. The eccentricities
of the two inner orbits and outer orbit and their mutual inclinations are shown in
Figure 4.5. It is interesting to see that in this example, one of the stellar binaries
achieves very high eccentricity on a very long timescale, which is not possible for its
equivalent triple system (i.e., having a tertiary with mass 2 M⊙) with the same initial
inclination. This opens the question of how much the fraction of systems evolving to
high eccentricity is enhanced in quadruple systems relative to triples. We will explore
the answer in the following subsections.
[Plot: upper panel, eccentricity e of the inner A, inner B, and mutual orbits; lower panel, inclination [deg]; horizontal axis, t [Myr].]
Figure 4.4: The evolution of the quadruple system ([Star-Planet]-[Star-Star]) discussed in §4.3.1. The upper panel shows the eccentricities of the inner and outer orbits, while the lower panel shows the inclinations between the two inner orbits and the outer orbit. The inner orbit B exhibits the regular LK oscillation, while orbit A evolves irregularly. The initial orbital elements of this example system are listed in Table 4.2.

Elements    Inner Orbit A    Inner Orbit B    Mutual Orbit
e           0.1              0.1              0.3
a           10 AU            15 AU            1000 AU
i           50°              70°              10°
g           180°             0                0
h           0                0                180°

Table 4.3: The initial orbital elements of a “4-star” quadruple system, discussed in §4.3.1. Both of the inner orbits consist of a pair of solar-mass stars.

[Plot: upper panel, 1 − e (logarithmic axis) of the inner A, inner B, and mutual orbits; lower panel, inclination [deg]; horizontal axis, t [Myr], shown over 0-200 and 6450-6650.]
Figure 4.5: The evolution of the “4-star” system discussed in §4.3.1. The upper panel shows the eccentricities of the inner and outer orbits, while the lower panel shows the inclinations between the two inner orbits and the outer orbit. Both of the inner orbits evolve irregularly, and one of them reaches very high eccentricities that its equivalent triple counterpart system would not be able to reach with the same set of initial orbital elements. The initial orbital elements of this example system are listed in Table 4.3. Note that the high eccentricity shown in the plot is significantly more than sufficient for the stars to collide.

4.3.2 Enhanced high-e fraction

The fraction of systems that can reach high eccentricities is greatly enhanced in quadruple systems compared to triple systems.

In the quadrupole order approximation and the test particle limit (TPL) for the inner companion on an initially circular orbit, the LK oscillation produces a maximal eccentricity of the inner orbit, given by (e.g., Lidov, 1962; Kozai, 1962; Lidov & Ziglin, 1976; Innanen et al., 1997; Kinoshita & Nakai, 1999; Blaes et al., 2002; Wen, 2003; Naoz et al., 2013a)
$$e_{\rm in,max} = \sqrt{1 - \frac{5}{3}\cos^2 i_0} , \tag{4.27}$$
where $i_0$ is the initial inclination angle between the inner orbit and the outer orbit. In order to reach an eccentricity higher than $e_{\rm in,max}$, a high initial inclination is required, i.e., $\cos^2 i_0 \le 3(1 - e_{\rm in,max}^2)/5$. Thus, the fraction to reach this in an ensemble of systems with randomly oriented inner and outer orbits is
$$f_{\rm triple,TPL} = \sqrt{\frac{3}{5}\left(1 - e_{\rm in,max}^2\right)} . \tag{4.28}$$
For non-zero initial $e_{\rm in}$ with $L_{\rm in} \ll L_{\rm out}$, the relation still holds²⁸. For non-TPL triple systems, retrograde systems have a higher chance to reach high eccentricities (e.g., Lidov & Ziglin, 1976; Naoz et al., 2013a).
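As an illustration of Eqs. (4.27) and (4.28), the sketch below evaluates the analytic triple fraction for a threshold eccentricity of 0.9972 (the inner orbit A threshold used in this section) and cross-checks it with a small Monte Carlo draw of isotropic orientations, for which $\cos i_0$ is uniform in $[-1, 1]$. The function names, sample size and random seed are arbitrary choices made for this example.

```python
import math
import random


def lk_max_eccentricity(cos_i0):
    """Maximal inner eccentricity of Eq. (4.27); zero below the critical inclination."""
    arg = 1.0 - 5.0 / 3.0 * cos_i0**2
    return math.sqrt(arg) if arg > 0.0 else 0.0


def triple_tpl_fraction(e_threshold):
    """Fraction of randomly oriented triples reaching e_threshold, Eq. (4.28)."""
    return math.sqrt(0.6 * (1.0 - e_threshold**2))


def monte_carlo_fraction(e_threshold, n_systems=200_000, seed=1):
    """Same fraction estimated by drawing cos(i0) uniformly in [-1, 1]."""
    rng = random.Random(seed)
    hits = sum(lk_max_eccentricity(rng.uniform(-1.0, 1.0)) >= e_threshold
               for _ in range(n_systems))
    return hits / n_systems


f_analytic = triple_tpl_fraction(0.9972)
f_sampled = monte_carlo_fraction(0.9972)
print(f"analytic: {f_analytic:.4f}, sampled: {f_sampled:.4f}")

# Example of Eq. (4.27) itself: an initial mutual inclination of 60 degrees.
e_max_60 = lk_max_eccentricity(math.cos(math.radians(60.0)))
print(f"e_max at 60 deg: {e_max_60:.3f}")
```

The analytic value is about 5.8%, which sets the baseline against which the quadruple fractions in this section are compared.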
²⁸ Here we denote $e_{\rm in}$, $e_{\rm out}$ as the eccentricities of the inner and outer orbits of triple systems, and $i_{\rm tot}$ as the mutual inclination angle. From the angular momentum conservation of $G_{\rm tot}$ and $G_{\rm out}$ and the energy conservation, at maximal $e_{\rm in}$ we have (Naoz et al., 2013a)
$$L_{\rm in}^2 (1 - e_{\rm in}^2) + 2L_{\rm in}L_{\rm out}\sqrt{1 - e_{\rm in}^2}\sqrt{1 - e_{\rm out}^2}\cos i_{\rm tot} = {\rm const.} \quad \text{and}$$
$$\cos^2 i_{\rm tot}\,(1 + 4e_{\rm in}^2) - 3e_{\rm in}^2 = {\rm const.},$$
where the first equation yields $\cos^2 i_{\rm tot} = (1 - e_{\rm in,init}^2)\cos^2 i_0/(1 - e_{\rm in}^2)$ in the $L_{\rm in} \ll L_{\rm out}$ limit. Combining it with the second equation, we recover Eq. (4.27).

When the inner binary stars have different masses, the octupole order term adds complexity. Some previous work has shown its effect on the eccentricity evolution (e.g., Ford et al., 2000; Katz et al., 2011; Lithwick & Naoz, 2011; Naoz et al., 2013a), including the enhancement of the high-e fraction.

In quadruple star systems, three (instead of two) orbits interact with each other, making it hard to derive a relation as simple as Eq. (4.27). It is not even clear whether there is an upper limit of $e_1$ less than 1 associated with any initial configuration. Using N-body simulations, Pejcha et al. (2013) have shown that the fraction of systems that reach high eccentricity is greatly enhanced in quadruple systems for several initial conditions. However, full dynamical simulations are too computationally expensive to explore the huge range of the parameter space and durations comparable to the age of the Universe.

As an example, we run the secular code for $10^5$ systems with random orientations. The effects considered are the two quadrupole order perturbation terms (GR and tidal effects will be considered in §4.4). We take 4 main sequence stars with initial orbital configurations listed in Table 4.4. Due to the isotropy of space, we can always choose an inertial frame with the coordinate axes in which the initial orientation of the mutual orbit (or one of the inner orbits) is fixed, thus reducing the parameter space of the initial conditions. Without loss of generality, we take $i = 0.1$ rad and $g = h = 0$. The initial values for $\cos i_1$ and $\cos i_2$ are drawn randomly from the range $[-1, 1]$, while $g_1$, $g_2$, $h_1$, and $h_2$ are drawn randomly from the range $[0, 2\pi]$. As a result, the cosines of the inclination angles $\cos i_A$ and $\cos i_B$ are also uniformly distributed in the range $[-1, 1]$. Each system runs for 10 Gyr unless it is stopped by meeting the criterion that the periastron distance of one inner orbit is less than 3 times the sum of the two stars’ radii, where the tidal effects could start to play an important role. This criterion is equivalent to setting a maximal eccentricity $e_{\rm in,max} = 1 - 6R_\odot/a_{\rm in}$, i.e., $e_{1,\rm max} = 0.9972$, $e_{2,\rm max} = 0.9981$. Note that due to the equal masses, octupole terms vanish. We consider unequal masses in §4.3.5. Including the GR precession may detune the Kozai effect and lower the high-e fraction. However, to make this effect substantial, the system would need to have a GR precession timescale shorter than or comparable to the instantaneous Kozai timescale, which turns out not to be the case for the stellar systems considered in this section ($t^{(\rm ins)}_{\rm LK1} \sim 0.3$ Myr, $t^{(\rm 1PN)}_{\rm pr1} \sim 13$ Myr, when the inner orbit A reaches $e_{1,\rm max}$, using Eqs. 4.39, 4.40).

Elements    Inner Orbit A    Inner Orbit B    Mutual Orbit
m           1+1 M⊙           1+1 M⊙           -
e           0.1              0.1              0.3
a           10 AU            15 AU            1000 AU
cos i       [−1, 1]          [−1, 1]          cos 0.1
g           [0, 2π]          [0, 2π]          0
h           [0, 2π]          [0, 2π]          0

Table 4.4: The initial orbital configurations of the “4-star” hierarchical quadruple systems, discussed in §4.3.2. The orbital sizes and shapes are fixed, while their orientations are randomly sampled. Because the physics is independent of the orientation of the coordinate system, we can reduce the number of degrees of freedom of the system by fixing the initial orientation of one of the orbits (here the mutual orbit).

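The stopping criterion and the orientation sampling described above can be sketched in a few lines. This is only an illustration of the set-up, not the secular code itself (which is described in Appendix C.2); the function names are hypothetical, and the solar radius in AU is the nominal value, assumed here for the conversion.

```python
import math
import random

R_SUN_AU = 0.00465047  # nominal solar radius in AU (assumed conversion)


def stopping_eccentricity(a_in, radius_0=R_SUN_AU, radius_1=R_SUN_AU, factor=3.0):
    """Eccentricity at which the periastron a_in (1 - e) equals factor * (R0 + R1)."""
    return 1.0 - factor * (radius_0 + radius_1) / a_in


def sample_inner_orientation(rng):
    """Draw (i, g, h) for an isotropically oriented inner orbit, as in Table 4.4."""
    inclination = math.acos(rng.uniform(-1.0, 1.0))   # cos i uniform in [-1, 1]
    arg_periastron = rng.uniform(0.0, 2.0 * math.pi)
    node = rng.uniform(0.0, 2.0 * math.pi)
    return inclination, arg_periastron, node


e1_max = stopping_eccentricity(10.0)   # inner orbit A, a1 = 10 AU
e2_max = stopping_eccentricity(15.0)   # inner orbit B, a2 = 15 AU
print(f"e1_max = {e1_max:.4f}, e2_max = {e2_max:.4f}")

rng = random.Random(42)
i1, g1, h1 = sample_inner_orientation(rng)
i2, g2, h2 = sample_inner_orientation(rng)
```

Drawing $\cos i$ rather than $i$ uniformly is what makes the orbit normals isotropic on the sphere.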
Figure 4.6 shows the distribution of the systems on the $\cos i_A$-$\cos i_B$ plane that later reach the maximal eccentricity. For equivalent triple systems (i.e., the inner orbit B is replaced by a single star with mass $m_B = m_2 + m_3 = 2\,M_\odot$), the corresponding region is very narrow and close to $\cos i_A = 0$, as expected. At $t = 10$ Gyr, the fraction of systems reaching the given maximal eccentricity is about 36.3% in inner orbit A of quadruple systems, more than 6 times higher than that in triple systems ($\sim 5.8$% from the run, consistent with the analytical result calculated from Eq. 4.28). Note that about 19.8% of systems reach the given high eccentricity in their inner orbit B, so in total $\sim 56$% of systems become dynamically interesting in this orbital configuration.

[Plot: two panels, “Orbit A” (left) and “Orbit B” (right), each showing a colour-coded density on the cos(iA)-cos(iB) plane, with both axes running from −1 to 1.]
Figure 4.6: The initial mutual inclination distributions of systems that reach high-e before 10 Gyr, calculated from the $10^5$ randomly oriented “4-star” systems discussed in §4.3.2, whose initial orbital configurations are listed in Table 4.4. The left panel shows systems whose inner orbits A reach high-e, while the right panel shows systems whose inner orbits B reach high-e. A higher colour density in each panel represents a higher fraction of high-e systems, and it is normalized to the total high-e fraction of each inner orbit. The region between the two green dashed lines in the left panel shows the high-e systems from the equivalent triple case, where the inner orbit B is replaced by a single star with mass $m_B = m_2 + m_3 = 2\,M_\odot$.

4.3.3 Growing fraction over time
In triple systems, the regular LK oscillation has a timescale $t_{\rm LK} \sim P_{\rm out}^2/P_{\rm in}$, which is longer than any of the orbital periods, but shorter than the evolution timescales of solar-mass main sequence stars, for systems that are dynamically interesting. In particular, for WD mergers to produce SNe Ia, we would like the mergers to be able to occur over a large range of timescales after the WDs are formed. At quadrupole order, once the mutual inclination, and thus the maximum eccentricity $e_{\rm in,max}$, is specified, all the systems that can reach $e_{\rm in,max}$ will do so within the first LK cycle on a timescale $t \sim t_{\rm LK}$, and after that there will be no more such events.

Quadruple systems, however, have the ability to produce high-e events on timescales much longer than $t_{\rm LK}$, up to the age of the Universe. Figure 4.7 shows the results from running the $10^5$ systems described in §4.3.2. The cumulative fractions $f_1$, $f_2$ (from the inner orbits A and B) grow with logarithmic time, i.e., by about 10% and 4% per tenfold increase in time. The fractional event rates $\Gamma_f$ in this orbital configuration are thus roughly given by
$$\Gamma_{f1} \equiv \dot{f}_1 \sim \frac{0.10}{t \ln 10} \ \ {\rm for} \ t > t_{\rm LK1} \quad \text{and} \quad \Gamma_{f2} \equiv \dot{f}_2 \sim \frac{0.04}{t \ln 10} \ \ {\rm for} \ t > t_{\rm LK2} . \tag{4.29}$$
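Eq. (4.29) is simply the time derivative of a cumulative fraction that grows linearly in $\log_{10} t$. As a minimal numerical reading of it, the sketch below evaluates the rate for orbit A at two representative times; the function name is hypothetical, and the growth per decade (0.10 for orbit A, 0.04 for orbit B) is taken from the text above.

```python
import math


def fractional_event_rate(t_gyr, growth_per_decade):
    """Rate df/dt [1/Gyr] when f grows by `growth_per_decade` per tenfold increase in t."""
    return growth_per_decade / (t_gyr * math.log(10.0))


rate_a_early = fractional_event_rate(0.1, 0.10)   # orbit A at t = 0.1 Gyr
rate_a_late = fractional_event_rate(1.0, 0.10)    # orbit A at t = 1 Gyr
rate_b_late = fractional_event_rate(1.0, 0.04)    # orbit B at t = 1 Gyr

print(f"{rate_a_early:.2f} /Gyr, {rate_a_late:.3f} /Gyr, {rate_b_late:.3f} /Gyr")
```

The $1/t$ dependence means that the rate per logarithmic time interval is constant, which is why events keep occurring over many decades in time.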
We find that $\Gamma_{f1} \sim 0.4$ Gyr$^{-1}$ at $t = 0.1$ Gyr and $\Gamma_{f1} \sim 0.04$ Gyr$^{-1}$ at $t = 1$ Gyr. We will discuss the implications for the stellar merger rate and the SN Ia rate in §4.6.

4.3.4 Orbital size dependence

Does the result we obtained in the previous subsections depend strongly on the orbital sizes? In this subsection we investigate the dependence on the size of the companion binary orbit. When $a_2$ (the semi-major axis of orbit B) is very small, the orbital angular momentum of orbit B is small and the system reduces to the triple system limit (more precisely, if we treat the binary B as two point masses). Increasing $a_2$ enhances the high-e fraction of the orbit A, as described in §4.3.2 and §4.3.3. However, when $a_2$ is very large, so that the LK timescale $t_{\rm LK2}$ is very small compared to $t_{\rm LK1}$, the oscillatory perturbation exerted on the mutual orbit by the orbit B is rapid and is averaged out. At the TPL of the orbit A, the high-e fraction of the orbit A should drop and approach the triple system limit, where the initial inclination should instead be estimated using the averaged angular momentum of the mutual orbit (Hamers & Lai, 2017). However, this is only true when the Kozai timescales are much longer than the periods of both the inner and outer orbits, which sets an upper limit for the choice of $a_2$. For non-TPL systems, it is not clear to what extent the effect from the orbit B is averaged out and suppressed, so the triple limit may not be reached in the valid range $a_1 \ll a_2 \ll a$.

[Plot: cumulative fraction f [%] versus t [year] on a logarithmic time axis from 10⁶ to 10¹⁰, with curves labelled “Quads, A”, “Quads, B”, and “Triples”.]
Figure 4.7: The growing cumulative fractions of high-e events from the $10^5$ randomly oriented “4-star” systems and their equivalent triple systems described in §4.3.2. At quadrupole order, the high-e fraction of triples stops growing after the Kozai timescale ($\sim 6$ Myr), but for quadruples, the fraction keeps growing.

In Figure 4.8 we show how the percentages of systems whose “A” and “B” inner orbits reach high-e, $f_1$ and $f_2$ respectively, change with different values of $a_2$ ranging from 8.5 AU to 22.0 AU with a step size of 0.1 AU (for the set of randomly oriented systems described in §4.3.2, but with only $10^4$ systems for each $a_2$ and only up to 5 Gyr). The total fraction $f_1 + f_2$ is also plotted. We can see that the high-e fraction of the inner orbit A is much larger than in the equivalent triple case ($\sim 5.8$%) for a large range of $a_2$. Thus, we confirm that our result that quadruple systems can largely enhance the high-e fraction is true for a broad range of orbital size configurations.

To confirm the expectation of the “triple system limit” for large and small $a_2$ while avoiding large computational costs, we perform the following two tests: (1) $10^5$ systems with $a_2 = 1$ AU; (2) $10^4$ systems with $a_2 = 100$ AU. The second test satisfies the requirement $P \ll t_{\rm LK2} \ll t_{\rm LK1}$, where $P$ is the period of the mutual orbit. In each test, we run systems with random orientations up to 5 Gyr, and assume the inner binary B is composed of two point masses (i.e., ignoring any high-e event from the orbit B, $f_2 \equiv 0$). In Test (1) we obtain $f_1 = 5846/100000 \sim 5.8\%$, in agreement with our expectations, while in Test (2) we obtain $f_1 = 740/10000 \sim 7.4\%$, confirming the descending trend of $f_1$ at large $a_2$. It is not surprising that $f_1$ does not reach the triple limit because the system is not in the TPL.
4.3.5 Mass ratio dependence
In triple systems whose inner binary stars do not have equal masses, octupole order perturbations enhance the high-e fraction on a much longer timescale than the LK timescale. However, in quadruple systems where the enhancement has been large, the contribution from octupole order terms becomes insignificant, because the systems that will reach high-e under octupole order effect would likely have reached them under quadrupole order effect due to the second binary.
For a better comparison, we also plot the fractions from quadruple systems and their equivalent triples with different $m_0/m_1$ ratios but the same $m_A (= 2M_\odot)$ and $e_{\rm in,max}$ values in Figure 4.9, where for each mass ratio value, the plot shows the fraction growth curves for quadruple and triple systems with octupole order effects turned on or off. We can see that quadruple systems produce much higher high-e fractions than their equivalent triple cases, and that the octupole order contribution is negligible in these quadruple system configurations.
Note that the absolute values of $f$ have a strong dependence on the radii of stars since the $e_{\rm in,max}$ values also depend on the radii. Thus, for WD binaries the $f$ values are expected to be much smaller, as shown in §4.4.
Figure 4.8: The high-e fractions from the inner orbit A, the inner orbit B, and the total fraction as functions of the semi-major axis of the inner orbit B, described in §4.3.4. $a_2$ is evenly sampled from 8.5 AU to 22.0 AU, with a step size of 0.1 AU. For each sampled $a_2$, we run $10^4$ systems up to $t = 5$ Gyr. The rest of the initial orbital elements are listed in Table 4.4. [Plot: $f$ [%] versus $a_2$ [AU]; curves labelled “Total”, “Orbit A”, “Orbit B” and “Triples”; the location $a_2 = a_1$ is marked.]

Figure 4.9: The high-e fractions of $10^5$ randomly oriented “4-star” systems with mass ratios $m_0/m_1 = 1$ (left), 3 (center), and 100 (right) in the inner orbits A, discussed in §4.3.5. The other initial orbital elements are listed in Table 4.4. Their equivalent triple cases (i.e., replacing the inner binary B with a single $2M_\odot$ star) are plotted for comparison. The high-e fraction enhancement for quadruples is remarkably robust against variations in $m_0/m_1$. [Plot: three panels of $f$ [%] versus $t$ [year]; legend entries “Triple, Quad”, “Triple, Quad+Oct”, “QuadSys, Quad” and “QuadSys, Quad+Oct”.]
4.3.6 Possible “safe” regions
In triple systems, we can already conclude that the high-e fraction $f_{\rm triple}$ is limited and the remaining fraction, $1 - f_{\rm triple}$, is “safe” – i.e., at the order of approximations chosen they will not merge even on timescales longer than the age of the Universe.
However, for quadruple systems, the results shown in §4.3.3 seem to tell us that the event fraction keeps growing. But one might wonder whether there are “safe” regions in the initial parameter space where the system will never reach high eccentricity. In other words: does the fraction converge to some value $f_{\rm max} < 1$ as $t \to \infty$?

At quadrupole order of the Hamiltonian and neglecting all other effects, we tested $10^4$ systems in the configuration described in §4.3.2 up to $t = 10^{13}$ years and confirmed the slow-down and convergence of the high-e fraction growth. Figure 4.10 shows that the fractions of reaching high-e in the two inner orbits converge to $f_{1,\rm max} \sim 47\%$, $f_{2,\rm max} \sim 26\%$ for this case, leaving $\sim 27\%$ of systems “safe” (i.e., never reaching high-e, at least on timescales of $10^{13}$ years). In Figure 4.11 we show that the “safe” regions are roughly at the corners of the $\cos i_A$-$\cos i_B$ plane, where the inner orbits and the mutual orbit are nearly coplanar. Also, we notice that the “safe corners” are much larger when the two inner orbits’ angular momenta are in the same direction than when they are opposite.
4.3.7 Quadruple systems of main sequence stars
We run $10^5$ quadruple systems with four $1\,M_\odot$ main sequence stars and sample the values of their semi-major axes and eccentricities from given distributions.
The eccentricities $e_1$, $e_2$, and $e$ are sampled from the thermal or uniform phase space
density distribution (Jeans, 1919), i.e., $e_1^2$, $e_2^2$, and $e^2$ are uniformly distributed in [0, 1].
The semi-major axes $a_1$ and $a_2$ are sampled from a log-normal distribution; $\log_{10} a_1$ (in AU) is assigned a mean of 1.7038 and a standard deviation of 1.52, inferred from
Figure 13 in Raghavan et al. (2010). $a$ is then sampled assuming that $a/(a_1 + a_2)$ is log-uniformly distributed in [9, 1900], based on observations of confirmed hierarchical multiple systems in §5.3.8 of Raghavan et al. (2010), although the sample size is small. Also we impose the criteria that
1. $a \le 10^4$ AU and $a(1 - e) \ge 20$ AU;

2. the two inner orbits cannot be too close to each other, i.e., $a(1 - e) \ge 10a_1$ and $a(1 - e) \ge 10a_2$; and

3. the two inner orbits are not initially too small, i.e., $a_1, a_2 \ge 1$ AU.
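The sampling recipe and the three criteria above can be sketched as a simple rejection sampler. This is a minimal illustration under stated assumptions, not the code used for the runs: the function names are hypothetical, $a_2$ is assumed to follow the same log-normal distribution as $a_1$, and systems failing any criterion are assumed to be redrawn from scratch.

```python
import math
import random

def sample_system(rng):
    """Draw one candidate system; return a dict, or None if it fails the cuts."""
    # Thermal eccentricities: e^2 is uniform in [0, 1].
    e1, e2, e = (math.sqrt(rng.random()) for _ in range(3))
    # Inner semi-major axes [AU]: log10(a) is normal (mean 1.7038, sigma 1.52).
    # Assumption: a2 is drawn from the same distribution as a1.
    a1 = 10.0 ** rng.gauss(1.7038, 1.52)
    a2 = 10.0 ** rng.gauss(1.7038, 1.52)
    # Outer semi-major axis: a / (a1 + a2) is log-uniform in [9, 1900].
    ratio = 10.0 ** rng.uniform(math.log10(9.0), math.log10(1900.0))
    a = ratio * (a1 + a2)
    rp = a * (1.0 - e)  # periastron distance of the mutual orbit [AU]
    ok = (
        a <= 1.0e4 and rp >= 20.0                # criterion 1
        and rp >= 10.0 * a1 and rp >= 10.0 * a2  # criterion 2
        and a1 >= 1.0 and a2 >= 1.0              # criterion 3
    )
    return dict(a1=a1, a2=a2, a=a, e1=e1, e2=e2, e=e) if ok else None

def sample_systems(n, seed=0):
    """Rejection-sample until n systems pass all three criteria."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n:
        system = sample_system(rng)
        if system is not None:
            accepted.append(system)
    return accepted

systems = sample_systems(1000)
print(len(systems), "systems accepted")
```

The orbital orientations, which are sampled separately, are not part of this sketch.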
Figure 4.10: The cumulative fractions of $10^4$ randomly oriented “4-star” systems whose inner orbits A and B reach high-e, shown in blue and orange solid lines, respectively. The rest of the systems are “safe” and are shown in the green line. The initial orbital configurations are listed in Table 4.4, and each system runs up to $10^{13}$ years, as discussed in §4.3.6. [Plot: $f$ [%] versus $t$ [year], logarithmic time axis from $10^6$ to $10^{13}$ years; curves labelled “Orbit A”, “Orbit B” and “Safe”.]

Figure 4.11: The “safe” regions for the $10^4$ randomly oriented “4-star” systems with parameters from Table 4.4 running up to $10^{13}$ years, as discussed in §4.3.6. All the systems that have never reached high-e are initially coplanar, and the “safe” corners are larger for those systems whose two inner orbits are in the same direction. The density of the colour represents the fraction of “safe” systems in that region of $(\cos i_A, \cos i_B)$-space, and it is normalized to the total “safe” fraction. [Plot: $\cos(i_B)$ versus $\cos(i_A)$, both from $-1$ to 1, with a colour scale from 0 to 0.45; the central region is labelled “Unsafe” and the corners “Safe”.]
Note that criterion (ii) is not based on observation, but is imposed since secular
perturbation theory can break down for moderately hierarchical systems.
Other initial parameters, specifying the orientations of the orbits, are sampled
randomly as described in §4.3. The effect we include is quadrupole order only. The stopping criterion for the integrations is that either inner stellar binary reaches
high-e so that it is strongly impacted by tidal effects, where we set $r_{p,i} \equiv a_i(1 - e_i) \le 6R_\odot$ ($i = 1, 2$). Such an event is likely to produce close binaries or stellar mergers. For comparison, we also run $10^5$ equivalent triple systems (i.e., a stellar binary with a tertiary $2M_\odot$ star) and triple systems with tertiary mass $1M_\odot$, with the same set of criteria adopted.
Figure 4.12 shows that the total high-e fraction in the sampled quadruple systems is about
31%, about 2.6 times higher than that from triples. Changing the mass ratio of one inner binary is expected to increase the fraction, as shown in Figure 4.9. We run
$10^5$ quadruple systems with the same set of initial conditions except with masses $[1 + 0.5] + [1 + 0.5]\,M_\odot$, and compare with both a set of octupole-order “equivalent” triple systems with $1.5\,M_\odot$ tertiaries, and triple systems with $1\,M_\odot$ tertiaries. The fraction of systems reaching high eccentricity increased by $\sim 2-3\%$ in all cases.

Figure 4.12: The high-e cumulative fractions in $10^5$ quadruple “4-star” systems versus in their “equivalent triple” systems (orange solid line) and triples with solar-mass tertiary (green solid line), as discussed in §4.3.7. The dashed and dot-dashed blue lines are high-e fractions from inner orbits A and B, respectively, which are almost the same because they are sampled from the same distributions, while the solid blue line is the sum of them, i.e., the total fraction. [Plot: $f$ [%] versus $t$ [year], logarithmic time axis from $10^6$ to $10^{10}$ years; legend entries “Quad, A”, “Quad, B”, “Quad, total”, “Equivalent Triple, $m_B = 2M_\odot$” and “Triple, $m_B = 1M_\odot$”.]

4.4 Implications for WD-WD Mergers

We have seen from the last section that quadruple systems can largely enhance the probability of reaching high eccentricities in a “sustainable” way up to the age of the Universe, with only quadrupole order terms considered. As we can see from
§4.2, more interesting physical effects show up at high eccentricities, such as the GR effects and tidal effects. In this section, we will consider the hierarchical quadruple
systems with a WD-WD binary and a main-sequence stellar binary, and discuss how
the enhanced high-e fraction can have implications for the WD merger rate.
4.4.1 Merger rate
In order to estimate the WD-WD merger rate, we need to run quadruple systems
with different initial parameters, including their orbital orientations as well as their
orbital sizes and shapes. For simplicity, we will only explore several configurations of
the masses and show that the enhancement of the merger rate is generally true for
all cases.
The majority of WDs are around $0.6-0.7\,M_\odot$ (e.g., Kepler et al., 2007, 2017).
We start by taking the WD-WD binary as equal mass with $m_0 = m_1 = 0.7M_\odot$, and
the companion binary as solar-like stars, $m_2 = m_3 = 1M_\odot$. The radii of the WDs are
$R_0 = R_1 = 0.0084R_\odot$, estimated using the mass-radius relation (Hamada & Salpeter,
1961). The initial orbital elements are sampled as described in §4.3.7. The effects we include are quadrupole order, the 1PN and tidal precession for both inner orbits, and
the GW and tidal dissipation for the inner orbit A, i.e., the WD-WD binary. The
stopping criteria for the integrations are as follows:
1. the WD-WD binary collides, i.e., $r_{p1} \le R_0 + R_1$, where $r_{p1} \equiv a_1(1 - e_1)$ is the periastron distance of the WD-WD binary;

2. the orbital energy loss is of order unity per (inner) orbital period so that the orbital-averaged dissipation rate formulae are not valid and the WD-WD binary could collide directly, i.e., $P_1 |\dot L_1| \ge L_1$;

3. the WD-WD orbit shrinks significantly due to the GW and/or tidal dissipation, i.e., $a_1 < 0.1$ AU;

4. the stellar binary (i.e., inner orbit B) reaches high-e so that the stars are strongly impacted by tidal effects, where we set $r_{p2} \equiv a_2(1 - e_2) \le 3(R_2 + R_3)$; or

5. the integrator has taken $10^7$ time-steps (i.e., $4 \times 10^7$ steps in RK4) so that we regard the system to be dynamically inert and uninteresting. It is likely that some of such systems would be unstable after an extremely long time (compared to their LK timescales), but the chance should be small due to the existence of “stable regions” discussed in §4.3.6.
The stopping criteria (i), (ii), and (iii) contribute to WD-WD mergers, and we will
call them channels (I), (II), and (III) respectively, while (iv) produces stellar mergers,
which are interesting in their own right but lie outside the scope of this section.
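The five stopping criteria can be summarised as a single check applied after each integration step. The sketch below is illustrative only: the function name, argument list, return labels, and the order in which the criteria are tested are assumptions, the stellar radii are taken to be $1R_\odot$, and `L1` and `L1_dot` are hypothetical inputs standing for the quantity in criterion (ii) and its dissipative rate of change.

```python
R_SUN_AU = 0.00465047  # solar radius in AU

def stopping_channel(a1, e1, a2, e2, period1, L1, L1_dot, n_steps,
                     r_wd=0.0084 * R_SUN_AU, r_star=1.0 * R_SUN_AU,
                     max_steps=10**7):
    """Return a label for the first stopping criterion met, or None.

    Lengths are in AU and times in years. The labels "I", "II", "III"
    correspond to the WD-WD merger channels; "stellar" and "inert" are
    hypothetical names for criteria (iv) and (v).
    """
    rp1 = a1 * (1.0 - e1)  # periastron of the WD-WD binary
    rp2 = a2 * (1.0 - e2)  # periastron of the stellar binary
    if rp1 <= 2.0 * r_wd:                    # (i) direct collision
        return "I"
    if period1 * abs(L1_dot) >= L1:          # (ii) order-unity loss per orbit
        return "II"
    if a1 < 0.1:                             # (iii) significant orbital shrinking
        return "III"
    if rp2 <= 3.0 * (r_star + r_star):       # (iv) stellar binary at high-e
        return "stellar"
    if n_steps >= max_steps:                 # (v) step budget exhausted
        return "inert"
    return None

# Example: a WD-WD binary at a1 = 10 AU with 1 - e1 = 1e-4 and no dissipation.
print(stopping_channel(10.0, 0.9999, 20.0, 0.1, 26.7, 1.0, 0.0, 0))
```

Returning a single label per system makes it straightforward to tally how many mergers go through each channel.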
We run $10^5$ such systems and, for comparison, we also run $10^5$ equivalent triple
systems (i.e., a WD-WD binary with a tertiary $2M_\odot$ star) and triple systems with
tertiary mass $1M_\odot$ (which is astrophysically more realistic) with the same set of criteria adopted. For unequal-mass WDs (e.g., $0.8 + 0.6M_\odot$), octupole order effects are turned on, and the merger rates are expected to increase for both quadruple and triple systems. Figure 4.13 shows that the overall enhancement of merger rates from quadruple systems with respect to their equivalent triples (or with a $1M_\odot$ tertiary) is a factor of
$\sim 9$ (or 10) for the equal-mass case, while it drops to $\sim 3.5$ (or 5) for the unequal-mass case. We also find that the Channel (I) contribution is negligible (only 2 in $10^5$ for each run), while most of the mergers go through Channel (III), i.e., the orbital shrinking, shown by the dashed lines in Figure 4.13.
Figure 4.13: The WD merger cumulative fractions (upper panels) and rates (lower panels) in $10^5$ quadruple systems (i.e., [WD-WD]-[Star-Star]) (blue solid lines) versus in their “equivalent triple” systems (orange solid lines) and triples with solar-mass tertiary (green solid lines), as discussed in §4.4.1. The left panels show results from equal-mass WDs (both with $0.7M_\odot$), while the right panels show results from unequal-mass WDs ($0.8 + 0.6M_\odot$). The stellar masses in quadruple systems are both $1M_\odot$ and their “equivalent triple” systems have tertiary masses $2M_\odot$. The blue dashed lines are the fractions of Channel (III) mergers in quadruple systems. The rates $\Gamma_f \equiv \dot f$ are obtained by fitting $f$ to the polynomial $f = A + B\tilde t + C\tilde t^2 + D\tilde t^3 + E\tilde t^4$, where $\tilde t \equiv \log_{10} t$ and $A, B, C, D, E$ are fitting parameters. Note that we show the early-time rates only for dynamical interest. Most WDs form at late times depending on their progenitor masses, so we should only focus on the rate at late times. [Plot: $f$ [%] (upper) and $\Gamma_f$ [%/yr] (lower) versus $t$ [year], logarithmic time axis from $10^6$ to $10^{10}$ years; legend entries “Quad”, “Quad, Channel (III)”, “Equivalent Triple, $m_B = 2M_\odot$” and “Triple, $m_B = 1M_\odot$”.]

4.4.2 Understanding the results

From the overall results, there are two immediate questions:

1. Why is Channel (I) suppressed?

2. What role does each effect play?
Let us first understand what the meanings of the three channels are. For the equal-mass case, Channel (I) is equivalent to having $a_1(1 - e_1) \le 2R_{\rm WD}$, i.e.,
$$1 - e_1 \le 7.8 \times 10^{-5} \left(\frac{a_1}{\rm AU}\right)^{-1}. \tag{4.30}$$
Channel (II) is equivalent to
$$\frac{L_1}{|\dot L_1|} \le \left(\frac{a_1}{\rm AU}\right)^{3/2} \left(\frac{m_A}{M_\odot}\right)^{-1/2}, \tag{4.31}$$
which is equivalent to
$$1 - e_1 \le 3.6 \times 10^{-6} \left(\frac{a_1}{\rm AU}\right)^{-5/7} \tag{4.32}$$
if the dissipation is dominated by the GW emission. However, combining Eqs. (4.30) and (4.32) suggests that, in order to make Channel (II) more likely to happen than
Channel (I), we need $a_1 \gtrsim 5 \times 10^4$ AU, which does not explain the suppression of Channel (I).
In fact, Channel (II) is only made possible due to the tidal dissipation. For a simple order-of-magnitude estimation, we can use the tidal energy dissipation per
(inner) orbit from Press & Teukolsky (1977) (hereafter PT), rewritten in our notation as
$$\Delta E = \frac{2Gm_0^2}{R_{\rm WD}} \left[ \left(\frac{R_{\rm WD}}{r_{p1}}\right)^6 T_2(\eta_1) + \left(\frac{R_{\rm WD}}{r_{p1}}\right)^8 T_3(\eta_1) \right]. \tag{4.33}$$
In this expression, the dimensionless function $T_\ell$ corresponds to excitation of multipole-$\ell$ modes of the WDs, and $\eta_1$ is the ratio of the periastron passage timescale to the dynamical timescale of the WDs; for the equal-mass case,
$$\eta_1 = \frac{1}{\sqrt{2}} \left(\frac{r_{p1}}{R_{\rm WD}}\right)^{3/2}. \tag{4.34}$$
Since the $T_2$ term usually dominates$^{29}$, Channel (II) is equivalent to having
$$1 - e_1 \lesssim 2 \left(\frac{R_{\rm WD}}{a_1}\right)^{5/6} [T_2(\eta_1)]^{1/6} \sim \left(\frac{R_{\rm WD}}{a_1}\right)^{5/6}, \tag{4.35}$$
where during the close passage, $T_2(\eta_1) \sim 0.01 - 0.1$. In order to make Channel (I) happen before Channel (II), we need $a_1 \lesssim 4 \times 10^{-3}$ AU, impossible for our initial configurations. Thus, Channel (I) is largely suppressed.
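The two crossover separations quoted above can be checked numerically from Eqs. (4.30), (4.32) and (4.35). The following is a minimal sketch; the function names are hypothetical, the solar radius in AU is an inserted constant, and $T_2 = 0.01$ (the lower end of the quoted range) is assumed for the tidal case.

```python
R_WD = 0.0084 * 0.00465047  # WD radius in AU (0.0084 R_sun)

def one_minus_e_collision(a1):
    """Channel (I) threshold, Eq. (4.30): 1 - e1 <= 2 R_WD / a1 (a1 in AU)."""
    return 2.0 * R_WD / a1

def one_minus_e_gw(a1):
    """Channel (II) threshold with GW-only dissipation, Eq. (4.32)."""
    return 3.6e-6 * a1 ** (-5.0 / 7.0)

def one_minus_e_tide(a1, t2=0.01):
    """Channel (II) threshold with PT tidal dissipation, Eq. (4.35)."""
    return 2.0 * t2 ** (1.0 / 6.0) * (R_WD / a1) ** (5.0 / 6.0)

# GW only: Channel (II) is reached first when Eq. (4.32) exceeds Eq. (4.30),
# i.e. a1^(2/7) > 2 R_WD / 3.6e-6.
a1_gw = (2.0 * R_WD / 3.6e-6) ** 3.5

# Tides: Channel (I) is reached first when Eq. (4.30) exceeds Eq. (4.35),
# which reduces to a1 < R_WD / T2.
a1_tide = R_WD / 0.01

print(f"GW crossover: a1 ~ {a1_gw:.1e} AU; tidal crossover: a1 ~ {a1_tide:.1e} AU")
```

For a representative $a_1 = 10$ AU, the tidal threshold on $1 - e_1$ lies above the collision threshold, so a shrinking periastron meets Channel (II) before Channel (I).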
Channel (III) assumes that when we detect a significant orbital shrinking (at least
a factor of 10), the WD binary will merge on a short timescale. This is reasonable
because orbital shrinking is only significant when the timescale of the energy dissipation
is smaller than the LK timescale, which is typically of order $10^7$ years in our
initial configurations.
There are 3 major scales: a1, a, and rp1, and they determine the timescales of all
the effects we consider, hence their dominant regimes. Here we list all the relevant
timescales using the “AU, year, $M_\odot$” unit system:
$$P = a^{3/2} M^{-1/2}, \tag{4.36}$$
$$P_1 = a_1^{3/2} m_A^{-1/2}, \tag{4.37}$$
$$t_{\rm LK,1} \simeq \frac{a^3}{a_1^{3/2}} (1 - e^2)^{3/2} \sim 0.04 \left(\frac{a}{a_1}\right)^3 a_1^{3/2}, \tag{4.38}$$
$$t^{\rm (ins)}_{\rm LK,1} \sim t_{\rm LK,1} \sqrt{1 - e_1^2} \sim 0.07 \left(\frac{a}{a_1}\right)^3 a_1 r_{p1}^{1/2}, \tag{4.39}$$
$$t^{\rm (1PN)}_{\rm pr} \sim 4.1 \times 10^{7}\, a_1^{3/2} r_{p1}, \tag{4.40}$$
$$t^{\rm (tide)}_{\rm pr} \sim 3.8 \times 10^{23}\, a_1^{3/2} r_{p1}^{5}, \tag{4.41}$$
$$t^{\rm (GW)}_{\rm diss} \sim 9.5 \times 10^{18}\, a_1^{1/2} r_{p1}^{7/2}, \quad {\rm and} \tag{4.42}$$
$$t^{\rm (tide,PT)}_{\rm diss} \sim 5.5 \times 10^{21}\, [T_2(\eta_1)]^{-1} a_1^{1/2} r_{p1}^{6}, \tag{4.43}$$
where $P$ and $P_1$ are the orbital periods of the mutual orbit and the inner orbit A, $t^{\rm (ins)}_{\rm LK,1}$ stands
for the instantaneous LK timescale of the inner orbit A (Bode & Wegg, 2014), “pr”
stands for precession and “diss” stands for dissipation. The tidal dissipation used
here is from PT (see §4.2.6), which is much simpler than, but at order-of-magnitude level consistent with, Giersz (1986), and its analytic form shows that it is negligible at
most separation scales, but may take over when the periastron is small. We have
estimated the mutual orbit eccentricity $e$ as $1/\sqrt{2}$ due to its thermal distribution.
$T_2(\eta_1)$ can be estimated as a power law
$$T_2(\eta_1) \sim 0.4 \left(\frac{\eta_1}{2}\right)^{-2.47} \tag{4.44}$$
for $\eta_1 \ge 2$. Although this expression overestimates $T_2$ when $\eta_1$ is approaching 2, it is still good at the order-of-magnitude level.

$^{29}$ The $T_2$ term dominates if $R_{\rm WD}/r_{p1} < \sqrt{T_2/T_3}$. From Fig. 1 in PT, $T_2/T_3 \sim 1$ when $\eta_1$ approaches 2, and approaches 5/3 at large $\eta_1$. However, $R_{\rm WD}/r_{p1} \lesssim 1/2$, so that the $T_2$ contribution dominates.
For $a = 2000$ AU and $a_1 = 10$ AU, we plot the timescales versus the periastron
of the inner orbit A, $r_{p1}$, in Figure 4.14, where the tidal dissipation is calculated
from Giersz (1986) as we use in our code. The minimal $r_{p1}$ shown in the figure
is $2R_{\rm WD} = 7.8 \times 10^{-5}$ AU, below which we assume a collision (Channel (I) event) occurs. The shaded region represents the Channel (II) region, and is determined by the
intersection between the tidal dissipation timescale $t^{\rm (tide)}_{\rm diss}$ and the WD-WD binary orbital period $P_1$.
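The scalings in Eqs. (4.36)-(4.44) are simple enough to evaluate directly. The sketch below does so for the equal-mass configuration, assuming $m_A = 1.4M_\odot$ and a total mass $M = 3.4M_\odot$ (from the masses in §4.4.1); the function names and the dictionary keys are hypothetical, and the tidal dissipation here is the PT estimate rather than the Giersz (1986) prescription used for Figure 4.14.

```python
import math

R_WD = 0.0084 * 0.00465047  # WD radius in AU (0.0084 R_sun)

def t2_power_law(eta1):
    """Power-law estimate of T2, Eq. (4.44); intended for eta1 >= 2."""
    return 0.4 * (eta1 / 2.0) ** (-2.47)

def timescales(a, a1, rp1, m_total=3.4, m_a=1.4):
    """Evaluate Eqs. (4.36)-(4.43); lengths in AU, masses in M_sun, times in yr."""
    eta1 = (rp1 / R_WD) ** 1.5 / math.sqrt(2.0)  # Eq. (4.34), equal-mass case
    return {
        "P": a ** 1.5 / math.sqrt(m_total),                       # (4.36)
        "P1": a1 ** 1.5 / math.sqrt(m_a),                         # (4.37)
        "t_LK": 0.04 * (a / a1) ** 3 * a1 ** 1.5,                 # (4.38)
        "t_LK_ins": 0.07 * (a / a1) ** 3 * a1 * math.sqrt(rp1),   # (4.39)
        "t_pr_1PN": 4.1e7 * a1 ** 1.5 * rp1,                      # (4.40)
        "t_pr_tide": 3.8e23 * a1 ** 1.5 * rp1 ** 5,               # (4.41)
        "t_diss_GW": 9.5e18 * math.sqrt(a1) * rp1 ** 3.5,         # (4.42)
        "t_diss_tide_PT": 5.5e21 / t2_power_law(eta1)
                          * math.sqrt(a1) * rp1 ** 6,             # (4.43)
    }

# The configuration used for Figure 4.14, at a sample periastron of 1e-3 AU.
for name, value in timescales(a=2000.0, a1=10.0, rp1=1.0e-3).items():
    print(f"{name:>15s}: {value:.3e} yr")
```

Scanning `rp1` downward from $10^{-2}$ AU to $2R_{\rm WD}$ and comparing the entries against each other is one way to see which effect dominates in each regime.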
Figure 4.14: The timescales versus the periastron of the inner orbit A, $r_{p1}$, for a system with $a = 2000$ AU, $a_1 = 10$ AU. All the timescales except the tidal dissipation are calculated using Eqs. (4.36)-(4.42), while the tidal dissipation timescale is calculated from Giersz (1986) as we use in our code. The shaded region is the Channel (II) region. [Plot: timescales [year] versus $r_{p1}$ [AU], with $r_{p1}$ decreasing from $10^{-2}$ to $10^{-4}$ AU; curves labelled $t_{\rm LK}$, $t^{\rm (ins)}_{\rm LK}$, $t^{\rm (1PN)}_{\rm pr}$, $t^{\rm (tide)}_{\rm pr}$, $t^{\rm (GW)}_{\rm diss}$, $t^{\rm (tide)}_{\rm diss}$, $P$ and $P_1$.]

4.4.3 Classification of orbital shrinking

Channel (III), i.e., undergoing orbital shrinking, is divided into 3 categories, Type-IIIL, Type-IIIC and Type-IIIS, based on how they fall into the shrinking phase. The
normal Kozai motion of triple systems has two types of trajectories: libration and
circularization. In quadruple systems, the inner orbits can switch between these two
types. The Type-IIIL mergers undergo rapid orbital shrinking when they are on
the libration trajectory, while the Type-IIIC mergers shrink on the circularization
trajectory. The Type-IIIS mergers are initially at the orbital shrinking phase, which
are of less interest. We also identify a subtype in each category, i.e., those systems that
show eccentricity oscillations on the GR precession timescale during their beginning
phase of orbital shrinking. We denote those “wiggled” systems with “w”, i.e., Type-
IIILw, Type-IIICw and Type-IIISw.
We examine a set of $10^4$ [WD-WD]-[Star-Star] systems and find 419 (i.e., $\sim 4.19\%$) systems undergoing orbital shrinking (i.e., Channel III). Among these systems, 209
(49.9%) systems are Type-IIIL with 27 (6.4%) in Type-IIILw, 207 (49.4%) systems
are Type-IIIC with 6 (1.4%) in Type-IIICw, and the remaining 3 (0.7%) systems are Type-IIIS with
1 (0.24%) in Type-IIISw. About 8% of the orbital shrinking systems experience the
“wiggled” phase during shrinking, which we will call the “precession oscillation” phase
from now on.
In Figure 4.15 we show the example phase diagrams of Type-IIIL, Type-IIIC and
their “wiggled” subtypes. The “non-wiggled” subtypes (upper panel) show that at
the final stage the WD binaries go on extremely high eccentricities and then rapidly
shrink their orbits, so that they decouple from their companions and directly enter
the small circular trajectories on the phase diagram, where the angular momentum
is approximately conserved. The “wiggled” subtypes (lower panel) show that the
inner orbit angular momentum G1 oscillates several times before decoupling from
the companions (see also Figures C.9 and C.11). We show the underlying physics of this
“precession oscillation” in detail in Appendix C.3.
4.5 Nonsecular effects: evection
The “double-averaging” procedure neglects non-secular effects, including the “rapid
eccentricity oscillations” of the inner orbits on the timescale of the outer period.
This effect was discovered in the motion of the Moon by Ptolemy, known as the
Moon’s “second inequality” and much later as “evection” (Ptolemy, 1515; Brown, 1896;
Toomer, 1984). In this section, we discuss how evection affects the eccentricities, and
show that our conclusion still holds that quadruples are more efficient in producing
mergers than triples.
The nature of evection is that the tidal torque on the inner orbit exerted by the
outer perturber varies and changes its sign four times during the period of the outer
orbit, as illustrated in Figure 4.16. The oscillations of the orbital elements on the
timescale of the outer period in the context of triple star systems were discussed by
Soderhjelm (1975). Assuming a circular outer orbit, the amplitude of the oscillation
of the inner binary angular momentum was derived by Ivanov et al. (2005) in the
high eccentricity limit of the inner orbit (i.e., $e_{\rm in} \to 1$). Later this phenomenon was observed in simulations by Bode & Wegg (2014) in the test particle limit and by
Antonini & Perets (2012) in the equal-mass inner binary case, and discussed by Katz
& Dong (2012) in the WD-WD context. Its impact on GW observations has also
been discussed e.g., by Seto (2013). In the presence of the eccentric LK mechanism
(from octupole order) and the non-secular evection, the merger times can be orders
of magnitude shorter than predicted by double-averaged secular calculations in triple
systems with low outer orbit eccentricities (Antognini et al., 2014).

Figure 4.15: Classification of orbital shrinking WD mergers. (a) Type-IIIL: orbital shrinking from the libration trajectory. (b) Type-IIIC: orbital shrinking from the circularization trajectory. (c) Type-IIILw: Type-IIIL with “precession oscillation” phase. (d) Type-IIICw: Type-IIIC with “precession oscillation” phase. [Plot: four phase diagrams of $P \equiv -\sqrt{2G_1}\sin g_1$ versus $Q \equiv \sqrt{2G_1}\cos g_1$; each trajectory is coloured by four equal time intervals, spanning $0 \le t < 5096.85$ Myr in (a), $0 \le t < 581.73$ Myr in (b), $0 \le t < 5411.65$ Myr in (c), and $0 \le t < 170.50$ Myr in (d).]
We generalize the derivation in Ivanov et al. (2005) (assuming circular outer orbit)
to triple systems with eccentric outer orbits in Appendix C.4. We find that in addition
to the “$\frac{1}{2}P_{\rm out}$”-periodic eccentricity oscillations that we have seen in the circular outer
orbit case, now we also have “$P_{\rm out}$”-periodic and “$\frac{1}{3}P_{\rm out}$”-periodic oscillations, due to the modulation of the tidal field (with frequency $2n$) by the varying distance of the
perturber (with frequency $n$), as shown in Eqs. (C.33, C.34).
Since evection can cause rapid mergers and/or cause mergers/collisions in otherwise non-merger systems, we need to assess its importance in increasing merger rates for quadruple and triple systems. We first calculate the “upper bound” of the eccentricities of the inner orbits at each time-step using the equations derived in Appendix
C.4.1:
$$e_1^{\rm (bound)} = \sqrt{1 - \left[ \sqrt{1 - e_1^2} - \frac{15}{8} \sqrt{\frac{m_B^2 a_1^3}{m_A (m_A + m_B) a^3}}\, \frac{F(e, i_A)}{(1 - e^2)^{3/2}} \right]^2} \tag{4.45}$$
and similarly for $e_2^{\rm (bound)}$, where the functions $F(e, i_{A/B})$ are defined in Eq. (C.42). Then we use these values to check whether some of the stopping criteria are satisfied.
If the upper bound is high enough to make a collision, then it could mean that there is
a collision due to evection, or that the evection amplitude is overestimated, so we will
switch to a tighter estimation as discussed in Appendix C.4.2. That is, we calculate
the “true evection envelope” (hereafter TEE) using Eqs. (C.49,C.33), which are more
computationally expensive but do not contain any inequality.
We reran the $10^5$ [WD-WD]-[Star-Star] systems discussed in §4.4, but this time we use the highest evection eccentricities to determine whether the mergers occur.
The merger fraction estimated in this way is expected to be an upper limit, since in a
finite number of periastron passages the maximum of the evection envelope may not be sampled. When the dissipation effects are turned on, we use the secular eccentricities to calculate the dissipation rates. We also test the cases where we use the TEE eccentricities for the dissipation rate estimation, which does not affect our results significantly. Note that the expressions are derived for highly eccentric inner orbits, so we only turn on the evection calculation for an inner orbit when its eccentricity is very large (e.g., $e_1, e_2 > 0.9$ in our code).

Figure 4.16: An illustration of evection. The inner orbit (in blue) and the mutual orbit (in orange) are both in the $x$-$y$ plane, with their angular momenta along the $+z$ direction. The perturber $m_B$ at the position shown in the figure exerts a tidal torque on the inner orbit, as shown by the yellow arrows, which decreases the inner eccentricity $e_1$. As the perturber moves around, the tidal torque will change its direction according to the quadrant the perturber is in, and the change of $e_1$ is shown at the four corners. During one period of the mutual orbit, the eccentricity $e_1$ goes up and down twice.
Figure 4.17 shows the merger rates from quadruple systems and their equivalent triple cases, for different WD-WD masses. Compared to the secular results in Figure
4.13, the merger fractions increase by small fractions for both quadruples and triples, and the enhancement from quadruple systems with respect to their equivalent triples
(or with a $1M_\odot$ tertiary) is still large, i.e., $\sim 5$ (or 7) times for the equal-mass WDs and $\sim 2.5$ (or 4) times for the unequal-mass case. Note that the mergers from evection runs are mostly from Channel (I), and almost all of the rest are from Channel (III). Mergers from Channel (II) become very rare. This makes sense because systems that undergo orbital shrinking must be at very high eccentricities (i.e., with small orbital angular momentum), where the torque from the companion binary can more easily extract most of the inner orbital angular momentum and result in a Channel (I) collision.
However, we must note that the dominance of Channel (I) mergers does not mean that we have more “direct collision” events, because the TEE is just a possible eccentricity maximum: in reality the eccentricity may not reach the TEE value and the “direct collision” may be avoided. Although the exact merging channel is important to the
outcome (e.g., whether the interaction results in a SN Ia), we leave that discussion
for future work.
Finally, we must emphasize that our runs are based on the validity of secular calcu-
lations as a representation of the mean evolution, which limits our explorations to the
“highly hierarchical” cases, as we have assumed in the sampling of initial parameters
(i.e., $a_1, a_2 \le r_p/10$). In this regime, we find that evection can modestly enhance the merger rates by a factor of $\sim 1.5$ (compare the right panels of Figures 4.13 and 4.17). In more moderately hierarchical systems, evection could be much more important,
as suggested by e.g., Katz & Dong (2012) and Antognini et al. (2014). However,
to assess evection fully in less hierarchical systems, one would have to correct the
double-averaging equations or drop the outer orbit average entirely (e.g., Luo et al.,
2016b).
4.6 Discussion and Conclusion
Hierarchical quadruple systems are more complex than triple systems and can exhibit
qualitatively different behaviour on long timescales. Most interestingly, the fraction
of systems that can reach high eccentricities is significantly enhanced in quadruple
systems, with a correspondingly higher probability of producing WD-WD and stellar
mergers.
We have derived the secular equations for general hierarchical quadruple systems
up to octupole order, and shown that the fraction of reaching high-e is enhanced even
at quadrupole order. We have run the systems up to the age of the Universe and
found the event rate of reaching high eccentricities goes approximately as 1/t (Figure
4.7), consistent with current observations of the delay-time distribution of SNe Ia
(e.g., Maoz & Mannucci, 2012). We also found that for a given initial configuration, running through an extremely long time, a fraction of systems will never reach high eccentricities, and that these are initially mostly coplanar (Figures 4.10 and 4.11).

Figure 4.17: The merger fractions from $10^5$ random [WD-WD]-[Star-Star] systems (blue solid lines) and their equivalent triple cases (orange solid lines), with evection included, as described in §4.5. Results for triples with solar-mass tertiary are shown with green solid lines. With evection, the merger fractions are enhanced for both quadruple systems and triples, but the fraction from quadruples remains much larger than that from triples. Mergers from Channel (I) (blue dashed lines) now dominate over Channel (III), and Channel (II) becomes negligible. The runs for this figure are equivalent to those for Fig. 4.13 except for including evection. [Plot: $f$ [%] versus $t$ [year], logarithmic time axis from $10^6$ to $10^{10}$ years; left panel $m_0 = m_1 = 0.7M_\odot$, right panel $m_0 = 0.8M_\odot$, $m_1 = 0.6M_\odot$; legend entries “Quad”, “Quad, Channel (I)”, “Equivalent Triple, $m_B = 2M_\odot$” and “Triple, $m_B = 1M_\odot$”.]
We have calculated the amplitude of eccentricity oscillations due to evection when the inner orbits are at high eccentricities (§4.5 and Appendix C.4), and used it to estimate the enhancement. This method is much faster than full N-body simulations but accurately describes the eccentricity “envelope” that systems attain during successive eccentricity maxima.
4.6.1 Stellar quadruples
We have investigated the fraction of systems reaching high eccentricity in quadruple systems consisting of two pairs of solar-mass stars and compared our results to stellar triple systems over a large portion of parameter space.
With only quadrupole order effects turned on and our baseline distribution of initial orbital elements, about 31% of quadruple systems reach high eccentricities within the age of the Universe (Figure 4.12), about 2.6 times higher than that from triples, indicating a high probability of producing close binaries, stellar mergers, or even blue stragglers (e.g., Mazeh & Shaham, 1979; Eggleton & Kiseleva-Eggleton, 2001;
Tokovinin et al., 2006; Perets & Fabrycky, 2009; Shappee & Thompson, 2013; Antognini et al., 2014; Naoz & Fabrycky, 2014) in quadruple systems on long timescales.
We also ran quadruple systems with $[1 + 0.5] + [1 + 0.5]\,M_\odot$ including octupole-order terms, and compared with both a set of octupole-order “equivalent” triple systems with $1.5\,M_\odot$ tertiaries, and triple systems with $1\,M_\odot$ tertiaries. The fraction of systems reaching high eccentricity increased by $\sim 2-3\%$ in all cases.
Our results suggest a dynamical explanation for the observation that the ratio of quadruples to triples, as well as the ratio of triples to binaries, seems to be in excess among young stars (e.g., Correia et al., 2006; Chen et al., 2013), and that a large fraction of triples and quadruples are in tight binaries (e.g., Pribulla et al., 2009). The anomalously high number of (relatively) high-mass stars in the thick disk observed by APOGEE could also be evidence of stellar mergers (Izzard et al., 2017).
A simple estimate for the role of quadruple systems in producing tight stellar binaries, mergers, or collisions proceeds as follows. The rate at which systems reach high eccentricities can be estimated as
\[ \dot{N}_{\rm high-e}(t) = \int_0^t {\rm SFR}(t')\, \bar{M}_{\rm sys}^{-1}\, f_{\rm Quad}\, f_{\rm cut}\, \dot{f}(t - t')\, dt', \]
where SFR is the star formation rate (units: $M_\odot\,{\rm yr}^{-1}$), $\bar{M}_{\rm sys} \sim 0.6\,M_\odot$ is the mean mass of a stellar system$^{30}$, $f_{\rm Quad} \sim 0.03$ is the fraction of systems that are quadruples (e.g., Raghavan et al., 2010), and $f_{\rm cut} \sim 1$ is the fraction of quadruple systems of interest. Assuming a constant Milky Way star formation rate, the event rate is simply $\dot{N}_{\rm high-e}(t) \sim {\rm SFR} \times f_{\rm Quad}\, f(t < 10^{10}\,{\rm yr})/\bar{M}_{\rm sys}$. Recent estimates of the Milky Way SFR range from around $0.68 - 4\,M_\odot\,{\rm yr}^{-1}$ (e.g., Diehl et al., 2006; Misiriotis et al., 2006; Murray & Rahman, 2010; Robitaille & Whitney, 2010; Kennicutt & Evans, 2012), which combined with our estimate of $f(t < 10^{10}\,{\rm yr}) = 0.31 - 0.33$ gives an event rate from quadruple systems of $\sim 0.01 - 0.06\,{\rm yr}^{-1}$. Accounting for the relative frequency of stellar quadruples and triples, the same calculation for triples gives an additional $\sim 0.01 - 0.06\,{\rm yr}^{-1}$, very similar to the quadruple contribution. In total, the rate is about $0.02 - 0.12\,{\rm yr}^{-1}$, around $2 - 50\%$ of the observed Galactic rate of bright stellar mergers ($M_V \geq -3$), which is $0.24 - 1.1\,{\rm yr}^{-1}$ (Kochanek et al., 2014). Considering that the SFR was higher in the past, and that we have only used f from two fixed-mass configurations, our estimates for the quadruple and triple contributions could be conservative. We conclude that it is plausible, but by no means certain, that triple + quadruple systems are an important channel for stellar mergers.
$^{30}$ The number fractions of single, binary, triple and quadruple systems are about 56%, 33%, 8% and 3% (Raghavan et al., 2010), giving an average of 1.58 stars per system, despite the dependence of multiplicity on masses (e.g., Gullikson et al., 2016; Moe & Di Stefano, 2017). Using the initial mass function Eq. (2) in Kroupa (2001), we obtain the average mass of stars $\langle M_* \rangle \sim 0.38\,M_\odot$.
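The constant-SFR estimate above is simple enough to check by hand. The following is a minimal sketch of that arithmetic in Python, using only the numbers quoted in the text and in footnote 30; the function and variable names are illustrative and not taken from any code used for this work.

```python
# Back-of-envelope check of the constant-SFR event rate quoted in the text.
# All numerical inputs are the values stated in the text; names are illustrative.

def high_e_event_rate(sfr, f_quad, f_high_e, m_sys, f_cut=1.0):
    """Events per year: SFR * f_Quad * f_cut * f(t < 10^10 yr) / M_sys."""
    return sfr * f_quad * f_cut * f_high_e / m_sys

# Mean number of stars per system from the multiplicity fractions
# (single, binary, triple, quadruple), then the mean system mass.
stars_per_system = 0.56 * 1 + 0.33 * 2 + 0.08 * 3 + 0.03 * 4
mean_system_mass = stars_per_system * 0.38  # in solar masses

# Lower and upper ends of the quoted SFR and f ranges, with M_sys ~ 0.6 Msun.
rate_low = high_e_event_rate(0.68, 0.03, 0.31, 0.6)
rate_high = high_e_event_rate(4.0, 0.03, 0.33, 0.6)

print(f"stars per system   : {stars_per_system:.2f}")
print(f"mean system mass   : {mean_system_mass:.2f} Msun")
print(f"quadruple rate [/yr]: {rate_low:.3f} to {rate_high:.3f}")
```

The two ends of the range come out close to the $\sim 0.01 - 0.06\,{\rm yr}^{-1}$ quoted for quadruples; the small difference at the upper end is within the rounding of the inputs.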
4.6.2 WD-WD binaries and Type Ia supernovae
We propose a new channel for producing Type Ia SNe: WD-WD mergers in hierarchical quadruple systems. Although selection effects make the observation of WD-WD binaries in triple or quadruple systems difficult, the high multiplicity of A-type stars indicates that it is common for WD-WD binaries to live in triple and quadruple systems (e.g., De Rosa et al., 2012, 2014). In the [WD+WD]+[star+star] case, we have added GR and tidal precession and dissipation effects. We have sampled the systems from a distribution of sizes and shapes, as well as random orientations. We find a significantly enhanced merger rate, a factor of $3.5 - 10$ higher than that in triple systems (Figure 4.13), and a $\sim 1/t$ delay-time distribution for both quadruples and triples, consistent with that from observations (e.g., Horiuchi & Beacom, 2010; Maoz et al., 2014). We classify the major type of mergers, i.e., those undergoing rapid orbital shrinking, into 3 categories by their evolution patterns in phase space (Figure 4.15), and identify $\sim 8\%$ of orbital shrinking mergers that experience a “precession oscillation” phase, whose underlying physics is explained in Appendix C.3 and Figure C.12.
The secular merger rate from quadruple systems inferred from Figure 4.13 is $\sim 10^{-12}\,{\rm yr}^{-1}\,{\rm Quad}^{-1}$ at $t = 10^{10}$ yr, which corresponds to a merger rate per unit of initial stellar mass of $\sim f_{\rm Quad}\, f_{\rm cut}\, \dot{f}(t = 10\,{\rm Gyr})/\bar{M}_{\rm sys}$. We impose a cut based on a Kroupa initial mass function for the primary in each of the inner binaries, and a uniform distribution for the secondary-to-primary mass ratio, and take the WD progenitor mass range to be $1 - 8\,M_\odot$, leading to $f_{\rm cut} \sim 4.5\%$$^{31}$. Considering the rates from the “equal-” and “unequal-mass” cases presented in §4.4 and §4.5, we obtain a WD-WD merger rate from quadruple systems of $(2.7 - 5.3) \times 10^{-15}\,{\rm yr}^{-1}\,M_\odot^{-1}$. Including the contribution of triples, we have a total merger rate of $(3 - 8) \times 10^{-15}\,{\rm yr}^{-1}\,M_\odot^{-1}$. This can be compared to the observed SNe Ia rate at 10 Gyr, $(1 - 5) \times 10^{-14}\,{\rm yr}^{-1}\,M_\odot^{-1}$ (e.g., Maoz et al., 2014). We conclude that in an optimistic scenario, the quadruples + triples provide enough merging white dwarf pairs to explain of order half of the SN Ia rate at long delay times. Indeed, our rate estimate is close to those from traditional binary stellar synthesis (e.g., Ruiter et al., 2009). On the other hand, it is unclear whether all of these mergers result in Ia supernovae, and at the more pessimistic end of the rate calculation the rate is only 6% of the observed SN Ia rate. Additionally, the lowest mass stars we consider are still on the main sequence and hence unavailable for WD-WD mergers at short delay times. If $f_{\rm cut}$ is an increasing function of t, then this could spoil the $\propto 1/t$ delay-time distribution derived from dynamics. Also, we have not taken into account the possible production of stellar mergers before WDs are formed, which could lower $f_{\rm cut}$ by $\sim 30\%$ for quadruples and $\sim 10\%$ for triples (estimated from Figure 4.12). In any case, it is noteworthy that the quadruples dominate over the triples in our calculation.
$^{31}$ Variations in the secondary mass distribution can result in a value lower by a factor of $\sim 2$ (e.g., Klein & Katz, 2017).
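As a rough order-of-magnitude check of the numbers in this paragraph, the sketch below evaluates the per-unit-mass rate with the round value $\dot f \sim 10^{-12}\,{\rm yr}^{-1}$ per quadruple, and then compares the quoted total range with the observed SN Ia range. The function name is illustrative, and the round $\dot f$ is an assumption made here for simplicity; the $(2.7 - 5.3) \times 10^{-15}$ range in the text comes from the actual simulation results, not from this arithmetic.

```python
# Order-of-magnitude check of the WD-WD merger rate per unit stellar mass.
# Inputs are the values quoted in the text; the round fdot is an assumption.

def rate_per_unit_mass(f_quad, f_cut, fdot_per_yr, m_sys_msun):
    """Mergers per year per solar mass formed: f_Quad * f_cut * fdot / M_sys."""
    return f_quad * f_cut * fdot_per_yr / m_sys_msun

quad_rate = rate_per_unit_mass(f_quad=0.03, f_cut=0.045,
                               fdot_per_yr=1e-12, m_sys_msun=0.6)

# Quoted total (quadruples + triples) and observed SN Ia ranges,
# both in yr^-1 Msun^-1.
total_low, total_high = 3e-15, 8e-15
observed_low, observed_high = 1e-14, 5e-14

pessimistic_share = total_low / observed_high   # lowest rate vs highest observed
optimistic_share = total_high / observed_low    # highest rate vs lowest observed

print(f"quadruple rate ~ {quad_rate:.2e} per yr per Msun")
print(f"share of observed SN Ia rate: {pessimistic_share:.0%} to {optimistic_share:.0%}")
```

With the round input the quadruple rate lands slightly below the quoted range, as expected for a one-significant-figure $\dot f$; the pessimistic share reproduces the 6% stated in the text.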
We found that evection enhances the overall merger rate in quadruple systems by a modest factor of $\sim 1.5$ (compare Figs. 4.13 and 4.17). However, it could play an important role in determining the branching ratio of different merging/collision channels, which may affect whether or not a SN Ia occurs and its observed properties.
Importantly, the estimate of the importance of evection in this paper involves bounds rather than a calculation of the full probability distribution of outcomes. A more careful and thorough treatment of evection is required to determine the exact final pathways of WD-WD mergers in quadruple and triple systems.
4.6.3 Future directions and outlook
Several problems in both dynamics and stellar astrophysics are left for future work. For the case of main sequence star mergers, we have studied only two fixed-mass configurations and considered only secular quadrupole + octupole order effects. An exploration of the stellar mass distribution and inclusion of tidal effects will give a more accurate prediction of the high-e rate, while mass loss and mass transfer may be important when stars are close to each other and the stellar merger rate is concerned. For the WD-WD merger case, in terms of dynamics, we need a more detailed treatment of non-secular effects, particularly evection, to explore the final stages of WD-WD mergers driven by quadruple dynamics. The role of higher order effects, such as hexadecapole-monopole (see Will, 2017, for recent discussion in context of triple systems) and quadrupole-quadrupole interactions, is also unclear. In terms of astrophysics, this paper did not consider effects such as mass loss (which causes the dynamical characteristics of the system to change, see Perets & Kratter, 2012; Shappee & Thompson, 2013) and interactions and common envelope evolution (important if $a_1$ is small). A potential complication that involves both dynamics and astrophysics is that mass loss will cause the orbital periods to change and many quadruple systems will sweep over resonances between the orbital periods of the two inner binaries before producing white dwarfs. Finally, these results may differ when changing the initial distribution of orbital elements, e.g., exploring the moderately hierarchical regime ($a/a_1 \lesssim 10$) where secular codes such as ours are least applicable. Many of these same considerations are also relevant for stellar binaries.
There are several observational signatures that could test for a “quadruple channel” of Type Ia supernovae. Most are related to similar signatures for triples. For example, in historical SN Ia remnants, one can look for low-mass binaries inside, exhibiting blue-shifted absorption in their spectra or anomalous abundances. Similarly, the discovery of a pre-explosion main-sequence binary at the position of a nearby SN Ia would provide confirmation of the quadruple nature of the system (see e.g., Thompson, 2011; Kochanek, 2009). For newly-discovered SNe Ia, one can look for two soft X-ray flashes $\sim 10^5\,{\rm s}\,(a/10\,{\rm AU})$ after the explosion, with a time separation $\sim 10^4\,{\rm s}\,(a_2/{\rm AU})$, as the shock wave overtakes the binary companion (e.g., Kasen, 2010; Thompson, 2011). Very small changes to the early-time optical/UV light curve might also signal the existence of a companion binary.
Due to the colour selection of WD-WD searches (e.g., Napiwotzki et al., 2001; Badenes et al., 2009; Brown et al., 2010), it remains difficult to make a census of WD binaries in triple or quadruple systems (e.g., Katz et al., 2014). In the future, the population of Galactic WD-WD binaries driven to high eccentricities by their companion stars or binaries may be detectable by LISA (Thompson, 2011; Gould, 2011; Amaro-Seoane et al., 2017), or in microlensing searches (e.g., WFIRST, Spergel et al., 2015). For now, radial velocity surveys of stars to identify the signal of a massive but unseen companion may be a robust way to explore the population of old and massive WDs in triple and quadruple systems (Thompson, 2011), and thus alleviate a primary uncertainty in the determination of the role of few-body systems in driving WD-WD mergers.
Chapter 5: Population of Eccentric Black Hole Systems
This chapter is based on ongoing work; the original authors are X. Fang, T. Thompson, and C. Hirata.
5.1 Introduction
Compact object mergers can be some of the most violent phenomena observable on cosmological scales, providing information on the growth of the Universe, the evolution of stars, and the history of chemical elements. The ground-based gravitational wave (GW) experiments LIGO and VIRGO (Abbott et al., 2009; Accadia et al., 2012) have made the first discoveries of black hole (BH) mergers and neutron star (NS) mergers in recent years, and are expected to detect more such systems in future runs (Abbott et al., 2016a,b, 2017a,d,b,c). Meanwhile, upcoming GW experiments, e.g., LISA (Amaro-Seoane et al., 2017), ALIA (Bender et al., 2013), DECIGO (Kawamura et al., 2006), BBO (Crowder & Cornish, 2005), Taiji (Gong et al., 2015), and TianQin (Luo et al., 2016a), will focus on lower frequency ranges, and hence on different types and evolutionary stages of compact object binaries.
Current GW searches focus on circular-orbit binaries, which are expected from binary evolution models and from circularization due to GW emission. Although this condition is well satisfied before merger at LIGO frequencies, it may not be true for systems at lower frequencies, where the formation channel of the compact object binaries matters. In fact, two alternative channels have been suggested: (1) hierarchical triple and quadruple systems (e.g., Miller & Hamilton, 2002; Wen, 2003; Blaes et al., 2002; Thompson, 2011; Antonini & Perets, 2012; Naoz et al., 2013b; Hoang et al., 2017; Antonini et al., 2016; VanLandingham et al., 2016; Antonini et al., 2017; Fang et al., 2018; Hamers, 2018; Rodriguez & Antonini, 2018), and (2) dynamical formation, including scattering in globular clusters (e.g., Rodriguez et al., 2016; Chatterjee et al., 2017a,b). Both channels produce binaries at very high eccentricities (high-e), which dissipate orbital energy and migrate to quasi-circular compact orbits. Similar to the scenario of migrating hot Jupiters studied in Socrates et al. (2012), assuming a steady state, these migration channels require the existence of a significant population of highly eccentric compact object binaries. An eccentric binary emits a GW pulse every time it passes its periastron, with a peak frequency twice its instantaneous orbital frequency (Gould, 2011).
Here, we consider the possibility that some fraction of the LIGO events are due to hierarchical system dynamics. We ask how the number of systems at each GW frequency compares between the eccentric and circular populations. We find for frequencies between 0.1 and 1 mHz that the eccentric channel dominates if the high-e channel contributes more than 1% of the merger rate. We also find that many of these high-e systems will have orbital periods of days to months, implying that they should be seen in our Galaxy if they exist and if we can isolate them.
The remainder of this paper is organized as follows. In §5.2, we compare the distributions of orbital properties predicted by the circular-orbit and the eccentric-orbit channels, and show that there exists a range of frequencies where the population of systems is distinct and might be interesting for upcoming low-f space GW interferometer experiments. In §5.3, we discuss the astrophysical implications of the possible existence of a large population of eccentric binary BHs, and their detectability.
5.2 Circular vs. Eccentric
Assuming that the birth and death rates of binary BHs are in equilibrium, the continuity equation yields different distributions of the orbital properties for different formation channels. In particular, the difference between circular and eccentric orbits can be dramatic at low GW frequencies. In this section, we use the “equilibrium” assumption to derive the distributions of orbital elements of circular and eccentric systems (§5.2.1). Then, we show that a much larger population of highly eccentric systems than of circular systems is present in the Universe at certain frequencies, if all the LIGO BH mergers come from the eccentric channel (§5.2.2). We also find that many of these eccentric systems will have GW peak frequencies in LISA’s sensitive band, and orbital periods of order days or months, implying a good chance of detection by future GW interferometers.
5.2.1 Equilibrium Distributions
Under the “equilibrium” assumption, the distribution of any quantity x is simply given by the chain rule
\[ \frac{dN}{dx} = \frac{dN}{dt}\frac{dt}{dx} = \frac{\Gamma}{\dot{x}}, \tag{5.1} \]
where we have defined the steady “inflow” and “outflow” rate of systems as $\Gamma$.
For binary BH systems, the variations of orbital parameters depend on the GW emission, which is sensitive to their eccentricities. Consider a binary with masses $m_1$, $m_2$, semi-major axis a, and eccentricity e. The time-averaged orbital shrinking and circularization due to the GW emission are described by (Peters, 1964)
\[ \langle \dot{a} \rangle = -\frac{64}{5}\frac{G^3 m_1 m_2 M}{c^5 a^3 (1 - e^2)^{7/2}}\left(1 + \frac{73}{24}e^2 + \frac{37}{96}e^4\right), \tag{5.2} \]
\[ \langle \dot{e} \rangle = -\frac{304}{15}\, e\, \frac{G^3 m_1 m_2 M}{c^5 a^4 (1 - e^2)^{5/2}}\left(1 + \frac{121}{304}e^2\right), \tag{5.3} \]
where $M = m_1 + m_2$, G is Newton’s constant, and c is the speed of light.
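For readers who want to evaluate Eqs. (5.2) and (5.3) numerically, the following is a minimal Python sketch. The SI constants are rounded textbook values, and the function name `peters_rates` is illustrative rather than taken from any code used in this work.

```python
# Orbit-averaged GW-driven rates of Peters (1964), Eqs. (5.2)-(5.3), in SI units.
# Constants are rounded values; the function name is illustrative.

G = 6.674e-11     # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8       # speed of light [m/s]
MSUN = 1.989e30   # solar mass [kg]
AU = 1.496e11     # astronomical unit [m]

def peters_rates(a, e, m1, m2):
    """Return (da/dt [m/s], de/dt [1/s]) for semi-major axis a and eccentricity e."""
    k = G**3 * m1 * m2 * (m1 + m2) / C**5
    adot = -(64.0 / 5.0) * k / (a**3 * (1.0 - e**2) ** 3.5) * (
        1.0 + (73.0 / 24.0) * e**2 + (37.0 / 96.0) * e**4)
    edot = -(304.0 / 15.0) * e * k / (a**4 * (1.0 - e**2) ** 2.5) * (
        1.0 + (121.0 / 304.0) * e**2)
    return adot, edot

m = 30.0 * MSUN
# A circular binary shrinks but stays circular (de/dt = 0).
adot_circ, edot_circ = peters_rates(0.02 * AU, 0.0, m, m)
# A wide, highly eccentric binary with periastron 0.02 AU shrinks and circularizes.
adot_ecc, edot_ecc = peters_rates(100.0 * AU, 1.0 - 0.02 / 100.0, m, m)

print(adot_circ, edot_circ)
print(adot_ecc, edot_ecc)
```

Both rates are negative for $e > 0$, so the orbit shrinks and circularizes at the same time; at $e = 0$ the eccentricity rate vanishes.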
For the circular binary case (e = 0), the frequency of the GWs, f, is set by the orbital period P, hence the semi-major axis a, i.e., $f = 2/P = 2\sqrt{GM/(4\pi^2 a^3)}$. Combining it with Eq. (5.2), we obtain the time-averaged rate of increase of the frequency
\[ \frac{\dot{f}}{f} = -\frac{3}{2}\frac{\dot{a}}{a} = \frac{96}{5}\frac{G^3 m_1 m_2 M}{c^5 a^4}. \tag{5.4} \]
Suppose that all the LIGO black hole mergers are produced by circular binaries; then there should exist a distribution of circular black hole binaries at each a and f. If the distribution is in equilibrium, then the distribution in frequency, dN/df, should be proportional to $dt/df = 1/\dot{f}$, and it is normalized by the merger rate $\Gamma$, i.e., (as in e.g., Farmer & Phinney, 2003)
\[ \left.\frac{dN}{df}\right|_{\rm circ} = \frac{\Gamma}{\dot{f}} = \frac{5\Gamma}{96}\frac{c^5}{G^3 m_1 m_2 M}\left(\frac{4GM}{4\pi^2}\right)^{4/3} f^{-11/3}. \tag{5.5} \]
For the eccentric binary case (e > 0), the GW frequency varies over the orbital period P, and the peak frequency $f_p$ is set by the periastron $r_p = a(1 - e)$, i.e., $f_p = 2\sqrt{GM/(4\pi^2 r_p^3)}$. The rate of increase of $f_p$ is then $\dot{f}_p/f_p = -3\dot{r}_p/(2 r_p)$. In the highly eccentric limit ($e \to 1$), we have
\[ \dot{r}_p = -\frac{G^3 m_1 m_2 M\,(192 - 112e + 168e^2 + 47e^3)}{15 c^5 a^3 (1 - e)^{3/2}(1 + e)^{7/2}} \to -\frac{59}{3}\, 2^{-7/2}\, \frac{G^3 m_1 m_2 M}{c^5 (a r_p)^{3/2}}, \tag{5.6} \]
and
\[ \frac{\dot{f}_p}{f_p} = \frac{59}{2^{9/2}}\frac{G^3 m_1 m_2 M}{c^5 (a r_p)^{3/2}}\frac{1}{r_p}. \tag{5.7} \]
The merger time for a high-e system can be estimated by
\[ t_{\rm merge} \simeq t_{\rm migrat} + t_{\rm shrink} \sim \int_{r_{p0}}^{a_0}\frac{da}{|\dot{a}|_{e=1}} + \int_{a_f}^{r_{p0}}\frac{da}{|\dot{a}|_{e=0}} \simeq \frac{24\sqrt{2}}{85}\frac{c^5}{G^3 m_1 m_2 M}\, r_{p0}^{7/2}\, a_0^{1/2}, \tag{5.8} \]
where the final orbital separation $a_f$ satisfies $a_f \ll r_{p0} \ll a_0$. Most of its time will be spent on the high-e orbit, when its $r_p$, and hence its peak frequency, is nearly invariant.
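To give a feeling for the timescales in Eq. (5.8), the sketch below evaluates the high-e migration time and, for comparison, the standard circular inspiral time obtained by integrating Eq. (5.2) at e = 0 from a separation $r_{p0}$. The constants are rounded SI values, the function names are illustrative, and the example system (two 30 $M_\odot$ BHs with $a_0$ = 100 au, $r_{p0}$ = 0.02 au) is chosen here only because these values appear elsewhere in the chapter.

```python
import math

# Rounded SI constants (assumed values for this sketch).
G = 6.674e-11     # [m^3 kg^-1 s^-2]
C = 2.998e8       # [m/s]
MSUN = 1.989e30   # [kg]
AU = 1.496e11     # [m]
YR = 3.156e7      # [s]

def gw_constant(m1, m2):
    """The combination G^3 m1 m2 M / c^5 that appears in Eqs. (5.2)-(5.8)."""
    return G**3 * m1 * m2 * (m1 + m2) / C**5

def t_migration_high_e(a0, rp0, m1, m2):
    """Leading term of Eq. (5.8): time spent at high e, valid for rp0 << a0."""
    return (24.0 * math.sqrt(2.0) / 85.0) * rp0**3.5 * math.sqrt(a0) / gw_constant(m1, m2)

def t_shrink_circular(r0, m1, m2):
    """Circular inspiral time from separation r0: integral of da/|adot| at e = 0."""
    return (5.0 / 256.0) * r0**4 / gw_constant(m1, m2)

m = 30.0 * MSUN
t_mig_yr = t_migration_high_e(100.0 * AU, 0.02 * AU, m, m) / YR
t_circ_yr = t_shrink_circular(0.02 * AU, m, m) / YR

print(f"high-e migration time : {t_mig_yr:.2e} yr")
print(f"circular shrink time  : {t_circ_yr:.2e} yr")
```

For this example the migration phase is of order a Gyr and exceeds the final circular inspiral by roughly three orders of magnitude, which is why the second integral in Eq. (5.8) is dropped in the closed-form result.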
Assume that all the LIGO BH mergers are produced by migration channels starting with the same $a_0$ and $r_{p0}$, and that the distribution of frequency is in equilibrium; then the distribution will have a significant peak near the initial GW peak frequency $f_{p0} = 2\sqrt{GM/(4\pi^2 r_{p0}^3)}$. The indicated number of systems near this frequency for the eccentric case should be much higher than that for the circular case,
\[ R_f \equiv \frac{(dN/df)_{\rm ecce}}{(dN/df)_{\rm circ}} = \frac{\dot{f}_{\rm circ}}{\dot{f}_{p,{\rm ecce}}} = \frac{96}{295}\, 2^{9/2}\left(\frac{a_{0,{\rm ecce}}}{r_{p0}}\right)^{3/2}. \tag{5.9} \]
For $a_0 = 100$ au, $r_{p0} = 0.02$ au, the ratio $R_f$ can be as high as $2.6 \times 10^6$! We run such a system and confirm the analytic result, as shown in Figure 5.1.
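Equation (5.9) depends only on the ratio $a_0/r_{p0}$, so the quoted number is easy to reproduce. A minimal sketch (the function name is illustrative):

```python
def enhancement_ratio(a0, rp0):
    """R_f from Eq. (5.9); a0 and rp0 must be in the same units."""
    return (96.0 / 295.0) * 2.0**4.5 * (a0 / rp0) ** 1.5

r_f = enhancement_ratio(100.0, 0.02)   # a0 = 100 au, rp0 = 0.02 au
print(f"R_f = {r_f:.2e}")              # about 2.6e6, as quoted in the text
```

Because $R_f \propto (a_0/r_{p0})^{3/2}$, the enhancement is largest for the widest initial orbits at a given periastron.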
In reality, there would be a joint distribution of $(a_0, r_{p0})$ provided by the exact formation scenario, which determines the distribution of “$R_f$ peaks” in the frequencies, so that the averaged enhancement will be normalized to lower values at each frequency. We will investigate this in the next subsection.
[Figure 5.1 plot. Upper panel: number of systems per galaxy per mHz for the circular (“circ”) and eccentric (“ecce”) cases. Lower panel: the ratio $R_f$ (ecce/circ). Horizontal axis: f [mHz], from 0.1 to 1.0.]
Figure 5.1: Comparison between eccentric and circular for a fixed initial configuration $a_0 = 100$ au, $r_{p0} = 0.02$ au.
5.2.2 Eccentric System Population
Simplest Case
For the simplest case, we assume the BH binaries are driven to high-e by the Lidov-Kozai (LK) mechanism in hierarchical triple systems (Lidov, 1962; Kozai, 1962), and then become isolated after the migration starts. We assume BH masses of $m_1 = m_2 = 30\,M_\odot$. The initial semi-major axes a are sampled from a log-uniform distribution between 10 and 1000 au, and the initial eccentricities are sampled from a thermal distribution (Jeans, 1919), i.e., $e^2$ is uniform in [0, 1]. An additional condition is imposed so that the initial periastron cannot be smaller than 20 times the Schwarzschild radius $R_1 = R_2 \sim 90$ km, or larger than $r_{p0,\max}$, beyond which the peak frequency is lower than 0.01 mHz. Systems at these low frequencies have lifetimes much longer than the Hubble time and will be nearly static in the frequency band. Their distribution depends more on the formation channels and is harder to constrain with the LIGO merger rate. We run $10^7$ systems, of which about $2.73 \times 10^6$ merge within 10 Gyr. We show the cumulative fraction of mergers versus time in Figure 5.2, which approaches a power law with index 2/7. This is because $r_{p0} \propto t_{\rm merge}^{2/7}$ and $r_{p0}$ is uniformly distributed in the $e \to 1$ limit for any given $a_0$, assuming a uniform distribution for $e^2$. This distribution of $r_p$ is equivalent to a uniform distribution of the angular momentum squared $J^2$, which is a natural consequence of stochastic angular momentum kicks due to the tertiary (e.g., Katz & Dong, 2012).
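The $t^{2/7}$ scaling can be illustrated with a small Monte Carlo experiment. The sketch below is not the calculation performed for this chapter: it assumes that $r_{p0}$ can be drawn uniformly between the two periastron bounds (the $e \to 1$ limit of a uniform $e^2$ distribution), and it uses the closed-form merger time of Eq. (5.8) instead of integrating the Peters equations. Constants are rounded SI values and all names are illustrative.

```python
import math
import random

# Rounded SI constants (assumed values for this sketch).
G = 6.674e-11; C = 2.998e8; MSUN = 1.989e30; AU = 1.496e11; YR = 3.156e7

M1 = M2 = 30.0 * MSUN
MTOT = M1 + M2
K = G**3 * M1 * M2 * MTOT / C**5

# Periastron bounds described in the text.
RP_MIN = 20.0 * (2.0 * G * M1 / C**2)                 # 20 Schwarzschild radii
F_MIN = 1.0e-5                                        # 0.01 mHz in Hz
RP_MAX = (G * MTOT / (math.pi * F_MIN) ** 2) ** (1.0 / 3.0)  # from f_p = 2 sqrt(GM/(4 pi^2 rp^3))

def t_merge_yr(a0, rp0):
    """Merger time from Eq. (5.8), in years (a0, rp0 in metres)."""
    return (24.0 * math.sqrt(2.0) / 85.0) * rp0**3.5 * math.sqrt(a0) / K / YR

def merged_fractions(times_yr, n_systems=200_000, seed=1):
    """Fraction of sampled systems with t_merge below each time in times_yr."""
    rng = random.Random(seed)
    t_list = []
    for _ in range(n_systems):
        a0 = 10.0 ** rng.uniform(1.0, 3.0) * AU       # log-uniform in 10-1000 au
        rp0 = rng.uniform(RP_MIN, RP_MAX)             # assumed uniform (e -> 1 limit)
        t_list.append(t_merge_yr(a0, rp0))
    return [sum(t < t_cut for t in t_list) / n_systems for t_cut in times_yr]

frac_1e8, frac_1e10 = merged_fractions([1.0e8, 1.0e10])
slope = math.log(frac_1e10 / frac_1e8) / math.log(100.0)

print(f"merged within 10 Gyr : {frac_1e10:.3f}")
print(f"logarithmic slope    : {slope:.3f}  (2/7 = {2/7:.3f})")
```

Under these assumptions the fraction merging within 10 Gyr should come out at roughly a quarter of the sample, comparable to the fraction reported above, and the logarithmic slope between $10^8$ and $10^{10}$ yr should sit close to 2/7.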
In Figures 5.3 and 5.4, we show the initial distributions of semi-major axes a and periastrons $r_p$ for systems that merge within 10 Gyr (i.e., mergers) and that do not (i.e., non-mergers), respectively. We can see that most mergers come from systems that have small initial periastrons, which makes sense because of the strong dependence of the merger time on $r_{p0}$ as shown in Eq. (5.8). We also show the distribution of the final eccentricities (i.e., when the peak frequency reaches $f_p = 50$ Hz) in Figure 5.5. As expected, most of the systems will merge after being well-circularized.
[Figure 5.2 plot. Curves: N = $10^7$ runs; power-law, index 2/7. Vertical axis: total merger fraction; horizontal axis: t [year].]
Figure 5.2: The fraction of mergers grows with time, approximately as $t^{2/7}$.
[Figure 5.3 plot. Histograms: merger, initial; non-merger, initial. Vertical axis: numbers; horizontal axis: $\log_{10}(a/{\rm AU})$.]
Figure 5.3: The initial semi-major axis a distributions of the mergers and non-mergers in the $10^7$ systems.
Each system we run represents a population of systems that share the same initial parameters, which produces a steady distribution in frequency. Finally, we normalize the total frequency distribution according to the LIGO event rate of $12 - 213\,{\rm Gpc}^{-3}\,{\rm yr}^{-1}$ (e.g., Abbott et al., 2017a,d,b), for which we adopt $50\,{\rm Gpc}^{-3}\,{\rm yr}^{-1}$.
[Figure 5.4 plot. Histograms: merger, initial; non-merger, initial. Vertical axis: numbers; horizontal axis: $\log_{10}(r_p/{\rm AU})$.]
Figure 5.4: The initial periastron $r_p$ distributions of the mergers and non-mergers in the $10^7$ systems.
[Figure 5.5 plot. Title: 2727099 mergers in total. Vertical axis: cumulative distribution of $e_f$; horizontal axis: $e_f$.]
Figure 5.5: The cumulative eccentricity distribution of the 2727099 mergers in the $10^7$ systems when they reach peak frequency $f_p = 50$ Hz. Only about 0.15% of systems have final eccentricities greater than 0.001, 0.1% greater than 0.01, and less than 0.01% greater than 0.1.
Since the number density of Milky Way-size galaxies is roughly $0.01\,{\rm Mpc}^{-3}$, we have $\Gamma \sim 50 \times 10^{-7}\,{\rm yr}^{-1}$ per galaxy. In Figure 5.6, we show the distribution of systems in the frequency range [0.1, 10] mHz, assuming all the LIGO BH mergers come either from the eccentric scenario or from the circular scenario. The “LIGO mergers” we use to normalize the distribution are defined as the eccentric BHs that can merge within 10 Gyr and have final eccentricities less than 0.1.
Although there could be a large population of eccentric BHs at mHz frequencies, their GW pulses repeat only once per orbital period, which is set by the semi-major axis a. For the BH systems we consider, when a > 11.5 au, the orbital period will exceed 5 years, which is roughly the operation timescale of the LISA mission. Pulses from those systems will be hard to recognize. Systems with orbital periods of days or months, however, will provide many repeated pulses during the entire mission. In Figure 5.6, we also show the distributions of all eccentric BHs with orbital periods less than 1, 10, 100, 1000 days, as well as their ratios to the circular system distribution.
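Two unit conversions underlie the numbers in the preceding paragraphs: the per-galaxy merger rate and the semi-major axis at which the orbital period reaches about 5 years. A minimal sketch using Kepler's third law in solar units follows; the function names are illustrative, and the total mass of 60 $M_\odot$ corresponds to the $m_1 = m_2 = 30\,M_\odot$ binaries assumed above.

```python
import math

def orbital_period_yr(a_au, m_total_msun):
    """Kepler's third law in solar units: P [yr] = sqrt(a[au]^3 / M[Msun])."""
    return math.sqrt(a_au**3 / m_total_msun)

def semi_major_axis_au(period_days, m_total_msun):
    """Inverse relation: a [au] = (M[Msun] * P[yr]^2)^(1/3)."""
    period_yr = period_days / 365.25
    return (m_total_msun * period_yr**2) ** (1.0 / 3.0)

def rate_per_galaxy(rate_gpc3_yr, galaxy_density_mpc3):
    """Convert a volumetric rate [Gpc^-3 yr^-1] to a per-galaxy rate [yr^-1]."""
    return rate_gpc3_yr * 1.0e-9 / galaxy_density_mpc3

period_at_11p5_au = orbital_period_yr(11.5, 60.0)   # about 5 yr
gamma = rate_per_galaxy(50.0, 0.01)                 # 5e-6 per yr per galaxy

print(f"P(a = 11.5 au) = {period_at_11p5_au:.2f} yr")
print(f"Gamma = {gamma:.1e} per yr per galaxy")
for days in (1, 10, 100, 1000):
    print(f"P < {days:4d} d  <->  a < {semi_major_axis_au(days, 60.0):.3f} au")
```

The loop lists the semi-major axes that correspond to the period cuts of 1, 10, 100 and 1000 days used in Figure 5.6.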
We expect the enhancement of the population to persist in different eccentric channels. In particular, the population of systems with short orbital periods (days to months) should not be sensitive to the exact eccentric channel, because different channels mostly disagree on the initial semi-major axes $a_0$, while the GW pulses are mostly contributed by systems with short orbital periods. Our results show that there could be hundreds of eccentric BHs in our Galaxy emitting GW pulses every few days with peak frequencies $\sim 0.1$ mHz. This population is about 10 times larger than it would be if all the systems were circular.
[Figure 5.6 plot. Upper panel: systems per galaxy per $\log_{10} f$ for the curves All eBH, P<1d, P<10d, P<100d, P<1000d, and Circular, with dashed lines of constant $t_{\rm merge}$ labelled from $10^5$ to $10^{10}$ yr. Lower panel: the ratio $R_f$. Horizontal axis: f [mHz].]
Figure 5.6: Upper: The peak frequency distributions of eccentric BH binaries formed in hierarchical triple systems with orbital period P < 1, 10, 100, 1000 days, versus the distribution of circular BH binaries. We assume all the LIGO mergers are formed from this eccentric channel or the circular channel, and take the merger rate as $\Gamma = 5 \times 10^{-6}\,{\rm yr}^{-1}$ per Milky Way-size galaxy. Dashed lines show their corresponding merger time $t_{\rm merge}$. Lower: The ratio $R_f$ of the distributions of eccentric systems to circular systems. A significant enhancement is seen in the frequency range 0.1 to 1 mHz, in which the number of systems with orbital periods of days to months is expected to be of order $10^2 - 10^3$, as shown in the upper panel. Note that the distributions at lower frequencies are subject to modifications due to the actual criteria for decoupling the inner orbit from the outer.
Distributions from Triple Systems
Our previous choice of $r_{p0,\max}$ is quite arbitrary, and it introduces a lower cutoff of the distributions at $f_p = 0.01$ mHz. Following the assumption of an equilibrium distribution, an infinite number of systems would be predicted at infinitely low frequencies, which is clearly unrealistic. The reason why there should be a lower bound on the frequency is that the distributions are derived by assuming isolated binaries, which is only true when the inner orbit is decoupled from the outer.
In secular calculations where inner and outer orbits are both averaged over, the relevant timescale is the instantaneous LK timescale (e.g., Antognini, 2015)
\[ t_{\rm LK}^{\rm (ins)} = \sqrt{1 - e^2}\, t_{\rm LK} \sim \frac{8\sqrt{2}}{15\pi}\left(1 + \frac{M}{m_3}\right)\frac{P_{\rm out}^2}{P}\left(1 - e_{\rm out}^2\right)^{3/2}\sqrt{\frac{r_p}{a}}, \tag{5.10} \]
where $t_{\rm LK}$ is the standard LK timescale, $e_{\rm out}$ and $P_{\rm out}$ are the eccentricity and the orbital period of the outer orbit, respectively, and $m_3$ is the tertiary mass.
The decoupling occurs when the instantaneous timescale is longer than the merger time $t_{\rm merge}$ or the gravitational precession timescale $t_{\rm prec} \sim 2c^2 a^{3/2} r_p/[3(GM)^{3/2}] \simeq 2(c/v_p)^2 P/(3\pi)$, where $v_p$ is the maximum orbital velocity. Written in terms of typical values of the triple systems, we obtain
3/2 1/3 p 2 (GW) 34 Gm1m2M aout 1 eout rp