arXiv:2105.05960v2 [gr-qc] 3 Jun 2021

Adding harmonics to the interpretation of the black hole mergers of GWTC-1

Maite Mateu-Lucena,1 Sascha Husa,1 Marta Colleoni,1 Héctor Estellés,1 Cecilio García-Quirós,1 David Keitel,1 Maria de Lluc Planas,1 and Antoni Ramos-Buades2,1
1 Departament de Física, Universitat de les Illes Balears, IAC3 – IEEC, Crta. Valldemossa km 7.5, E-07122 Palma, Spain
2 Max Planck Institut für Gravitationsphysik (Albert Einstein Institut), Am Mühlenberg 1, Potsdam, Germany
(Dated: June 4, 2021)

We consider the ten confidently detected gravitational wave signals in the GWTC-1 catalog [1] which are consistent with mergers of binary black hole systems, and re-analyze them with waveform models that contain subdominant spherical harmonic modes. This analysis is based on the current (fourth) generation of the IMRPhenom family of phenomenological waveform models, which consists of the IMRPhenomX frequency-domain models [2–5] and the IMRPhenomT time-domain models [6–8]. We find overall consistent results, with all Jensen-Shannon divergences between the previous results using IMRPhenomPv2 and our default IMRPhenomXPHM posterior results below 0.045 bits. Effects of subdominant harmonics are however visible for several events, and for GW170729 our new time-domain IMRPhenomTPHM model provides the best fit and shifts the posterior further toward more unequal masses and a higher primary mass of $57.3^{+12.0}_{-10.9}$ solar masses at the lower end of the PISN mass gap. We validate our results with convergence tests and compare the posteriors for the three lowest mass events between two independent sampling codes, Bilby and LALInference. We also show how for GW151226 an insufficient sampling rate leads to a bi-modal posterior.

I. INTRODUCTION

A comprehensive analysis of gravitational wave (GW) events detected in the first two observing runs of the advanced GW detector network has been given in the gravitational wave transient catalog (GWTC-1) of compact binary coalescences (CBC) [1]. This analysis has been based on waveform models that describe only the dominant quadrupole content of the signals, and assume quasi-circularity (i.e. orbital eccentricity is neglected). More specifically, Bayesian parameter estimation has been performed with the LALInference [9] code and the posterior distributions were constructed from a combination of results with the IMRPhenomPv2 and SEOBNRv3 waveform models (see Table I for a list of models relevant to our analysis). The IMRPhenomPv2 posteriors have later been reproduced in [10] with the Bilby code [11] for GW parameter estimation, using the nested sampling algorithm dynesty [12].

Some limited analysis with models that contain subdominant harmonics has also been performed: Most notably the event GW170729 has been studied with various waveform models [13], some of which contain subdominant harmonics. For the GWTC-1 catalog [1] all binary black hole (BBH) events have been cross-checked with the RapidPE algorithm [14, 15] and a waveform bank composed of numerical relativity (NR) simulations supplemented by waveforms from the NRSur7dq2 model [16]. The calibration region of NRSur7dq2 is restricted to mass ratios between 1 and 2, and dimensionless spins up to 0.8. For RapidPE no detailed discussion of results has been provided, but the posterior samples have been released publicly [17]. A similar study has been performed also with the RIFT code [15, 18], which like RapidPE is based on a likelihood interpolation method, and a catalog of NR waveforms. A study of higher modes in GWTC-1 based on likelihood reweighting was presented in [19].

Here we analyse the ten confidently detected signals in GWTC-1 which are consistent with mergers of BBH systems, using waveform models that contain subdominant spherical harmonic modes, the current (fourth) generation of the IMRPhenom family of inspiral–merger–ringdown phenomenological waveform models. These models consist of the IMRPhenomX sub-family of frequency-domain models [2–5], and the IMRPhenomT time-domain models [6–8]. IMRPhenomX constitutes a thorough upgrade of the previous generation of frequency-domain phenomenological waveform models based on IMRPhenomD [20–26], which included the first models in the IMRPhenom family to describe subdominant harmonics (IMRPhenomHM) and both subdominant harmonics and precession (IMRPhenomPv3HM). IMRPhenomT offers advantages in the description of precession, in particular concerning the merger and ringdown, as described in detail in [8] and briefly summarized below.

Our main results are obtained with the parallel Bilby [11, 27] sampling code; in addition, for the three lowest mass events we also cross-check with the independent LALInference code [9], and demonstrate good agreement.¹ Our self-consistent results are in tension with the results of [28] (for the GW151226 event, where [28] finds a bi-modal posterior with support for very unequal masses), and with [29] ([28] and [29] are also mutually in tension). We will discuss this comparison in Appendix A. For GW151226 we can reproduce the finding of [28] of a bi-modal posterior and support for more unequal masses if we adopt their choice of a lower sampling rate, which implies cutting off the signal at the Nyquist frequency of 512 Hz.

¹ We have added these cross-checks, which confirm our results, after [28] and [29] appeared on the arXiv.

Our purpose is threefold: First, updating the waveform models used for the analysis and adding subdominant harmonics will allow us to sharpen parameter estimation results. Second, the computational efficiency of the models allows us to test the influence of variations in priors and sampler settings. Third, we will gain insight into effects of waveform systematics on the posterior distributions and provide a “stress-test” for the new waveform model families.

The paper is organized as follows: We first present our methods in Sec. II: our conventions, the observational data used, our waveform models, and the setup for running parameter estimation. In Sec. III we present our parameter estimation results, and we conclude in Sec. IV. Further details are presented in two appendices: In Appendix A we present checks on the sampler convergence, and compare results for the three lowest mass events between the Bilby and LALInference samplers. Finally in Appendix B we test aggressive settings of our multibanding algorithm to accelerate waveform evaluation.

II. METHODS

A. Notation and conventions

We generally report masses in the source frame, inferred assuming a standard cosmology [37] (see Appendix B of [1]). Individual component masses are denoted by $m_i$, and the total mass is $M_T = m_1 + m_2$. The chirp mass is $\mathcal{M} = (m_1 m_2)^{3/5} M_T^{-1/5}$, and we define mass ratios as $q = m_2/m_1 \leq 1$ and $Q = m_1/m_2 \geq 1$.

We also report two effective spin parameters which are commonly used in waveform modelling and parameter estimation. The parameter χeff is defined as

$$\chi_{\rm eff} = \frac{m_1 \chi_1 + m_2 \chi_2}{m_1 + m_2}, \qquad (2.1)$$

where the $\chi_i$ are the projections of the dimensionless component spin vectors onto the orbital angular momentum. The effective spin precession parameter χp [38] is designed to capture the dominant effect of precession, and corresponds to an approximate average of the spin component in the precessing orbital plane over many precession cycles, see [38] and Eq. (4.7) in [5]. Both χeff and χp are dimensionless, and thus independent of the frame (source or detector).

When referring to specific spherical harmonic modes of the GW signal we will always consider pairs of both positive and negative modes, e.g. when we refer to the example list of multipoles (ℓ, m) = (2, ±2), (2, ±1) we will use the notation (ℓ, |m|) = (2, 2), (2, 1) or simply (2, 2), (2, 1).

B. Waveform models used

For a complete list of all the waveform models relevant for the present paper see Table I. In this work we present new parameter estimation results which we obtained with the new waveform models of the IMRPhenomX and IMRPhenomT families, and compare with previous results from the GWTC-1 catalog [1] and for GW170729 with [13].

Parameter estimation for the GWTC-1 catalog paper [1] has used two precessing waveform models: IMRPhenomPv2 [22, 23] is a predecessor of our updated precessing phenomenological waveform models (IMRPhenomXPHM [5] and IMRPhenomTPHM [8]), and is based on the non-precessing IMRPhenomD [20, 21] model. SEOBNRv3 [31] is based on the non-precessing SEOBNRv2 [30] model and is a predecessor of SEOBNRv4PHM [35], constructed within the effective-one-body (EOB) framework [30–36, 39–51], which is based on solving a system of ordinary differential equations for the dynamics of the binary.

For the GW170729 event an analysis with higher mode models has already been performed [13], using the non-precessing IMRPhenomHM [24] and SEOBNRv4HM_ROM [33, 34] models, with which we will also compare. The latter is an example of the reduced-order models (ROMs) which are often used to accelerate the evaluation of EOB waveforms. Preliminary analyses of GW170729 with our IMRPhenomXAS and IMRPhenomT models have already been presented in [3, 7].

We now turn to describing the new IMRPhenomX and IMRPhenomT families. Phenomenological frequency-domain models are constructed as piecewise closed-form expressions that are calibrated to post-Newtonian (PN) or EOB inspiral results and NR waveforms. Such explicit descriptions of the waveform can be evaluated very rapidly. The cornerstone of the IMRPhenomX family is IMRPhenomXAS [2], which models the (ℓ, |m|) = (2, 2) modes of non-precessing signals, and has been extended to include the (ℓ, |m|) = (2, 1), (3, 3), (3, 2), (4, 4) modes by IMRPhenomXHM [3, 4]. All the modes have been calibrated to around 500 numerical waveforms as reported in [4], whereas in the previous generation of IMRPhenom waveforms only the dominant (ℓ, |m|) = (2, 2) modes have been calibrated to 19 NR waveforms [20, 21].

IMRPhenomXPHM [5] augments IMRPhenomXHM to account for spin precession, following previous phenomenological models [22, 52, 53] in employing an approximate map between non-precessing and precessing waveforms, where the signal of the precessing system in a non-inertial co-precessing frame is identified with an appropriate non-precessing signal, and where the final spin is adjusted to account for the effect of spin components orthogonal to the instantaneous axis of the orbital plane. The non-precessing signal with adjusted final spin is then mapped to a precessing signal by employing a time-dependent rotation from the non-inertial co-precessing frame to the inertial frame where the signal is observed. This procedure is often referred to as “twisting up” in the literature. IMRPhenomXPHM implements two approximate descriptions for the Euler angles that define the time-dependent rotation: an orbit-averaged effective single-spin description at the next-to-next-to-leading PN order (NNLO) [54, 55], or an approximation based on a multiple scale analysis (MSA) of the PN equations [56], which is the default choice. For a detailed discussion of the procedure and the conventions employed to describe precessing waveforms in the frequency domain see [5].

IMRPhenomXPHM also allows one to choose among different approximations for the final spin of the remnant black hole.

Family | Full name | Precession | Multipoles (ℓ, |m|) included | Ref.
SEOBNR | SEOBNRv2 | no | (2, 2) | [30]
SEOBNR | SEOBNRv3 | yes | (2, 2) | [31]
SEOBNR | SEOBNRv4_ROM | no | (2, 2) | [32]
SEOBNR | SEOBNRv4HM_ROM | no | (2, 2), (2, 1), (3, 3), (4, 4), (5, 5) | [33, 34]
SEOBNR | SEOBNRv4P | yes | (2, 2), (2, 1) | [31, 35, 36]
SEOBNR | SEOBNRv4PHM | yes | (2, 2), (2, 1), (3, 3), (4, 4), (5, 5) | [31, 35, 36]
IMRPhenom - 3rd Generation | IMRPhenomD | no | (2, 2) | [20, 21]
IMRPhenom - 3rd Generation | IMRPhenomHM | no | (2, 2), (2, 1), (3, 3), (3, 2), (4, 4), (4, 3) | [24]
IMRPhenom - 3rd Generation | IMRPhenomPv2 | yes | (2, 2) | [22, 23]
IMRPhenom - 3rd Generation | IMRPhenomPv3 | yes | (2, 2) | [25]
IMRPhenom - 3rd Generation | IMRPhenomPv3HM | yes | (2, 2), (2, 1), (3, 3), (3, 2), (4, 4), (4, 3) | [26]
IMRPhenomX | IMRPhenomXAS | no | (2, 2) | [2]
IMRPhenomX | IMRPhenomXHM | no | (2, 2), (2, 1), (3, 3), (3, 2), (4, 4) | [3, 4]
IMRPhenomX | IMRPhenomXP | yes | (2, 2) | [5]
IMRPhenomX | IMRPhenomXPHM | yes | (2, 2), (2, 1), (3, 3), (3, 2), (4, 4) | [5]
IMRPhenomT | IMRPhenomT | no | (2, 2) | [6, 7]
IMRPhenomT | IMRPhenomTHM | no | (2, 2), (2, 1), (3, 3), (4, 4), (5, 5) | [7]
IMRPhenomT | IMRPhenomTP | yes | (2, 2) | [6, 8]
IMRPhenomT | IMRPhenomTPHM | yes | (2, 2), (2, 1), (3, 3), (4, 4), (5, 5) | [8]

TABLE I. Waveform models used in this paper. For precessing models, the multipoles correspond to those in a co-precessing frame.
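The mass and spin conventions of Sec. II A can be made concrete with a short sketch. It only restates the definitions of the chirp mass, the two mass ratios and Eq. (2.1), plus the algebraic inversion from (chirp mass, q) back to component masses; the function names are illustrative and are not taken from any of the codes used in this work.

```python
def chirp_mass(m1, m2):
    """Chirp mass (m1 m2)^(3/5) / (m1 + m2)^(1/5), in the units of m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def mass_ratios(m1, m2):
    """Return (q, Q) with q = m2/m1 <= 1 and Q = m1/m2 >= 1 (m1 is the heavier body)."""
    heavy, light = max(m1, m2), min(m1, m2)
    return light / heavy, heavy / light


def chi_eff(m1, m2, chi1, chi2):
    """Effective spin, Eq. (2.1); chi1 and chi2 are the dimensionless spin
    projections onto the orbital angular momentum."""
    return (m1 * chi1 + m2 * chi2) / (m1 + m2)


def component_masses(mchirp, q):
    """Invert (chirp mass, q) to (m1, m2) with m1 >= m2."""
    m1 = mchirp * (1.0 + q) ** 0.2 / q ** 0.6
    return m1, q * m1


# Illustrative input only: two source-frame masses in solar masses.
m1, m2 = 35.3, 30.3
mc = chirp_mass(m1, m2)
q, Q = mass_ratios(m1, m2)
print(f"chirp mass = {mc:.2f}, q = {q:.3f}, Q = {Q:.3f}")
print("round trip to component masses:", component_masses(mc, q))
```

Since the chirp mass carries the dimension of a mass, the same functions apply to source-frame and detector-frame masses, whereas q, Q and χeff are frame independent.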

Version 0 of the final-spin approximation is equivalent to the estimate implemented in IMRPhenomPv2, where the in-plane spin contribution is captured by the effective precession spin χp. Version 1 is equivalent to version 0, with the replacement χp → χx. Versions 2 and 3 (our default) try in different ways to incorporate the full double-spin effects, with the latter relying on precession-averaged quantities that arise in the context of the MSA formalism [56]. We refer the reader to Sec. IV D of [5] for a thorough discussion of these options.

IMRPhenomTHM [7] extends to the time domain the methods that have been used to construct the frequency-domain model IMRPhenomXHM and calibrate it to a catalog of numerical waveforms. IMRPhenomTPHM [8] then extends this model to precession, again following the “twisting up” approach, but with some key improvements: First, in order to obtain explicit expressions for the spherical harmonic modes of the precessing frequency-domain models, the stationary phase approximation (SPA) is used to compute approximate Fourier transforms. This also makes it difficult to incorporate analytical knowledge about the ringdown frequencies in the ringdown portion of a precessing waveform. This is however simple in the time domain, where the ringdown waveform can be simply modelled as a damped sinusoid, with the complex ringdown frequency determined by the final spin and mass. This allows a simple analytical approximation to the Euler angles during ringdown, based on the effective precessional motion observed in NR signals [57], which can be derived analytically in the small opening angle limit. (A derivation is presented in [6], see also [58].) In consequence, the merger and ringdown of precessing waveforms is typically more accurately described in our time-domain model.

IMRPhenomTPHM also offers an improvement of the inspiral description by departing from the previous approach of using closed-form expressions for the Euler angles that describe the precession of the orbital plane: by default a numerical integration of the equations for the spin dynamics is used, which can be carried out without significant increase of computational cost [8]. For a more detailed discussion of the differences between IMRPhenomXPHM and IMRPhenomTPHM regarding the treatment of precession see [59].

For lower mass events the gain in accuracy of describing precession is however compensated by a lower accuracy for the phase of the dominant harmonic, which is constructed based on a calibrated extension of the TaylorT3 PN approximant [60]. The choice of this approximant was motivated by the explicit time dependence of the orbital frequency as a closed-form expression, but it has been shown [61] to be less accurate than other numerical PN approximants, and in consequence turns out to be more difficult to calibrate to NR. Resolving this weakness will be the goal of future work on IMRPhenomT.

For IMRPhenomXPHM we have implemented the “multibanding” method [4, 62] to accelerate waveform evaluation, based on interpolation and the choice of an accuracy parameter. In Appendix B we discuss parameter estimation results with a more aggressive choice of multibanding settings, as would be appropriate for accelerated parameter estimation runs, e.g. for the initial exploration of newly detected events.
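The statement that a time-domain ringdown can be modelled as a damped sinusoid with a complex ringdown frequency can be illustrated with a minimal sketch. This is not the IMRPhenomT implementation: the function name and the numerical values of the frequency and damping time are placeholders chosen for illustration, whereas a waveform model would obtain them from the final mass and spin.

```python
import cmath
import math


def ringdown_mode(t, amplitude, f_ring, tau, phi0=0.0):
    """Complex damped sinusoid A * exp(-t/tau) * exp(-i (2 pi f_ring t + phi0)), t >= 0.

    f_ring (Hz) and tau (s) play the role of the real part and the inverse
    imaginary part of a complex ringdown frequency.
    """
    omega = 2.0 * math.pi * f_ring - 1j / tau  # complex ringdown frequency (rad/s)
    return amplitude * cmath.exp(-1j * (omega * t + phi0))


# Placeholder numbers for illustration only; they are NOT derived from a
# final mass and spin.
f_ring, tau = 250.0, 4.0e-3
times = [k * 1.0e-3 for k in range(9)]  # 0 to 8 ms in 1 ms steps
strain = [ringdown_mode(t, 1.0, f_ring, tau).real for t in times]
envelope = [abs(ringdown_mode(t, 1.0, f_ring, tau)) for t in times]
for t, h, a in zip(times, strain, envelope):
    print(f"t = {1e3 * t:3.0f} ms  Re h = {h:+.4f}  |h| = {a:.4f}")
```

The envelope falls by a factor e after one damping time, and the oscillation frequency stays fixed, which is what makes the ringdown portion easy to prescribe analytically in the time domain.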

C. Methodology for Parameter estimation

1. Data

We use public GW strain data from the Gravitational Wave Open Science Center (GWOSC) [63, 64], and power spectral densities (PSDs) [65] and calibration uncertainties [66] included in the GWOSC release for GWTC-1. The Virgo detector officially joined the detector network on August 1, 2017. However, Virgo data for the event GW170729 is also publicly available from GWOSC, and following the example of [1, 13] we have used Virgo data also to analyse this event, in addition to the later events, but with the exception of GW170823, which was only observed by the LIGO detectors.

Table II lists the trigger times corresponding to the GW events, indicates the detector data sets we use for each event, the duration of data we analyze, the minimal frequency we use to compute the likelihood function (and also as a reference frequency to define the GW phase and, for precessing systems, the component spins), and the sampling rate used. All of these parameters correspond to the choices made in [1].

Event | Trigger time (GPS, s) | Duration (s) | Low frequency (Hz) | Sampling rate (Hz)
GW150914 | 1126259462.4 | 8 | 20 | 2048
GW151012 | 1128678900.4 | 8 | 20 | 2048/4096
GW151226 | 1135136350.6 | 8 | 20 | 2048/4096
GW170104 | 1167559936.6 | 4 | 20 | 2048
GW170608 | 1180922494.5 | 16 | 20 | 2048/4096
GW170729* | 1185389807.3 | 4 | 20 | 2048
GW170809* | 1186302519.7 | 4 | 20 | 2048
GW170814* | 1186741861.5 | 4 | 20 | 2048
GW170818* | 1187058327.1 | 4 | 16 | 2048
GW170823 | 1187529256.5 | 4 | 10 | 2048

TABLE II. Data settings of each event from [1], where * indicates that the three LIGO and Virgo detectors were taking data and all three data sets were used for our analysis.

The sampling rate used in [1] is 2048 Hz, corresponding to a Nyquist frequency of 1024 Hz. We have used this as our default setting to facilitate comparisons, together with GWOSC data sets sampled at 4 kHz. For high-mass events this is sufficient. For the three low-mass events GW151012, GW151226 and GW170608 however, the |m| = 3, 4 harmonics can reach higher frequencies than 1024 Hz, and we have performed additional parameter estimation runs with a sampling rate of 4096 Hz and the GWOSC 16 kHz data sets. For these analyses we have computed PSDs with the BayesWave code [67, 68], which has been used by the LIGO and Virgo Collaborations (LVC) to produce the PSDs provided in the GWOSC data release. We have checked that we can reproduce the public PSDs when setting the sampling rate to 2048 Hz and using the 16 kHz data sets, and then used the same settings to produce PSDs at twice the sampling rate, which agree well with the lower sampling rate PSDs up to approximately 1 kHz. We also extend the calibration envelope files to 2048 Hz. Plots of the magnitude and phase of the calibration uncertainties for the O2 events can be found in [69]. For GW170608, we extracted the data from 20 Hz to 2048 Hz from these plots. However, for the O1 events such plots are not publicly available. For that reason, we performed a linear extrapolation in frequency of the calibration uncertainties from 1024 Hz to 2048 Hz with the public data available in [66].

2. Bayesian sampling algorithms

For our standard runs we use the parallel Bilby [11, 27] implementation of the nested sampling algorithm dynesty [12]. In Appendix A we also compare with results obtained with the LALInference code [9]. For both Bilby and LALInference we sample the component masses in terms of the mass ratio and chirp mass. We also use the default settings of the Bilby code apart from the following choices: we fix the minimal (walks) and maximal (maxmcmc) number of Markov Chain Monte Carlo (MCMC) steps to 200 and 15000 respectively. We vary the number of nested sampling live points, setting nlive = 512, 1024, 2048, and 4096 for the three lowest mass events, and we vary the number of autocorrelation times to use before accepting a point, nact = 10, 30, 50, in order to study the convergence of results. As explained in Appendix A, we choose nlive = 2048 and nact = 30 as the standard settings to report our main results. In order to speed up calculations we use distance marginalization as described in [70].

In this paper we only discuss the most relevant results; the complete posterior data sets are included in our Zenodo data release [71].

3. Priors

The intention of our prior choices is to be uninformative, however ambiguities arise when it is not possible to choose a natural quantity in which to prescribe a flat prior. For several quantities no problems arise: We use an isotropic prior for the sky location, the prior for the inclination angle is uniform in its cosine, and the priors of the polarization angle and phase of coalescence are uniform. For precessing spins we use uniform distributions in the dimensionless component spin magnitudes and isotropically distributed spin tilts, as also used in [1, 10]. We allow dimensionless component spin magnitudes up to 0.99.

The distance prior used for the GWTC-1 catalog [1] is uniform in luminosity distance volume (proportional to the luminosity distance squared), which distributes mergers uniformly through a stationary Euclidean universe and does not take the expansion of the universe into account. The effect of the expansion of the universe can be neglected for small redshifts, however for larger redshifts it is more appropriate to use a prior that factors in our knowledge about cosmological expansion. We will use the simpler prior “uniform in luminosity distance volume” for all events, however for the furthest events (GW170729 and GW170823) we also compare with a prior uniform in the comoving volume and source-frame time

(UniformComovingVolume prior in the Bilby code) using the III. RESULTS Planck2015 cosmology [37], following the Bilby catalog [10]. A. Summary of the black hole mergers in GWTC-1 Regarding the masses, the GWCT-1 catalog adopts the LAL- Inference [9] choice of a prior that is flat in the component masses. The Bilby re-analysis of GWTC-1 [10] uses a prior Table III contains a summary of our results using precess- that is flat in chirp mass and mass ratio, but then the samples ing waveform models, with error estimates obtained from 90% are re-weighted to the LALInference choice using the Jacobian posterior confidence intervals, including for comparison also given in Eq. (21) of [9]. We follow this approach here for our the results obtained for GWTC-1 [1] with IMRPhenomPv2. default runs for all events, and present posteriors re-weighted Our new results are in general within the error estimates of for a prior that is flat in component masses, consistent with the IMRPhenomPv2 results, as can be expected from the ini- [1, 10]. A very different choice has been used recently in [72] to tial analysis with subdominant harmonics performed for the re-analyse the signal GW190521, which is consistent with the GWTC-1 catalog, see appendix B of [1]: There all BBH events creation of an intermediate-mass black hole in a heavy black have been cross-checked with the RapidPE algorithm [14, 15] hole merger. Using a prior that is flat in mass ratio Q ≥ 1, and a waveform catalog of NR simulations supplemented by they found a multi-modal source-frame total mass posterior. waveforms from the NRSur7dq2 model [16]; samples were We analyse this event in a parallel paper [59] with improved released in [17]. NRSur7dq2 is however restricted to mass waveform models and confirm the multi-modality, although we ratios q ≤ 2 and dimensionless χi ≤ 0.8. A “modest influ- also find significant differences with respect to [72]. 
In order ence on the interpretation of observations” below the statistical to check whether similar multi-modalities appear for the most measurement uncertainty is found, and for GW170729 a Bayes massive events in GWTC-1, we rerun GW170729 with a prior factor of “approximately 1.4 for higher modes versus a pure that is flat in mass ratio Q ≥ 1 and detector-frame total mass quadrupole model”, which we compare with our results below. (a prior that is flat in source-frame masses adds complications However, the use of multi-mode waveforms is now standard to using distance marginalization, which we use to reduce the in GW data analysis, and it is thus interesting to update the computational cost). results of GWTC-1 to the methods used for O3 events, and to compare results. Differences between our default runs using We also select GW170729 (see Sec. III D 2) and the IMRPhenomXPHM and the IMRPhenomPv2 results from the three lowest-mass events (see appendixA) to compare GWTC-1 release are shown in Fig.1 in terms of the Jensen- results with the new prior implemented in Bilby that Shannon (JS) divergence, for a definition see e.g. Eq. (B1) applies the uniform prior in component masses while of [1]. The JS divergence takes values between 0 and 1 bits, the mass ratio and chirp mass are sampled (by using where 0 means that there is no difference between two distri- the two classes UniformInComponentsChirpMass and butions, and 1 means that both distributions have a maximum UniformInComponentsMassRatio in the Bilby code), which divergence. The posteriors for the model with higher modes directly corresponds to the behavior of the LALInference code. and precession are generally close to the results obtained with The limits of the mass and luminosity distance priors vary IMRPhenomPv2– all divergence values between both posterior between different events. 
We take these limits from the LALIn- distributions are smaller than 0.045, which is smaller than the ference configuration files of the runs from [1] and we check divergence between IMRPhenomPv2 and SEOBNRv3 runs that the inferred parameters do not rail against the prior bounds. performed in the GWTC-1 paper, where the largest value is for

GW151226 and the effective spin χeff, JS χeff = 0.14 bits, see Fig. 16 of [1]. Among the extrinsic parameters, the sky location (declina- tion δ and right ascension α) does not show significant changes, while the distance DL and the angle θJN between total angular momentum and the line of sight have the largest divergences. 4. Management of parameter estimation runs An overview of recovered values for the effective spin and mass ratio for the different events is shown in Fig.2. One of In total we have performed 364 runs with parallel Bilby [27]. the trends that can be observed (see also the detailed results in In order to simplify the management of runs, and to avoid Table III) is that the effective spins χeff inferred with IMRPhe- human errors when editing configuration files for each run, we nomXPHM have typically increased over the values inferred have developed a Python code to generate configuration files with IMRPhenomPv2, although they are still consistent within from simpler descriptions which we call pseudo-config files. the error estimates for all events. The event least following this These pseudo-config files allow us to loop over parameters trend is the highest-mass event GW170729, which is also one such as the waveform model, nlive or nact. Furthermore, of the two events where χeff has support only for positive values we use this Python code to submit several runs to the slurm (GW151226 being the other). [73] queuing system with one command, and to track version In TableIV, we show the matched-filter signal-to-noise ratios numbers of the codes we use when the jobs start running. (SNRs) obtained for runs with different models, as well as the 6

Event Approximant q m1/M m2/M MT /M M/M Mf /M χf χeﬀ χp DL/Mpc θJN /rad +0.11 +3.7 +2.3 +2.7 +1.2 +2.7 +0.12 +0.08 +0.38 +120 +0.27 IMRPhenomPv2 0.86−0.16 35.3−2.4 30.3−3.4 65.7−2.5 28.4−1.1 61.6−2.6 0.80−0.08 −0.03−0.09 0.39−0.27 450−140 2.72−0.52 +0.12 +3.7 +2.5 +2.8 +1.2 +2.7 +0.11 +0.08 +0.36 +140 +0.32 IMRPhenomXP 0.85−0.16 35.6−2.6 30.0−3.4 65.7−2.6 28.4−1.2 61.7−2.8 0.80−0.08 −0.02−0.11 0.42−0.29 440−140 2.63−0.54 GW150914 +0.11 +3.3 +2.3 +2.5 +1.1 +2.4 +0.10 +0.08 +0.34 +120 +0.24 IMRPhenomXPHM 0.87−0.15 35.0−2.3 30.2−3.2 65.2−2.4 28.2−1.1 61.2−2.6 0.81−0.08 −0.02−0.10 0.44−0.30 490−120 2.76−0.33 +0.10 +3.4 +2.1 +2.7 +1.1 +2.4 +0.10 +0.09 +0.34 +120 +0.23 IMRPhenomTPHM 0.88−0.15 35.0−2.1 30.7−2.9 65.7−2.3 28.4−1.0 61.7−2.4 0.80−0.08 −0.01−0.09 0.43−0.30 480−130 2.78−0.37 +0.30 +9.6 +3.2 +5.9 +1.2 +6.2 +0.13 +0.20 +0.35 +390 +0.84 IMRPhenomPv2 0.61−0.30 22.6−4.3 13.8−3.9 36.9−3.0 15.2−0.9 34.9−3.0 0.77−0.12 0.04−0.15 0.33−0.22 1090−400 1.99−1.64 +0.27 +8.9 +2.9 +5.4 +1.2 +5.6 +0.13 +0.21 +0.36 +400 +0.96 GW151012 IMRPhenomXP 0.65−0.31 22.0−3.8 14.1−3.8 36.6−2.8 15.2−0.9 34.4−2.8 0.79−0.11 0.05−0.14 0.38−0.25 1120−380 1.84−1.49 +0.32 +10.5 +3.5 +6.9 +1.5 +7.2 +0.13 +0.21 +0.37 +460 +1.08 IMRPhenomXPHM 0.57−0.28 24.2−5.1 13.6−4.0 38.3−3.7 15.5−1.1 36.2−3.7 0.79−0.12 0.12−0.17 0.36−0.23 1060−400 1.66−1.25 +0.30 +5.6 +1.8 +3.7 +0.3 +3.9 +0.12 +0.13 +0.32 +150 +1.66 IMRPhenomPv2 0.54−0.25 14.0−2.9 7.6−1.8 21.6−1.3 8.9−0.2 20.3−1.4 0.84−0.09 0.20−0.06 0.47−0.26 440−160 1.05−0.66 +0.30 +5.5 +1.7 +3.3 +0.3 +3.5 +0.11 +0.13 +0.32 +150 +1.71 GW151226 IMRPhenomXP 0.62−0.30 13.0−2.3 8.0−2.1 21.2−1.0 8.8−0.2 19.8−1.1 0.84−0.09 0.18−0.06 0.48−0.26 460−160 1.01−0.62 +0.34 +6.2 +2.0 +4.3 +0.2 +4.3 +0.14 +0.13 +0.28 +130 +1.92 IMRPhenomXPHM 0.52−0.25 14.2−3.2 7.4−1.9 21.7−1.4 8.8−0.2 20.3−1.4 0.88−0.11 0.20−0.07 0.57−0.31 470−160 0.79−0.44 +0.26 +5.7 +4.1 +4.1 +1.6 +4.1 +0.12 +0.13 +0.32 +350 +1.59 IMRPhenomPv2 0.63−0.19 31.2−4.8 19.8−3.7 51.1−3.3 21.4−1.4 48.2−3.3 0.79−0.09 
−0.05−0.15 0.38−0.23 990−350 1.10−0.70 +0.25 +5.5 +4.0 +3.9 +1.6 +3.8 +0.11 +0.13 +0.33 +350 +1.51 GW170104 IMRPhenomXP 0.67−0.20 30.4−4.4 20.3−3.9 51.0−3.2 21.4−1.4 47.9−3.2 0.81−0.10 −0.02−0.15 0.43−0.27 1040−370 1.19−0.78 +0.22 +5.0 +3.6 +3.7 +1.7 +3.5 +0.11 +0.13 +0.33 +340 +1.87 IMRPhenomXPHM 0.71−0.22 29.3−3.7 20.8−4.1 50.1−3.0 21.2−1.4 47.1−2.9 0.81−0.09 −0.02−0.15 0.43−0.28 1120−390 0.89−0.57 +0.25 +3.9 +1.2 +2.1 +0.1 +2.3 +0.11 +0.14 +0.33 +100 +0.45 IMRPhenomPv2 0.68−0.29 11.0−1.6 7.6−1.7 18.7−0.6 7.9−0.1 17.6−0.7 0.78−0.08 0.04−0.05 0.34−0.22 330−100 2.42−1.95 +0.22 +3.4 +1.1 +1.7 +0.1 +1.8 +0.12 +0.11 +0.37 +100 +0.44 GW170608 IMRPhenomXP 0.72−0.29 10.8−1.4 7.7−1.6 18.6−0.5 7.9−0.1 17.5−0.6 0.77−0.07 0.04−0.04 0.34−0.22 330−100 2.43−1.96 +0.22 +3.2 +1.1 +1.5 +0.2 +1.6 +0.11 +0.11 +0.36 +100 +0.46 IMRPhenomXPHM 0.72−0.28 10.8−1.4 7.8−1.6 18.6−0.5 7.9−0.1 17.5−0.6 0.78−0.07 0.05−0.04 0.35−0.23 350−100 2.41−1.99 +0.28 +12.3 +7.7 +11.1 +4.6 +10.3 +0.08 +0.15 +0.28 +1060 +1.82 IMRPhenomPv2 0.63−0.22 51.2−9.1 32.4−7.7 84.0−8.7 35.0−3.7 78.3−7.9 0.87−0.11 0.36−0.21 0.44−0.24 2840−1070 0.93−0.64 +0.28 +11.4 +8.1 +10.5 +4.6 +9.8 +0.08 +0.15 +0.28 +1060 +1.82 IMRPhenomXP 0.62−0.21 51.7−9.4 31.8−7.4 83.7−8.4 34.7−3.6 77.9−7.7 0.89−0.10 0.35−0.19 0.50−0.27 2890−1070 0.94−0.65 GW170729 +0.29 +9.2 +8.4 +9.7 +4.6 +8.8 +0.09 +0.17 +0.33 +1230 +1.68 IMRPhenomXPHM 0.58−0.19 52.5−9.5 30.5−7.7 82.6−8.0 34.1−4.0 77.5−7.5 0.86−0.15 0.30−0.25 0.42−0.26 2750−1120 1.05−0.69 +0.27 +12.0 +8.8 +11.0 +4.5 +9.9 +0.10 +0.18 +0.29 +1160 +1.62 IMRPhenomTPHM 0.52−0.18 57.3−10.9 29.4−7.2 87.1−9.3 35.1−3.9 81.7−8.5 0.87−0.15 0.33−0.24 0.43−0.25 2490−880 1.08−0.71 +0.24 +6.9 +4.2 +4.4 +1.6 +4.3 +0.12 +0.14 +0.35 +240 +0.34 IMRPhenomPv2 0.68−0.20 34.9−5.1 23.7−4.1 58.8−3.2 24.8−1.3 55.3−3.2 0.80−0.09 0.06−0.12 0.38−0.25 1030−300 2.57−0.48 +0.24 +6.8 +4.1 +4.3 +1.7 +4.2 +0.12 +0.13 +0.36 +250 +0.37 IMRPhenomXP 0.69−0.21 34.8−5.0 23.9−4.3 58.9−3.3 24.9−1.3 55.3−3.2 0.81−0.10 0.08−0.13 
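0.44−0.29 1050−320 2.53−0.47

GW170809 | IMRPhenomXPHM | $0.72^{+0.22}_{-0.22}$ | $33.8^{+6.5}_{-4.4}$ | $24.1^{+3.9}_{-4.4}$ | $58.1^{+3.9}_{-3.0}$ | $24.6^{+1.5}_{-1.2}$ | $54.5^{+3.7}_{-2.9}$ | $0.82^{+0.12}_{-0.10}$ | $0.08^{+0.14}_{-0.13}$ | $0.46^{+0.35}_{-0.30}$ | $1130^{+240}_{-270}$ | $2.64^{+0.31}_{-0.41}$
GW170809 | IMRPhenomTPHM | $0.78^{+0.17}_{-0.22}$ | $32.5^{+5.7}_{-3.5}$ | $25.3^{+3.1}_{-4.2}$ | $58.0^{+3.7}_{-2.9}$ | $24.8^{+1.5}_{-1.2}$ | $54.5^{+3.5}_{-2.8}$ | $0.79^{+0.10}_{-0.08}$ | $0.09^{+0.14}_{-0.13}$ | $0.37^{+0.32}_{-0.24}$ | $1090^{+240}_{-290}$ | $2.64^{+0.31}_{-0.43}$
GW170814 | IMRPhenomPv2 | $0.84^{+0.13}_{-0.19}$ | $30.5^{+4.1}_{-2.4}$ | $25.4^{+2.3}_{-3.1}$ | $56.0^{+2.7}_{-2.2}$ | $24.2^{+1.1}_{-0.9}$ | $52.3^{+2.5}_{-2.2}$ | $0.85^{+0.09}_{-0.11}$ | $0.07^{+0.10}_{-0.09}$ | $0.57^{+0.30}_{-0.36}$ | $590^{+120}_{-160}$ | $0.69^{+1.70}_{-0.39}$
GW170814 | IMRPhenomXP | $0.83^{+0.13}_{-0.18}$ | $30.3^{+3.9}_{-2.4}$ | $25.2^{+2.2}_{-3.1}$ | $55.5^{+2.6}_{-2.1}$ | $23.9^{+1.1}_{-0.9}$ | $52.1^{+2.5}_{-2.1}$ | $0.80^{+0.10}_{-0.08}$ | $0.06^{+0.10}_{-0.09}$ | $0.40^{+0.35}_{-0.27}$ | $610^{+130}_{-200}$ | $0.68^{+1.75}_{-0.42}$
GW170814 | IMRPhenomXPHM | $0.82^{+0.14}_{-0.18}$ | $30.6^{+4.0}_{-2.7}$ | $24.8^{+2.4}_{-3.2}$ | $55.5^{+2.5}_{-2.2}$ | $23.9^{+1.1}_{-0.9}$ | $52.1^{+2.4}_{-2.1}$ | $0.79^{+0.10}_{-0.08}$ | $0.06^{+0.10}_{-0.09}$ | $0.38^{+0.35}_{-0.26}$ | $610^{+140}_{-190}$ | $0.69^{+1.77}_{-0.40}$
GW170814 | IMRPhenomXPHM * | $0.83^{+0.13}_{-0.18}$ | $30.5^{+3.8}_{-2.5}$ | $25.3^{+2.3}_{-3.1}$ | $55.8^{+2.6}_{-2.2}$ | $24.0^{+1.1}_{-0.9}$ | $52.3^{+2.4}_{-2.1}$ | $0.81^{+0.10}_{-0.09}$ | $0.08^{+0.10}_{-0.09}$ | $0.44^{+0.33}_{-0.3}$ | $610^{+130}_{-170}$ | $0.67^{+1.68}_{-0.36}$
GW170814 | IMRPhenomTPHM | $0.77^{+0.18}_{-0.19}$ | $31.8^{+5.1}_{-3.2}$ | $24.6^{+2.8}_{-3.4}$ | $56.7^{+2.8}_{-2.4}$ | $24.3^{+1.0}_{-0.9}$ | $52.8^{+2.5}_{-2.1}$ | $0.86^{+0.09}_{-0.11}$ | $0.12^{+0.10}_{-0.10}$ | $0.58^{+0.28}_{-0.35}$ | $630^{+100}_{-140}$ | $0.60^{+1.80}_{-0.32}$
GW170818 | IMRPhenomPv2 | $0.75^{+0.19}_{-0.20}$ | $35.4^{+5.5}_{-3.9}$ | $26.5^{+3.6}_{-4.2}$ | $62.0^{+3.7}_{-3.1}$ | $26.5^{+1.5}_{-1.3}$ | $57.8^{+3.8}_{-3.1}$ | $0.85^{+0.09}_{-0.11}$ | $-0.11^{+0.14}_{-0.16}$ | $0.56^{+0.27}_{-0.32}$ | $1050^{+310}_{-280}$ | $2.44^{+0.35}_{-0.35}$
GW170818 | IMRPhenomXP | $0.78^{+0.17}_{-0.21}$ | $34.9^{+5.8}_{-3.6}$ | $27.2^{+3.4}_{-4.2}$ | $62.3^{+3.9}_{-3.3}$ | $26.7^{+1.6}_{-1.4}$ | $58.1^{+3.8}_{-3.3}$ | $0.86^{+0.10}_{-0.11}$ | $-0.05^{+0.14}_{-0.16}$ | $0.59^{+0.28}_{-0.36}$ | $1130^{+320}_{-330}$ | $2.46^{+0.37}_{-0.37}$
GW170818 | IMRPhenomXPHM | $0.78^{+0.17}_{-0.19}$ | $34.8^{+5.3}_{-3.7}$ | $26.9^{+3.3}_{-4.0}$ | $61.8^{+3.7}_{-3.2}$ | $26.5^{+1.6}_{-1.3}$ | $57.7^{+3.6}_{-3.1}$ | $0.85^{+0.10}_{-0.11}$ | $-0.05^{+0.13}_{-0.16}$ | $0.55^{+0.31}_{-0.34}$ | $1190^{+320}_{-340}$ | $2.51^{+0.35}_{-0.37}$
GW170818 | IMRPhenomXPHM * | $0.76^{+0.19}_{-0.19}$ | $35.1^{+5.4}_{-3.9}$ | $26.4^{+3.6}_{-4.1}$ | $61.6^{+3.6}_{-3.1}$ | $26.3^{+1.6}_{-1.3}$ | $57.5^{+3.6}_{-3.2}$ | $0.85^{+0.09}_{-0.10}$ | $-0.09^{+0.15}_{-0.17}$ | $0.57^{+0.28}_{-0.32}$ | $1130^{+330}_{-310}$ | $2.48^{+0.35}_{-0.34}$
GW170818 | IMRPhenomTPHM | $0.77^{+0.18}_{-0.20}$ | $35.1^{+5.8}_{-3.8}$ | $26.9^{+3.4}_{-4.0}$ | $62.3^{+3.6}_{-3.1}$ | $26.6^{+1.5}_{-1.3}$ | $58.3^{+3.5}_{-3.1}$ | $0.84^{+0.10}_{-0.10}$ | $-0.06^{+0.14}_{-0.15}$ | $0.52^{+0.30}_{-0.31}$ | $1090^{+340}_{-320}$ | $2.47^{+0.36}_{-0.36}$
GW170823 | IMRPhenomPv2 | $0.74^{+0.20}_{-0.23}$ | $39.3^{+8.2}_{-5.5}$ | $28.9^{+5.0}_{-5.7}$ | $68.2^{+7.7}_{-5.7}$ | $29.0^{+3.2}_{-2.5}$ | $63.9^{+7.2}_{-5.4}$ | $0.83^{+0.11}_{-0.10}$ | $0.06^{+0.16}_{-0.16}$ | $0.47^{+0.34}_{-0.30}$ | $1940^{+650}_{-710}$ | $1.53^{+1.24}_{-1.16}$
GW170823 | IMRPhenomXP | $0.73^{+0.21}_{-0.25}$ | $39.2^{+8.2}_{-5.6}$ | $28.4^{+5.1}_{-6.4}$ | $67.5^{+7.4}_{-5.8}$ | $28.6^{+3.2}_{-2.6}$ | $63.1^{+6.9}_{-5.5}$ | $0.84^{+0.11}_{-0.11}$ | $0.06^{+0.16}_{-0.17}$ | $0.52^{+0.33}_{-0.33}$ | $1930^{+670}_{-700}$ | $1.68^{+1.06}_{-1.27}$
GW170823 | IMRPhenomXPHM | $0.77^{+0.18}_{-0.24}$ | $37.7^{+7.2}_{-4.9}$ | $28.5^{+4.7}_{-5.7}$ | $66.1^{+6.9}_{-5.2}$ | $28.2^{+3.0}_{-2.3}$ | $61.8^{+6.4}_{-4.9}$ | $0.84^{+0.10}_{-0.10}$ | $0.07^{+0.16}_{-0.17}$ | $0.52^{+0.32}_{-0.32}$ | $2170^{+650}_{-720}$ | $1.28^{+1.53}_{-0.96}$
GW170823 | IMRPhenomTPHM | $0.79^{+0.17}_{-0.23}$ | $37.8^{+7.2}_{-4.7}$ | $29.2^{+4.4}_{-5.2}$ | $67.0^{+7.0}_{-5.3}$ | $28.6^{+3.0}_{-2.3}$ | $62.8^{+6.5}_{-5.0}$ | $0.81^{+0.10}_{-0.09}$ | $0.07^{+0.17}_{-0.16}$ | $0.43^{+0.34}_{-0.27}$ | $2050^{+640}_{-700}$ | $1.80^{+1.03}_{-1.49}$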

TABLE III. Inferred source parameter values of the first 10 BBH detections, given as posterior median values with 90% credible intervals. Masses correspond to the source frame. We compare the public IMRPhenomPv2 results [1] with those obtained with the new generation of phenomenological waveform models. IMRPhenomXP and the default version of IMRPhenomXPHM are listed for all events, IMRPhenomTPHM is included for high-mass events, and an alternative version of IMRPhenomXPHM (NNLO angles and final spin version 2) is included for GW170814 and GW170818, indicated by a *.
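As an illustration of how entries of the form "median with 90% credible interval" can be obtained from a set of posterior samples, the following minimal sketch uses the 5th, 50th and 95th percentiles. The equal-tailed choice of interval and the helper name `median_and_90ci` are assumptions made here for illustration; the samples are synthetic and are not data from this analysis.

```python
import numpy as np


def median_and_90ci(samples):
    """Return (median, plus, minus) so that the value reads median^{+plus}_{-minus}.

    Assumes an equal-tailed 90% interval (5th to 95th percentile).
    """
    lo, med, hi = np.percentile(np.asarray(samples, dtype=float), [5.0, 50.0, 95.0])
    return med, hi - med, med - lo


if __name__ == "__main__":
    # Synthetic stand-in for posterior samples of a source-frame mass (hypothetical values).
    rng = np.random.default_rng(0)
    toy_m1 = rng.normal(35.0, 3.0, size=20000)
    med, plus, minus = median_and_90ci(toy_m1)
    print(f"m1 = {med:.1f} +{plus:.1f} -{minus:.1f} solar masses")
```

For asymmetric posteriors the upper and lower offsets differ, which is why the table quotes them separately.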

Bayes factors between the signal and noise hypotheses for each of these runs, and compare these Bayes factors against our default IMRPhenomXPHM runs. Bayes factors and SNRs are in general very consistent for different waveform models, confirming the expectation that neither precession, higher mode effects, nor general waveform systematics effects can be clearly identified for any of the GWTC-1 events.

In the rest of this section, we will discuss all ten BBH events from GWTC-1 individually, grouping them into low-mass (GW151012, GW151226, GW170608), medium-mass (GW170104 and GW170814) and high-mass (GW150914, GW170729, GW170809, GW170818, GW170823) events. Note that we just make this distinction for convenience, and the only events that clearly stand out in terms of total mass are the two lightest ones (GW170608 and GW151226) and the heaviest (GW170729).

TABLE IV. Network matched-filter SNRs with 90% credible intervals and log10 signal-to-noise Bayes factors for runs with the IMRPhenomX and IMRPhenomT families, where XPHM∗ denotes the version with NNLO Euler angles and final spin option 2. The last column shows the Bayes factor against the default IMRPhenomXPHM runs.

Event | Approx. | SNR | log10 BF | BF vs. XPHM
GW150914 | XAS | $25.1^{+0.1}_{-0.1}$ | $124.07 \pm 0.07$ | $0.99^{+0.24}_{-0.19}$
GW150914 | XHM | $25.1^{+0.1}_{-0.1}$ | $124.08 \pm 0.07$ | $1.01^{+0.24}_{-0.20}$
GW150914 | THM | $25.1^{+0.1}_{-0.1}$ | $123.83 \pm 0.07$ | $0.57^{+0.14}_{-0.11}$
GW150914 | XP | $25.1^{+0.1}_{-0.1}$ | $124.05 \pm 0.07$ | $0.95^{+0.23}_{-0.19}$
GW150914 | TPHM | $25.1^{+0.1}_{-0.1}$ | $123.86 \pm 0.07$ | $0.61^{+0.15}_{-0.12}$
GW150914 | XPHM | $25.1^{+0.1}_{-0.1}$ | $124.07 \pm 0.07$ | $1.00^{+0.24}_{-0.20}$
GW151012 | XAS | $9.2^{+0.2}_{-0.3}$ | $10.13 \pm 0.06$ | $0.58^{+0.12}_{-0.10}$
GW151012 | XHM | $9.4^{+0.3}_{-0.4}$ | $10.27 \pm 0.06$ | $0.81^{+0.16}_{-0.14}$
GW151012 | THM | $9.4^{+0.3}_{-0.4}$ | $10.37 \pm 0.06$ | $1.02^{+0.2}_{-0.17}$
GW151012 | XP | $9.2^{+0.2}_{-0.3}$ | $10.25 \pm 0.06$ | $0.77^{+0.15}_{-0.13}$
GW151012 | XPHM | $9.3^{+0.3}_{-0.4}$ | $10.36 \pm 0.06$ | $1.00^{+0.20}_{-0.17}$
GW151226 | XAS | $12.4^{+0.2}_{-0.2}$ | $22.04 \pm 0.06$ | $0.26^{+0.06}_{-0.05}$
GW151226 | XHM | $12.4^{+0.2}_{-0.3}$ | $22.03 \pm 0.06$ | $0.26^{+0.06}_{-0.05}$
GW151226 | THM | $12.4^{+0.2}_{-0.3}$ | $22.22 \pm 0.06$ | $0.40^{+0.10}_{-0.08}$
GW151226 | XP | $12.5^{+0.2}_{-0.3}$ | $22.60 \pm 0.07$ | $0.95^{+0.23}_{-0.18}$
GW151226 | XPHM | $12.6^{+0.3}_{-0.3}$ | $22.62 \pm 0.07$ | $1.00^{+0.24}_{-0.20}$
GW170104 | XAS | $13.9^{+0.1}_{-0.2}$ | $32.10 \pm 0.06$ | $0.79^{+0.17}_{-0.14}$
GW170104 | XHM | $13.9^{+0.1}_{-0.2}$ | $32.11 \pm 0.06$ | $0.81^{+0.18}_{-0.14}$
GW170104 | THM | $13.9^{+0.1}_{-0.2}$ | $31.95 \pm 0.06$ | $0.57^{+0.12}_{-0.10}$
GW170104 | XP | $13.9^{+0.1}_{-0.2}$ | $32.28 \pm 0.06$ | $1.21^{+0.26}_{-0.21}$
GW170104 | XPHM | $14.0^{+0.2}_{-0.2}$ | $32.20 \pm 0.06$ | $1.00^{+0.22}_{-0.18}$
GW170608 | XAS | $15.5^{+0.1}_{-0.2}$ | $39.27 \pm 0.07$ | $0.93^{+0.23}_{-0.18}$
GW170608 | XHM | $15.5^{+0.1}_{-0.2}$ | $39.24 \pm 0.07$ | $0.86^{+0.21}_{-0.17}$
GW170608 | THM | $15.5^{+0.1}_{-0.2}$ | $39.33 \pm 0.07$ | $1.06^{+0.26}_{-0.21}$
GW170608 | XP | $15.5^{+0.1}_{-0.2}$ | $39.29 \pm 0.07$ | $0.96^{+0.24}_{-0.19}$
GW170608 | XPHM | $15.5^{+0.2}_{-0.2}$ | $39.31 \pm 0.07$ | $1.00^{+0.25}_{-0.20}$
GW170729 | XAS | $10.7^{+0.3}_{-0.3}$ | $15.97 \pm 0.06$ | $0.64^{+0.14}_{-0.11}$
GW170729 | XHM | $10.9^{+0.3}_{-0.4}$ | $16.32 \pm 0.06$ | $1.47^{+0.31}_{-0.26}$
GW170729 | THM | $10.9^{+0.3}_{-0.4}$ | $16.35 \pm 0.06$ | $1.56^{+0.34}_{-0.28}$
GW170729 | XP | $10.7^{+0.3}_{-0.3}$ | $16.03 \pm 0.06$ | $0.74^{+0.16}_{-0.13}$
GW170729 | TPHM | $10.9^{+0.3}_{-0.4}$ | $16.39 \pm 0.06$ | $1.70^{+0.37}_{-0.30}$
GW170729 | XPHM | $10.8^{+0.3}_{-0.3}$ | $16.16 \pm 0.06$ | $1.00^{+0.21}_{-0.18}$
GW170809 | XAS | $12.6^{+0.2}_{-0.2}$ | $24.12 \pm 0.06$ | $0.97^{+0.21}_{-0.17}$
GW170809 | XHM | $12.6^{+0.2}_{-0.2}$ | $24.10 \pm 0.06$ | $0.93^{+0.21}_{-0.17}$
GW170809 | THM | $12.6^{+0.2}_{-0.2}$ | $23.95 \pm 0.06$ | $0.66^{+0.15}_{-0.12}$
GW170809 | XP | $12.6^{+0.2}_{-0.2}$ | $24.13 \pm 0.06$ | $1.00^{+0.22}_{-0.18}$
GW170809 | TPHM | $12.6^{+0.2}_{-0.2}$ | $23.79 \pm 0.06$ | $0.45^{+0.10}_{-0.08}$
GW170809 | XPHM | $12.6^{+0.2}_{-0.3}$ | $24.13 \pm 0.06$ | $1.00^{+0.22}_{-0.18}$
GW170814 | XAS | $17.4^{+0.1}_{-0.2}$ | $53.73 \pm 0.07$ | $1.38^{+0.33}_{-0.27}$
GW170814 | XHM | $17.5^{+0.1}_{-0.2}$ | $53.69 \pm 0.07$ | $1.25^{+0.30}_{-0.24}$
GW170814 | THM | $17.4^{+0.1}_{-0.2}$ | $53.57 \pm 0.07$ | $0.95^{+0.23}_{-0.18}$
GW170814 | XP | $17.4^{+0.1}_{-0.2}$ | $53.62 \pm 0.07$ | $1.06^{+0.26}_{-0.21}$
GW170814 | TPHM | $17.5^{+0.2}_{-0.2}$ | $54.14 \pm 0.07$ | $3.48^{+0.85}_{-0.68}$
GW170814 | XPHM* | $17.5^{+0.1}_{-0.2}$ | $53.89 \pm 0.07$ | $1.97^{+0.48}_{-0.39}$
GW170814 | XPHM | $17.5^{+0.1}_{-0.2}$ | $53.60 \pm 0.07$ | $1.00^{+0.24}_{-0.20}$
GW170818 | XAS | $11.7^{+0.2}_{-0.3}$ | $19.91 \pm 0.06$ | $0.25^{+0.06}_{-0.05}$
GW170818 | XHM | $11.8^{+0.2}_{-0.3}$ | $20.11 \pm 0.06$ | $0.39^{+0.09}_{-0.07}$
GW170818 | THM | $11.8^{+0.2}_{-0.3}$ | $19.97 \pm 0.06$ | $0.29^{+0.07}_{-0.05}$
GW170818 | XP | $11.9^{+0.2}_{-0.3}$ | $20.27 \pm 0.07$ | $0.57^{+0.13}_{-0.11}$
GW170818 | TPHM | $11.9^{+0.2}_{-0.3}$ | $20.32 \pm 0.06$ | $0.63^{+0.15}_{-0.12}$
GW170818 | XPHM* | $12.0^{+0.2}_{-0.3}$ | $20.73 \pm 0.07$ | $1.64^{+0.39}_{-0.31}$
GW170818 | XPHM | $11.9^{+0.2}_{-0.3}$ | $20.51 \pm 0.06$ | $1.00^{+0.24}_{-0.19}$
GW170823 | XAS | $12.0^{+0.1}_{-0.2}$ | $22.91 \pm 0.05$ | $0.78^{+0.15}_{-0.13}$
GW170823 | XHM | $12.0^{+0.1}_{-0.2}$ | $22.90 \pm 0.06$ | $0.77^{+0.15}_{-0.13}$
GW170823 | THM | $12.0^{+0.1}_{-0.2}$ | $22.81 \pm 0.06$ | $0.63^{+0.12}_{-0.10}$
GW170823 | XP | $12.0^{+0.1}_{-0.2}$ | $23.07 \pm 0.05$ | $1.13^{+0.22}_{-0.19}$
GW170823 | TPHM | $12.0^{+0.2}_{-0.2}$ | $22.73 \pm 0.06$ | $0.52^{+0.10}_{-0.09}$
GW170823 | XPHM | $12.0^{+0.1}_{-0.2}$ | $23.01 \pm 0.06$ | $1.00^{+0.20}_{-0.17}$

FIG. 1. Jensen-Shannon divergences between IMRPhenomPv2 [1] and IMRPhenomXPHM results of each event, which are ordered from left to right with increasing chirp mass $\mathcal{M}$. Top panel: intrinsic parameters. Bottom panel: extrinsic parameters.

B. Low-mass events

For low-mass systems, the observable waveform in the detectors has a long duration due to its long inspiral phase. In general, the chirp mass is the best measured intrinsic parameter, since it is the quantity that appears in the PN waveform at leading order [60].

In Fig. 3, we compare several parameter estimation runs for GW151012, GW151226 and GW170608 with the standard settings mentioned in Sec. II C 2: using distance marginalization, nlive = 2048 and nact = 30. We compare the whole IMRPhenomX family (the aligned-spin, precessing, dominant and subdominant mode models), the more recent aligned-spin time domain model IMRPhenomTHM with higher-modes content, and the previous IMRPhenomPv2 results taken from the first catalog [1]. As mentioned in Sec. II C 1, for these low-mass events we use a sampling rate of 4096 Hz and the GWOSC 16 kHz data sets, and PSDs and calibration envelope files which go up to 2048 Hz. All models show good agreement in total mass and χeff. GW151012 and GW151226 are the events with the largest uncertainty in mass ratio.

We note that [28] (for GW151226) and [29] have reported results that are in tension with ours, including multi-modal posteriors. In order to further increase the confidence in our results we have therefore also performed tests with a different sampling method in Bilby and with the independent LALInference code for the three lowest mass events. These checks confirm our results as discussed in Appendix A. We also show how lowering the sampling rate can result in bi-modal posteriors.

FIG. 2. Violin plots of the IMRPhenomXPHM default results for χeff and q, where the short horizontal lines indicate the 90% credible intervals. The events are sorted by the $\mathcal{M}$ values. Plots created with the PESummary code [74].

1. GW151012

This is the low-mass event with the smallest SNR ($9.3^{+0.3}_{-0.4}$ for IMRPhenomXPHM) and Bayes factor, see Table IV. Bayes factor differences for different waveform models are within the error estimates, except for a small suppression of IMRPhenomXAS and IMRPhenomXP in comparison to IMRPhenomXPHM of $0.58^{+0.12}_{-0.10}$ and $0.77^{+0.15}_{-0.13}$ respectively. As can be seen in Fig. 3, the models with higher modes shift the mass ratio posterior toward more unequal masses. Runs with our two different precession versions (NNLO and MSA angles) and with different final spin descriptions show very consistent posterior distributions and Bayes factors that are consistent within the error estimates.

2. GW151226

This event shows support only for positive values of χeff, and very mild support for precession, with a Bayes factor of $3.85^{+0.92}_{-0.74}$ between IMRPhenomXHM and IMRPhenomXPHM in favor of precession, see Table V. Some differences are seen in Fig. 3 regarding the mass ratio posterior, and in particular IMRPhenomTHM shifts the mass ratio support toward more equal masses, compared with the other models. Fig. 4 also shows that adding higher mode content shifts the precession parameter χp toward larger values, but again all results are consistent within error estimates. See also Appendix A for further consistency tests and comparisons with other results on this event.

3. GW170608

GW170608 is the event with the lowest mass, the closest luminosity distance, and the loudest of the low-mass events. Bayes factors for all waveform models are consistent within the error estimates, and posterior distributions are very similar, including for all versions of IMRPhenomXPHM which we have run (NNLO and MSA Euler angles and the four final spin versions for each angle description).

C. Medium-mass events

Fig. 5 provides a comparison of posteriors for several key quantities for the two medium-mass events GW170104 and GW170814. For these, all different models recover essentially the same total mass and χeff posterior distributions. We discuss further details below.

TABLE V. Comparison of Bayes factors between precessing and aligned-spin IMRPhenomX and IMRPhenomT waveform models for GW151226, GW170814 and GW170818.

Comparison | GW151226 | GW170814 | GW170818
XP vs XAS | $3.63^{+0.86}_{-0.69}$ | $0.77^{+0.18}_{-0.15}$ | $2.27^{+0.53}_{-0.43}$
XPHM vs XHM | $3.85^{+0.92}_{-0.74}$ | $0.80^{+0.19}_{-0.16}$ | $2.54^{+0.60}_{-0.48}$
TPHM vs THM | - | $3.66^{+0.89}_{-0.71}$ | $2.22^{+0.52}_{-0.42}$
TPHM vs XPHM | - | $3.48^{+0.85}_{-0.68}$ | $0.63^{+0.15}_{-0.12}$
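The Bayes factors between two waveform models quoted in Tables IV and V can be cross-checked from the log10 signal-to-noise Bayes factors in Table IV, since the ratio of two evidences is 10 raised to the difference of their log10 values. The sketch below does this for GW151226 (IMRPhenomXP against IMRPhenomXAS). Combining the two quoted uncertainties in quadrature is an assumption made here for illustration; it is not stated in the text, although it is compatible with the tabulated entry. The function name `bayes_factor_ratio` is likewise illustrative.

```python
import math


def bayes_factor_ratio(log10_bf_a, err_a, log10_bf_b, err_b):
    """Bayes factor of model A relative to model B, with asymmetric errors.

    Assumption: the uncertainties of the two log10 Bayes factors are
    independent and are added in quadrature before exponentiating.
    Returns (bf, upper_error, lower_error).
    """
    delta = log10_bf_a - log10_bf_b
    sigma = math.hypot(err_a, err_b)
    bf = 10.0 ** delta
    upper = 10.0 ** (delta + sigma) - bf
    lower = bf - 10.0 ** (delta - sigma)
    return bf, upper, lower


if __name__ == "__main__":
    # log10 BF values for GW151226 taken from Table IV: XP = 22.60 +/- 0.07, XAS = 22.04 +/- 0.06.
    bf, up, lo = bayes_factor_ratio(22.60, 0.07, 22.04, 0.06)
    print(f"BF(XP vs XAS) = {bf:.2f} +{up:.2f} -{lo:.2f}")
```

Because the exponentiation is nonlinear, symmetric errors on log10 BF translate into asymmetric errors on the Bayes factor itself, with the upper error larger than the lower one.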

FIG. 3. Posterior distributions for total mass, mass ratio, effective spin and luminosity distance for all three low-mass events and different waveform models.
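Agreement between one-dimensional posteriors obtained with two waveform models, such as those compared in these figures, is quantified in this paper by the Jensen-Shannon (JS) divergence in bits. The following minimal sketch estimates it from two sets of samples using a shared histogram. The histogram estimator, the number of bins and the function name `js_divergence_bits` are assumptions for illustration only; the analysis may use a different density estimate (for example a kernel density estimate), so absolute values need not match the quoted ones.

```python
import numpy as np


def js_divergence_bits(samples_p, samples_q, bins=100):
    """Histogram-based Jensen-Shannon divergence (base 2) between two 1D sample sets.

    The result lies between 0 (identical histograms) and 1 bit (disjoint support).
    Assumes the combined samples span a non-zero range.
    """
    p_s = np.asarray(samples_p, dtype=float)
    q_s = np.asarray(samples_q, dtype=float)
    edges = np.linspace(min(p_s.min(), q_s.min()), max(p_s.max(), q_s.max()), bins + 1)
    p, _ = np.histogram(p_s, bins=edges)
    q, _ = np.histogram(q_s, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Kullback-Leibler divergence in bits; empty bins of `a` contribute zero.
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    # Synthetic samples standing in for two posteriors (hypothetical values).
    rng = np.random.default_rng(1)
    a = rng.normal(0.00, 0.10, size=50000)
    b = rng.normal(0.02, 0.10, size=50000)
    print(f"JS divergence: {js_divergence_bits(a, b):.4f} bits")
```

With finite sample sizes the histogram estimate carries a small positive bias, so values close to zero should be interpreted with some care.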

FIG. 4. Joint posterior distributions for χp and χeff of GW151226, estimated with IMRPhenomXP (pink), IMRPhenomXPHM (blue) and IMRPhenomPv2 (orange) [1]. Here and in similar figures throughout the paper, the central panel shows the 2D joint posterior with contours marking 90% credible intervals, while the smaller panels on top and to the right show the corresponding 1D distributions for the individual parameters, with the 90% credible interval indicated by the dashed lines.

1. GW170104

This is the first event of the O2 observation run, and the fourth loudest BBH event at an SNR of $14.0^{+0.2}_{-0.2}$ for all the waveform models that we have considered. Bayes factors for different models are consistent within error estimates, apart from the Bayes factor of IMRPhenomTHM, which is suppressed relative to IMRPhenomXPHM by $0.57^{+0.12}_{-0.10}$, see Table IV. (IMRPhenomTPHM was only run for higher-mass events.) The higher-mode models shift the posterior marginally toward more equal masses and larger distance. We also find that IMRPhenomXPHM versions give consistent results regarding posterior distributions and Bayes factors, and we do not list these results separately but rather refer to our data release for full details.

2. GW170814

GW170814 is the second loudest BBH event with an SNR of $17.5^{+0.1}_{-0.2}$, and a well recovered sky position due to being observed by both LIGO (Hanford and Livingston) detectors and the Virgo detector. Indeed GW170814 was the first BBH published as a coincident observation between the three LIGO-Virgo detectors [75].

Apart from the effective precession parameter χp, the posterior distributions for different waveform models are rather similar for other quantities, with only marginal differences for masses and mass ratio. Just as GW151226, and to a lesser degree GW170818, GW170814 does however show marginal support for precession, as expressed by the Bayes factors listed in Table V, and it is an interesting case from the point of view of waveform systematics. GW170814 shows the largest value of the JS divergence when comparing our IMRPhenomXPHM results to the GWTC-1 IMRPhenomPv2 results, corresponding to $\mathrm{JS}_{\chi_p} = 0.042$ for the effective in-plane spin χp. The default version of IMRPhenomXPHM in fact recovers a lower value of χp than IMRPhenomPv2, see Table III. However it turns out that when using the NNLO Euler angles rather than the MSA angles for IMRPhenomXPHM, a larger value of $\chi_p = 0.44^{+0.33}_{-0.30}$ is recovered, and also a larger Bayes factor of $1.97^{+0.48}_{-0.39}$ with respect to the default IMRPhenomXPHM version (see Tables III and IV). Indeed both IMRPhenomXAS and IMRPhenomXHM recover marginally better Bayes factors than the default version of IMRPhenomXPHM. Also IMRPhenomTPHM has a higher Bayes factor, $3.48^{+0.85}_{-0.68}$, and recovers a higher value of $\chi_p = 0.58^{+0.28}_{-0.35}$, very close to the value for IMRPhenomPv2. We believe that this is likely to be a case where the MSA description of the precession Euler angles is less accurate. Comparisons of posterior distributions for the mass ratio, total mass, χeff and χp for this event using our different precessing waveform models are shown in Fig. 6.

FIG. 5. Posterior distributions for total mass, mass ratio, effective spin and luminosity distance are compared for the medium-mass events and different waveform models.

D. High-mass events

Fig. 7 shows a comparison of posteriors for several key quantities. For high mass events, we also add the IMRPhenomTPHM waveform model to our comparisons.

1. GW150914

GW150914 was the first detection of a GW signal [76] and has the highest SNR of events consistent with BBH mergers to date (in the O1, O2 [1] and O3a [77] observing runs).

Bayes factors in Table IV are consistent between different models within error estimates, apart from IMRPhenomTHM and IMRPhenomTPHM, which are disfavored by factors of $0.57^{+0.14}_{-0.11}$ (THM) and $0.61^{+0.15}_{-0.12}$ (TPHM) with respect to the default IMRPhenomXPHM version. Posterior results show only very marginal deviations, see Fig. 7 and Tables III, IV. The same holds for all the eight versions of IMRPhenomXPHM we have tested (NNLO and MSA Euler angles and four final spin versions for each angle description): the default version of IMRPhenomXPHM has the largest Bayes factor, and the most disfavored version of IMRPhenomXPHM (NNLO angles with final spin version FS2) has a Bayes factor of $0.69^{+0.17}_{-0.13}$ with respect to the default version, which is still broadly consistent within the error estimates. Posteriors between different IMRPhenomXPHM versions are in general very similar and without noteworthy changes.

2. GW170729

This is the most massive and most distant BBH merger detected during O1 and O2, with substantial support for unequal masses [1]. In consequence, parameter estimation with different waveform models with higher-mode content has been performed in [13], and it is thus particularly interesting to compare previous results with the IMRPhenomX and IMRPhenomT families. In addition, we test how posterior distributions for this event change with different mass and distance priors. GW170729 is also interesting since its recovered effective spin is positive within error estimates. Unfortunately the SNR is relatively low, at $10.8^{+0.3}_{-0.3}$ for IMRPhenomXPHM and $10.9^{+0.3}_{-0.4}$ for IMRPhenomTPHM. Only GW151012 has lower SNR within our set of ten events. Consequently, posterior distributions for GW170729 are relatively broad. We find that all eight versions of IMRPhenomXPHM yield Bayes factors and posterior distributions that are consistent within error estimates.

The more massive component is situated at the lower end of the pair instability supernova (PISN) mass gap, which is placed at between approximately 52 and 133 solar masses by [78], see Table III, with the edges of the hypothetical mass gap however uncertain. See [79] for a more recent overview. IMRPhenomTPHM, which provides the best fit to the data and recovers the largest Bayes factor, recovers the highest value for the larger mass, $m_1 = 57.3^{+12.0}_{-10.9}\,M_\odot$, potentially situating it beyond the onset of the mass gap.

a. Higher-mode content: As demonstrated by the Bayes factors in Table VI, we find marginal evidence of higher-order mode content in this event. Our Bayes factors in favor of their presence for non-precessing models are about half of the value of 5.1 : 1 found in [13] for the model IMRPhenomHM against IMRPhenomD, but larger than the value of 1.4 obtained in [1] using the RapidPE method [14, 15] with a catalog of NR and NRSur7dq2 [16] waveforms, where the mean value of the mass ratio has been found to decrease from 0.66 to 0.61 when including higher modes.

In Fig. 8 we see that IMRPhenomPv2 and the corresponding newer precessing model for the dominant quadrupole, IMRPhenomXP, give very similar results, while the non-precessing

FIG. 6. Comparison of posteriors for GW170814, indicating 90% credible intervals. Upper panel: total mass, mass ratio, and spin parameters for the MSA (blue) and NNLO (pink) versions of IMRPhenomXPHM and for IMRPhenomTPHM (green). Lower panel: distance and inclination (left) and sky position (right) are shown for the dominant-mode models IMRPhenomXAS, IMRPhenomXP (light and dark pink) and IMRPhenomPv2 (orange), and the multi-mode models IMRPhenomXHM, IMRPhenomXPHM (light and dark blue) and IMRPhenomTHM (green).

multi-mode waveform models recover lower mass ratio and higher luminosity distance, as was also observed in [13]. Adding not only higher modes but precession, we obtain more evidence for unequal masses for IMRPhenomTPHM, as seen in Fig. 9.

In Figs. 8 and 9 we also directly compare the frequency and time domain families IMRPhenomX and IMRPhenomT. We can see that the distribution of likelihood values across the posterior for IMRPhenomTPHM is shifted toward larger values, and indeed we obtain a Bayes factor of $2.16^{+0.47}_{-0.39}$ in favor of IMRPhenomTPHM when compared to IMRPhenomXPHM, while the aligned-spin versions, IMRPhenomTHM and IMRPhenomXHM, give much more similar results.

FIG. 8. Posterior distributions of GW170729 for the total mass, mass ratio, luminosity distance and inclination using the dominant-mode models IMRPhenomPv2 (orange) [1] and IMRPhenomXP (light pink), the subdominant-mode models IMRPhenomHM (purple), SEOBNRv4HM (olive green) [13], IMRPhenomXHM (light blue) and IMRPhenomTHM (light green). All contours are at 90% credible intervals.

TABLE VI. Comparison of Bayes factors between IMRPhenomX and IMRPhenomT families assuming the hypothesis that the signal of GW170729 contains higher modes. The last column compares the 22-mode models of PhenomX against the HM models of PhenomT.

Hypotheses | Model properties | PhenomX | PhenomT/X
HM vs $\ell = 2 = \lvert m \rvert$ | aligned | $2.27^{+0.48}_{-0.40}$ | $2.43^{+0.51}_{-0.42}$
HM vs $\ell = 2 = \lvert m \rvert$ | precessing | $1.35^{+0.29}_{-0.24}$ | $2.92^{+0.63}_{-0.52}$

b. Comparison of different mass and distance priors: Given that GW170729 is the furthest event of the first catalog and has strong support for unequal masses, we compare the different distance and mass prior choices explained in Sec. II C 3, in order to see their influence on the posterior distributions. For this study we run IMRPhenomXPHM using nlive = 2048 and nact = 10. Fig. 10 shows that there is a shift to lower distance and masses when we change from the power-law distance prior to the prior that is uniform in comoving source-frame volume. However, varying the mass priors does not affect the posteriors. In order to quantify how much the posteriors change, we compute the JS divergence between the results. In Table VII we show, for every possible combination of results, which parameter has the highest divergence. If we compare the results that have different mass priors, all divergences are below 0.003, meaning that the posterior results vary only minimally. Comparing different distance priors we obtain higher divergences, where the luminosity distance is the parameter with the largest values, $\mathrm{JS}_{d_L} = 0.015$ and 0.012.

TABLE VII. Maximum Jensen-Shannon (JS) divergence values between the posterior distributions for GW170729 estimated with IMRPhenomXPHM using different priors in distance (proportional to the luminosity distance squared, or uniform in the comoving volume and source-frame time) and masses (flat in component masses, or inverse mass ratio).

Priors | $D_L^2$ - Flat m1,2 priors | Uni. Comov. DL - Flat m1,2 priors | $D_L^2$ - Uni. Flat m1,2 priors | $D_L^2$ - 1/q priors
$D_L^2$ - Flat m1,2 priors | - | JS_DL = 0.0152 | JS_ra = 0.0029 | JS_M = 0.0009
Uni. Comov. DL - Flat m1,2 priors | | - | JS_DL = 0.0128 | JS_DL = 0.0119
$D_L^2$ - Uni. Flat m1,2 priors | | | - | JS_ra = 0.0030
$D_L^2$ - 1/q priors | | | | -

FIG. 7. Posteriors for total mass, mass ratio, χeff and luminosity distance are shown for high-mass events and different waveform models.

3. GW170809

The posterior for this event is consistent with vanishing spins (both χeff and χp), and θJN is relatively well measured to correspond to a face-off orientation. The masses are slightly more unequal than for GW150914 and the total mass is slightly smaller. The event is however located at approximately twice the distance, corresponding to only about half the SNR, $12.6^{+0.2}_{-0.3}$ for IMRPhenomXPHM. Similar to GW150914, the Bayes factors for the MSA-angle based versions of IMRPhenomXPHM, IMRPhenomXHM and IMRPhenomXAS are also consistent within error estimates, while the NNLO-based versions of IMRPhenomTHM and even more of IMRPhenomTPHM are disfavored. MSA versions recover slightly higher χp than the NNLO-based versions, otherwise we find no visible differences in posteriors between IMRPhenomXPHM versions.

In Fig. 7 one can see that the waveforms with subdominant modes have more support for equal masses. This effect is strongest for IMRPhenomTPHM, which is however disfavored by the Bayes factor. In addition, the higher-mode models marginally shift the total mass to lower values, and the luminosity distance to higher values.

Using the RapidPE method with a catalog of numerical higher-mode waveforms, [1] found a revised χeff distribution which is symmetric about a median value of zero, while our higher-mode models show no such effect.

4. GW170818

GW170818 is a heavy event most consistent with a negative effective spin, $\chi_{\rm eff} = -0.11^{+0.14}_{-0.16}$ for IMRPhenomPv2 [1]. However, in our analysis, using the IMRPhenomX and IMRPhenomT families, the effective spin parameter is shifted toward zero, as can be seen in Fig. 12. Unfortunately this is also the third faintest event, which limits the information that can be extracted, compared to other events. There is weak evidence for precession, with a Bayes factor for XPHM against XHM of $2.54^{+0.60}_{-0.48}$, see Table V.

The IMRPhenomX and IMRPhenomT families show good consistency in Figs. 7 and 12, and a comparison of posteriors for different IMRPhenomXPHM versions with the results for IMRPhenomPv2 is shown in Fig. 11. IMRPhenomXPHM with NNLO Euler angles obtains a marginally higher Bayes factor than the default version, by a factor $1.97^{+0.48}_{-0.39}$ (see Table IV), and the corresponding posteriors for χeff and the luminosity distance are more similar to the IMRPhenomPv2 results.

5. GW170823

GW170823 is the second most massive and second most distant event. The SNR is relatively low at $12.0^{+0.1}_{-0.2}$ for IMRPhenomXPHM, limiting the information that can be extracted. Indeed, in Fig. 7 we find very good agreement between the posteriors obtained with different waveform models.

Due to the large distance we compare two distance priors, uniform in comoving volume and proportional to the luminosity distance squared, as we do in Sec. III D 2 for GW170729. As expected, we find that the differences between both priors are smaller than for GW170729. The luminosity distance is the parameter with the highest JS divergence when comparing the two distance prior choices; however, all divergence values are lower than 0.007. The impact of the distance prior is very minor, as illustrated by Fig. 13, where we compare the runs with different priors all using IMRPhenomXPHM.

We find the Bayes factors between different IMRPhenomXPHM versions to be consistent within error estimates, and find no notable differences in the respective posterior distributions for the source parameters.

IV. CONCLUSIONS

In this paper we have obtained parameter estimation results for the BBH events in GWTC-1 with the new fourth generation of phenomenological waveform models. This has served


FIG. 9. Posterior distributions of GW170729 for the mass ratio, total mass, chirp mass, effective spin parameter, luminosity distance, inclination and log likelihood using the dominant-mode model IMRPhenomXP (light pink) and the subdominant-mode models IMRPhenomXHM (light blue), IMRPhenomXPHM (dark blue) and IMRPhenomTHM (light green), IMRPhenomTPHM (dark green). All contours are at 90% credible intervals. as a stress test before using these models for new events ob- RPhenomTPHM in [8], and here we briefly summarize the served in the third observation run, O3. Our results confirm results of those studies in the context of the GWTC-1 results. the expectations that subdominant harmonics and waveform GW190412 came from a binary with a total mass of approxi- systematics only have minor consequences for parameter esti- mately 45 solar masses at a mass ratio m1/m2 ≈ 3, and was the mation results for these ten events, and no major problems have first event where subdominant spherical harmonic modes could been encountered using the IMRPhenomX and IMRPhenomT be clearly identified. Our re-analysis shows excellent agreement families. between the latest generations of non-precessing waveform models, where subdominant harmonics are calibrated to nu- Conclusions need to be drawn considering also the results merical relativity simulations, and broad consistency between from our parallel analysis of selected events in the third observ- different precessing models with subdominant harmonics. In ing run O3. The present paper is complementary to two other particular we find good agreement between our IMRPhenomX publications [59, 80] dedicated to in-depth analyses of two and IMRPhenomT families. GW190521 is the shortest signal events detected in the first half of O3 [77, 81, 82]: GW190412 detected so far (only approximately 0.1 seconds long), and much [80] and GW190521 [59]. 
We have updated our results for more challenging: due to the lack of information in such a short GW190412 to include the precessing time domain model IM- 15

FIG. 10. Comparison of posterior distributions for GW170729 estimated with IMRPhenomXPHM using different distance and mass priors. Here the string PowerLaw denotes our standard power-law distance prior, UnifComovVo denotes a distance prior that is uniform in comoving volume, FlatCompMass runs are reweighted to a prior that is flat in component masses, and UniCompMass denotes the more recent method to directly ensure a flat prior in component masses while sampling, as described in Sec. II C 3. All contours are at 90% credible intervals.

FIG. 11. Comparison of posterior distributions of GW170818 for the mass ratio, the effective spin parameter, the luminosity distance and the inclination using the MSA (blues) and NNLO (pinks) IMRPhenomXPHM versions and IMRPhenomPv2 (orange) results from [1]. 90% credible intervals are indicated. signal, waveforms that correspond to very different sources fit gest that for lower-mass signals there is no simple preference the signal well, and the posterior distribution is multi-modal. for either the IMRPhenomX or IMRPhenomT family: The We can however show good consistency between different sam- IMRPhenomTHM and IMRPhenomTPHM models are typi- pling codes (LALInference [9] and Bilby [11, 27]) and choices cally characterized by lower Bayes factors when compared to of mass priors. We also confirm our expectation that for events their IMRPhenomX counterparts. The reason is likely that where only the merger and ringdown are observable, our de- for lower-mass events the higher accuracy of the phase of the scription of precession in the time domain (IMRPhenomTPHM (2, 2) mode for IMRPhenomX makes the frequency-domain within the IMRPhenomT family) improves significantly over model more accurate than the current time-domain version. our frequency-domain model (IMRPhenomXPHM). The exception is GW170814, which has a total mass of about Meanwhile, the GWTC-1 events studied in this paper sug- 56 solar masses, but does show some support for precession, 16

suggests that a comparison with IMRPhenomTPHM may provide further insight when support for spin precession is found. For higher masses preference tilts toward IMRPhenomT- PHM, as the accuracy of the phasing during inspiral is less relevant. For GW170729, which has a median value of the total mass in the detector frame above 120 solar masses +14.3 (122.7−15.7 M for the IMRPhenomXPHM default run) , both IMRPhenomXPHM and IMRPhenomTPHM provide consistent results, but support for IMRPhenomTPHM is marginally larger than for IMRPhenomXPHM. For GW170729 the better accuracy of the inspiral phasing for IMRPhenomX is much less relevant, while the treatment of precession in the time-domain model is superior. For GW170729 the advantage of IMRPhe- nomTPHM is however still rather small, while for GW190521 we ﬁnd [59] that IMRPhenomTPHM gives signiﬁcantly better results. For GW170729 we also note that IMRPhenomTPHM yields interesting results, see in particular Table III: the posterior

FIG. 12. Comparison of posterior distributions for χp and χeff of shifts toward larger mass ratio and larger primary mass m1, this GW170818, estimated with IMRPhenomTPHM (green), IMRPhe- shifting the primary mass toward the PISN mass gap. nomXP (pink), IMRPhenomXPHM (blue) and IMRPhenomPv2 (or- Finally, the computational efficiency of the IMRPhenomX ange, from [1]). 90% credible intervals are indicated. and IMRPhenomT models has allowed us to also systematically test the influence of variations in priors and sampler settings. We have verified that changes in the mass and distance priors only have very marginal influence on results. Perhaps most interestingly, we have checked that, choosing a prior that em- phasizes large mass ratios, we have not found any multi-modal posteriors, as for GW190521 [59, 72].

FIG. 13. Comparison of the posterior distributions for GW170823 estimated with IMRPhenomXPHM using different distance priors (proportional to the luminosity distance squared or uniform in the comoving volume and source frame time).

see the Table V of Bayes factors. In this case IMRPhenomTPHM is also favored over IMRPhenomXPHM with a Bayes factor of $3.48^{+0.85}_{-0.68}$. Given that the IMRPhenomXAS models are both computationally cheaper and simpler to use (time-domain models require a careful choice of start frequency to ensure consistent start times between different modes and a careful treatment of Fourier transforms when using matched filtering in the frequency domain), use of the IMRPhenomX family is preferred for lower-mass events. The example of GW170814

ACKNOWLEDGEMENTS

This work was supported by European Union FEDER funds, the Ministry of Science, Innovation and Universities and the Spanish Agencia Estatal de Investigación grants FPA2016-76821-P, RED2018-102661-T, RED2018-102573-E, FPA2017-90687-REDC, Vicepresidència i Conselleria d’Innovació, Recerca i Turisme, Conselleria d’Educació, i Universitats del Govern de les Illes Balears i Fons Social Europeu, Generalitat Valenciana (PROMETEO/2019/071), EU COST Actions CA18108, CA17137, CA16214, and CA16104, and the Spanish Ministry of Education, Culture and Sport grants FPU15/03344 and FPU15/01319. M.C. acknowledges funding from the European Union’s Horizon 2020 research and innovation programme, under the Marie Skłodowska-Curie grant agreement No. 751492 and from the Spanish Agencia Estatal de Investigación, grant IJC2019-041385. D.K. is supported by the Spanish Ministerio de Ciencia, Innovación y Universidades (ref. BEAGAL 18/00148) and cofinanced by the Universitat de les Illes Balears. The authors thankfully acknowledge the computer resources at MareNostrum and the technical support provided by Barcelona Supercomputing Center (BSC) through Grants No. AECT-2019-2-0010, AECT-2019-1-0022, from the Red Española de Supercomputación (RES). The authors also acknowledge the computational resources at the cluster CIT provided by LIGO Laboratory and supported by National Science Foundation Grants PHY-0757058 and PHY-0823459. This research has made use of data obtained from the Gravitational Wave Open Science Center [64], a service of LIGO Laboratory, the LIGO Scientific Collaboration and the Virgo Collaboration. LIGO is funded by the U.S. National Science Foundation. Virgo is funded by the French Centre National de Recherche Scientifique (CNRS), the Italian Istituto Nazionale della Fisica Nucleare (INFN) and the Dutch Nikhef, with contributions by Polish and Hungarian institutes.

Appendix A: Convergence test and comparison of sampling methods

In order to set up appropriate sampler settings, we performed a convergence study for all the different events and waveform approximants. That convergence test consists of comparing several sampler settings from low to high resolution in order to get the most converged result at an affordable computational cost. In our case we test two different numbers of live points for the Nested Sampling algorithm, nlive = 512 and 2048, and also the number of autocorrelation times: nact = 10, 30, and for completeness we also test nact = 50 for IMRPhenomXHM and IMRPhenomTHM.

For all the BBH events where we performed the convergence test, we reach the same conclusion: default settings of nlive = 2048 and nact = 30 result in sufficiently accurate posteriors at moderate computational cost. In Fig. 14 we show a comparison for GW170729 using different waveform models and sampler settings. Using nlive = 512, both the histograms and the 2D plots show fluctuations, indicating insufficient convergence. However, when we increase nlive to 2048, the posterior distributions are much smoother. Comparing the number of autocorrelation times when we use a large number of nlive, changing the nact value does not make an important difference.

We can get a more quantitative perspective from Table VIII, where we show the maximum JS divergence values corresponding to the posteriors of Fig. 14. As commented above, the highest differences appear when we change from low to high nlive, but not when we change the nact configuration. We decided to use nlive = 2048 and nact = 30 as the default setting because if we compare with the same run but using nact = 50, we obtain an indistinguishable distribution: the divergences between both posterior distributions are smaller than 0.001 bits.

In addition, we perform further tests for the three lowest mass events, motivated by the results in [28] and [29], which are in tension with our results. We note that the results of [29] have been obtained at half our sampling rate (i.e. only 2048 Hz, according to the data release of [29]), and the results of [28] have been obtained at a sampling rate of 1024 Hz [83, 84]. A reduced sampling rate implies a lower cutoff frequency corresponding to the Nyquist frequency, and will reduce the higher mode content. While the effect on the recovered SNR is very small, we find a significant effect on the posterior distributions of some quantities, as we will discuss below.

We first compare the results from our standard Bilby runs with alternative sampling choices: As previously shown in Fig. 10 for GW170729, we again compare our standard method of reweighting the standard-prior runs to a prior that is flat in component masses, against directly sampling with the alternative flat-in-component-masses prior (using the two Bilby classes UniformInComponentsChirpMass and UniformInComponentsMassRatio, as discussed in Sec. II C 3). Furthermore, we compare with results for the LALInference code [9], which is part of the LALSuite [85] package for GW data analysis, using its implementation of parallel-tempered Markov Chain Monte Carlo (MCMC) sampling. LALInference samples in mass ratio and chirp mass, reweighting to a prior that is flat in component masses as described in [9]. We use essentially standard LALInference settings with eight temperatures, but a large number of 60 independent chains in order to have a broad distribution of chains covering the parameter space. These runs employ the same dataset, PSDs and calibration envelopes as the Bilby runs, discussed in Sec. III B, but we do not employ the distance marginalization used for our Bilby runs.

In Fig. 15 we show a comparison between different runs for GW151226, employing two different sampling rates (our default 4096 Hz and 2048 Hz) and the two different mass priors for Parallel Bilby, the standard LALInference run as described above, and the results reported in [29] (results data for [28] have not been released). We observe that while all our runs are consistent, employing the Bilby prior defined by the UniformInComponentsChirpMass and UniformInComponentsMassRatio classes (labelled "UniCompMass" in the plot) reproduces LALInference results best, as expected. For this event we also observe that differences with respect to the results reported in [29] are broadly consistent with the differences we observe between different sampling rates in our runs.

We also perform runs that aim to reproduce the bimodal results of [28] with LALInference using their sampling rate of 1024 Hz. In Fig. 16 we can observe that, upon reducing our sampling rate by a factor of four, we indeed find a bimodality in the mass ratio distribution, and increased support for smaller mass ratio q (we cannot show direct comparisons since the posterior data of [28] have not been released). In this figure we also plot the results of a LALInference run with the standard 4096 Hz sampling rate but with a prior restricted to q ≤ 0.4, aimed at improving the sampling of more unequal mass ratios. We find that this restricted run shows very good agreement with that using the standard unrestricted prior, enhancing the posterior region corresponding to higher spins, but remaining consistent with the low-q portion of the standard run and not producing an additional peak. We thus conclude that the bimodality found in [28] is likely due to the reduction of the higher-mode content when a low sampling rate is used that will cut off part of the merger–ringdown signal. For a full explanation of the results for GW151226 found by other studies it would also be interesting to compare results using the different PSD estimates employed, but the PSD used in [28] is not publicly available. (The PSD used in [29] can in principle be reproduced from configuration files in their data release.)

For the other two low-mass events, similar consistency is achieved between our different runs. In particular, for GW170608 our runs with a sampling rate of 4096 Hz show excellent agreement, for both mass priors in Parallel Bilby and for LALInference, while for GW151012 small differences are visible between the standard prior Bilby run and the alternative prior and LALInference runs. However, differences between our different runs for this event are much smaller than differences with the results reported in [29], as can be observed in Fig. 17. Despite the small differences, the overall self-consistency of our results for GW151012 and GW170608 suggests that the results presented in [29] cannot be explained by different choices of the prior and sampling rate, and further examination of the convergence of those results and the influence of the choice of PSD estimation would be needed to understand the discrepancies.

Appendix B: Multibanding

As mentioned in Sec. II B, in order to accelerate the evaluation of the waveform model, IMRPhenomXHM and IMRPhenomXPHM implement the multibanding interpolation method described for non-precessing systems in [4], and extended to precession in [5]. The method uses interpolation from an appropriately chosen unequally spaced coarse frequency grid to the equally spaced fine grid used for data analysis. The grid spacing of the coarse grid is controlled by two threshold parameters, which are related to the local interpolation error for the phase and amplitude. The parameter controlling the accuracy of the non-precessing modes is called MB, and the parameter controlling the accuracy of the Euler angles evaluation used to construct the precessing waveform is PMB.

We want to quantify the loss of accuracy and gain in computational speed for runs with a very aggressive multibanding threshold. To do that, we perform a comparison between the default version of IMRPhenomXPHM (MB = $10^{-3}$ and PMB = 0, i.e. no multibanding for the Euler angles) using nlive = 512, 2048 and nact = 10, 30 respectively, with two aggressive multibanding runs (MB = PMB = 0.1) using nlive = 512, 1024 and nact = 10. In Fig. 18 we show this comparison for GW170104, which has the lowest total mass from the medium-mass events group. Results for the other events are contained in our data release [71].

We find, not surprisingly, that for the low resolution runs (nlive = 512 and nact = 10) convergence is poor with and without the aggressive multibanding threshold. However, when we increase the number of live points to 1024 and 2048, the posterior distributions become smoother even using aggressive multibanding, and agree well with the distributions obtained with less aggressive multibanding. Comparing the computational resources of the runs in Table IX, we can observe that with an aggressive multibanding threshold the cost of each likelihood evaluation is reduced by approximately half. Although using a more aggressive multibanding threshold is not as accurate as runs without this acceleration, due to the approximations used, we see that in our case the change of the posterior distributions is very small. Thus, aggressive multibanding is particularly appropriate for quick exploratory runs, e.g. to tune prior bounds for masses or distance.

FIG. 14. GW170729 convergence test comparing posterior distributions for the mass ratio, chirp mass, luminosity distance and inclination using IMRPhenomTHM, IMRPhenomXHM and IMRPhenomXPHM with different sampler settings (nlive = 512, 2048 and nact = 10, 30, 50) and a 90% credible interval.

IMRPhenomTHM
              N512 NA10   N512 NA30       N2048 NA10      N2048 NA30      N2048 NA50
N512 NA10     -           JS_t1 = 0.006   JS_t2 = 0.020   JS_t2 = 0.019   JS_t2 = 0.019
N512 NA30                 -               JS_t2 = 0.023   JS_t2 = 0.021   JS_t1 = 0.024
N2048 NA10                                -               JS_ra = 0.001   JS_ra = 0.001
N2048 NA30                                                -               JS_t1 = 0.001
N2048 NA50                                                                -

IMRPhenomXHM
              N512 NA10   N512 NA30       N2048 NA10      N2048 NA30      N2048 NA50
N512 NA10     -           JS_ra = 0.007   JS_t2 = 0.019   JS_ra = 0.024   JS_ra = 0.025
N512 NA30                 -               JS_t2 = 0.023   JS_t2 = 0.023   JS_t2 = 0.023
N2048 NA10                                -               JS_ra = 0.005   JS_ra = 0.004
N2048 NA30                                                -               JS_ra = 0.001
N2048 NA50                                                                -

IMRPhenomXPHM
              N512 NA10   N512 NA30       N2048 NA10      N2048 NA30
N512 NA10     -           JS_ra = 0.021   JS_ra = 0.039   JS_ra = 0.038
N512 NA30                 -               JS_ra = 0.012   JS_ra = 0.009
N2048 NA10                                -               JS_ra = 0.001
N2048 NA30                                                -

TABLE VIII. Maximum Jensen-Shannon (JS) divergence values between the posterior distributions for GW170729 estimated with IMRPhenomTHM, IMRPhenomXHM and IMRPhenomXPHM using different sampler settings, where N is short for nlive and NA is short for nact.
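As an illustration of how entries like those in Table VIII could be obtained, the following is a minimal sketch that estimates the JS divergence in bits between two sets of one-dimensional posterior samples and reports the parameter with the largest value. It assumes a simple histogram density estimate; the estimator actually used for the table is not specified in this appendix and may differ (for example a kernel density estimate). The function names, the dictionary layout and the bin count are hypothetical choices made for this example.

```python
import numpy as np


def js_divergence_bits(samples_a, samples_b, bins=50):
    """Histogram estimate of the JS divergence (log base 2) of two 1D sample sets."""
    lo = min(np.min(samples_a), np.min(samples_b))
    hi = max(np.max(samples_a), np.max(samples_b))
    edges = np.linspace(lo, hi, bins + 1)  # common binning for both posteriors
    p, _ = np.histogram(samples_a, bins=edges)
    q, _ = np.histogram(samples_b, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # mixture distribution

    def kl_bits(x, y):
        mask = x > 0  # bins with x = 0 contribute nothing; y > 0 wherever x > 0
        return np.sum(x[mask] * np.log2(x[mask] / y[mask]))

    return 0.5 * kl_bits(p, m) + 0.5 * kl_bits(q, m)


def max_js_bits(posterior_a, posterior_b, bins=50):
    """Return (parameter name, JS divergence) for the most discrepant parameter.

    posterior_a / posterior_b: dicts mapping parameter names to 1D sample arrays
    (a hypothetical layout chosen for this sketch).
    """
    common = sorted(set(posterior_a) & set(posterior_b))
    values = {k: js_divergence_bits(posterior_a[k], posterior_b[k], bins) for k in common}
    worst = max(values, key=values.get)
    return worst, values[worst]
```

With base-2 logarithms the divergence is bounded between 0 (identical histograms) and 1 bit (non-overlapping histograms), which gives a scale for reading the tabulated values.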

FIG. 15. Comparison of the recovered posterior distributions for GW151226 for different values of the sampling rate and mass priors in Parallel Bilby, a standard run in LALInference, and the results reported for this event in [29] (denoted as 3-OGC in the legend).
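The reweighting between mass priors compared in this appendix can be sketched as follows. This is a minimal illustration under the assumption that the original samples were drawn with a prior uniform in chirp mass and mass ratio q = m2/m1 ≤ 1; in that case the weight towards a prior flat in component masses is the Jacobian |∂(m1, m2)/∂(Mc, q)| = m1²/Mc. The function names and the rejection-sampling step are hypothetical and are not taken from Bilby or LALInference.

```python
import numpy as np


def component_masses(chirp_mass, q):
    """Component masses (m1 >= m2) from chirp mass and mass ratio q = m2/m1."""
    m1 = chirp_mass * (1.0 + q) ** 0.2 * q ** -0.6
    return m1, q * m1


def flat_in_components_weight(chirp_mass, q):
    """Jacobian |d(m1, m2) / d(Mc, q)| = m1**2 / Mc."""
    m1, _ = component_masses(chirp_mass, q)
    return m1 ** 2 / chirp_mass


def reweight(samples_mc, samples_q, rng):
    """Rejection-sample posterior draws so they correspond to the flat-in-(m1, m2) prior.

    Assumes the input samples were obtained with a prior uniform in (Mc, q).
    """
    w = flat_in_components_weight(samples_mc, samples_q)
    keep = rng.uniform(0.0, w.max(), size=w.size) < w
    return samples_mc[keep], samples_q[keep]
```

Rejection sampling keeps the output as equally weighted samples, at the cost of discarding draws; carrying the weights through to weighted histograms would be an alternative design.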

FIG. 16. Comparison of the recovered posterior distributions for GW151226 in LALInference for two different values of the sampling rate (1024 and 4096 Hz) and two different mass-ratio prior choices (the default one and one restricted to q ≤ 0.4). The mass-ratio histograms have been normalized to equal height for ease of visibility.
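To make the link between sampling rate and higher-mode content concrete, the short sketch below computes the Nyquist frequency for the sampling rates discussed here and, as a rough guide, the (2,2)-mode frequency above which a mode with azimuthal number m is no longer represented. It assumes the inspiral scaling in which a mode with azimuthal number m oscillates at about m/2 times the (2,2) frequency; this is only an approximation, merger and ringdown frequencies do not follow it exactly, and the function names are hypothetical.

```python
def nyquist_frequency(sampling_rate_hz):
    """Highest frequency representable at the given sampling rate."""
    return sampling_rate_hz / 2.0


def max_f22_resolved(sampling_rate_hz, m):
    """(2,2)-mode frequency at which an m-mode reaches Nyquist.

    Assumes f_m ~ (m / 2) * f_22, an inspiral approximation.
    """
    return 2.0 * nyquist_frequency(sampling_rate_hz) / m


for rate in (4096, 2048, 1024):
    cutoffs = {m: max_f22_resolved(rate, m) for m in (2, 3, 4)}
    print(rate, nyquist_frequency(rate), cutoffs)
```

Under this approximation, lowering the sampling rate from 4096 Hz to 1024 Hz removes the m = 4 content once the (2,2) mode passes 256 Hz, whereas the (2,2) mode itself is retained up to 512 Hz.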

FIG. 17. Comparison of the recovered posterior distributions for GW151012 for different values of the sampling rate and mass priors in Parallel Bilby, a standard run in LALInference, and the results reported for this event in [29] (denoted as 3-OGC in the legend).

nlive   nact   MB          PMB         CPU h   L. eval.               Cost/L. eval. [ms]
2048    30     $10^{-3}$   0           1765    $2.24 \times 10^{8}$   28.30
512     10     $10^{-3}$   0           154     $1.67 \times 10^{7}$   33.11
1024    10     $10^{-1}$   $10^{-1}$   157     $3.46 \times 10^{7}$   16.34
512     10     $10^{-1}$   $10^{-1}$   77      $1.71 \times 10^{7}$   16.32

TABLE IX. Computational cost comparison between runs on GW170104 using IMRPhenomXPHM with different multibanding thresholds and sampler settings, where MB and PMB correspond to the multibanding thresholds of the aligned-spin modes and the Euler angles evaluation, respectively. The number of likelihood evaluations and the mean cost of each evaluation in ms are also shown.
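As a consistency check on Table IX, the sketch below recomputes the mean cost per likelihood evaluation under the assumption that it is simply the total CPU time divided by the number of likelihood evaluations. That assumption is ours, not stated in the table caption, but it reproduces the tabulated values to better than 1%, with the residual differences compatible with rounding of the CPU hours and evaluation counts.

```python
# Rows of Table IX:
# (nlive, nact, MB, PMB, CPU hours, likelihood evaluations, tabulated cost in ms)
RUNS = [
    (2048, 30, 1e-3, 0.0, 1765, 2.24e8, 28.30),
    (512, 10, 1e-3, 0.0, 154, 1.67e7, 33.11),
    (1024, 10, 1e-1, 1e-1, 157, 3.46e7, 16.34),
    (512, 10, 1e-1, 1e-1, 77, 1.71e7, 16.32),
]


def cost_per_evaluation_ms(cpu_hours, n_evals):
    """Mean cost of one likelihood evaluation in milliseconds (assumed definition)."""
    return cpu_hours * 3600.0 * 1000.0 / n_evals


for nlive, nact, mb, pmb, cpu_h, n_evals, tabulated in RUNS:
    recomputed = cost_per_evaluation_ms(cpu_h, n_evals)
    print(f"nlive={nlive} nact={nact} MB={mb} PMB={pmb}: "
          f"{recomputed:.2f} ms (table: {tabulated:.2f} ms)")
```

The ratio of aggressive to default cost per evaluation comes out between roughly 0.5 and 0.6 depending on which rows are compared, in line with the statement that the cost is reduced by approximately half.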

FIG. 18. Comparison of the GW170104 posterior distributions for the total mass, mass ratio and effective spin obtained with IMRPhenomXPHM using different multibanding thresholds and sampler settings nlive=N, and nact=NA, and multibanding thresholds XHMMB for the aligned-spin modes and XPMB for the Euler angles.
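The idea behind multibanding, interpolating from an unequally spaced coarse frequency grid to the uniform fine grid with a threshold that controls the local interpolation error, can be illustrated with a toy example. The sketch below is not the algorithm of [4, 5]: it uses a hypothetical power-law phase, a hypothetical spacing rule derived from the standard linear-interpolation error bound (Δf² |φ''| / 8 ≤ threshold), and treats only the phase, whereas the actual method also controls the amplitude error.

```python
import numpy as np


def toy_phase(f, a=1.0e4):
    """Toy inspiral-like phase phi(f) = a * f**(-5/3); illustrative only."""
    return a * f ** (-5.0 / 3.0)


def toy_phase_second_derivative(f, a=1.0e4):
    """Second derivative of the toy phase, (40/9) * a * f**(-11/3)."""
    return a * (40.0 / 9.0) * f ** (-11.0 / 3.0)


def coarse_grid(f_min, f_max, threshold):
    """Non-uniform grid whose spacing keeps the linear-interpolation error
    of the toy phase below `threshold` (hypothetical spacing rule)."""
    grid = [f_min]
    while grid[-1] < f_max:
        # |phi''| decreases with f, so its value at the left edge bounds the interval.
        df = np.sqrt(8.0 * threshold / toy_phase_second_derivative(grid[-1]))
        grid.append(grid[-1] + df)
    return np.array(grid)


f_min, f_max, delta_f = 20.0, 1024.0, 0.125
fine = np.arange(f_min, f_max + delta_f, delta_f)  # uniform data-analysis grid

for threshold in (1e-3, 1e-1):
    coarse = coarse_grid(f_min, f_max, threshold)
    interpolated = np.interp(fine, coarse, toy_phase(coarse))
    max_error = np.max(np.abs(interpolated - toy_phase(fine)))
    print(threshold, coarse.size, fine.size, max_error)
```

In this toy setting a larger threshold yields a much smaller coarse grid, so fewer expensive model evaluations are needed, while the phase error on the fine grid stays below the chosen threshold. This mirrors the accuracy versus cost trade-off discussed for the MB and PMB parameters.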

[1] B. P. Abbott et al. (LIGO Scientific, Virgo), Phys. Rev. X 9, 031040 (2019), arXiv:1811.12907 [astro-ph.HE].
[2] G. Pratten, S. Husa, C. Garcia-Quiros, M. Colleoni, A. Ramos-Buades, H. Estelles, and R. Jaume, arXiv e-prints (2020), arXiv:2001.11412 [gr-qc].
[3] C. García-Quirós, M. Colleoni, S. Husa, H. Estellés, G. Pratten, A. Ramos-Buades, M. Mateu-Lucena, and R. Jaume, Phys. Rev. D 102, 064002 (2020), arXiv:2001.10914 [gr-qc].
[4] C. García-Quirós, S. Husa, M. Mateu-Lucena, and A. Borchers, Class. Quant. Grav. 38, 015006 (2021), arXiv:2001.10897 [gr-qc].
[5] G. Pratten et al., arXiv e-prints (2020), arXiv:2004.06503 [gr-qc].
[6] H. Estellés, A. Ramos-Buades, S. Husa, C. García-Quirós, M. Colleoni, L. Haegel, and R. Jaume, arXiv e-prints (2020), arXiv:2004.08302 [gr-qc].
[7] H. Estellés, S. Husa, M. Colleoni, D. Keitel, M. Mateu-Lucena, C. García-Quirós, A. Ramos-Buades, and A. Borchers, arXiv e-prints (2020), arXiv:2012.11923 [gr-qc].
[8] H. Estellés, M. Colleoni, C. García-Quirós, S. Husa, D. Keitel, M. Mateu-Lucena, M. L. Planas, and A. Ramos-Buades, arXiv e-prints (2021), arXiv:2105.05872 [gr-qc].
[9] J. Veitch et al., Phys. Rev. D 91, 042003 (2015), arXiv:1409.7215 [gr-qc].
[10] I. M. Romero-Shaw et al., Mon. Not. Roy. Astron. Soc. 499, 3295 (2020), arXiv:2006.00714 [astro-ph.IM].
[11] G. Ashton et al., Astrophys. J. Suppl. 241, 27 (2019), arXiv:1811.02042 [astro-ph.IM].
[12] J. S. Speagle, Mon. Not. Roy. Astron. Soc. 493, 3132–3158 (2020).
[13] K. Chatziioannou et al., Phys. Rev. D 100, 104015 (2019), arXiv:1903.06742 [gr-qc].
[14] C. Pankow, P. Brady, E. Ochsner, and R. O’Shaughnessy, Phys. Rev. D 92, 023002 (2015), arXiv:1502.04370 [gr-qc].
[15] J. Lange, R. O’Shaughnessy, and M. Rizzo, arXiv e-prints (2018), arXiv:1805.10457 [gr-qc].
[16] J. Blackman, S. E. Field, M. A. Scheel, C. R. Galley, C. D. Ott, M. Boyle, L. E. Kidder, H. P. Pfeiffer, and B. Szilágyi, Phys. Rev. D 96, 024058 (2017), arXiv:1705.07089 [gr-qc].
[17] B. P. Abbott et al. (LIGO Scientific, Virgo), Supplementary parameter estimation sample release for GWTC-1: Comparisons to NR, Tech. Rep. LIGO-P1900124 (LIGO Project, 2019).
[18] J. Healy, C. O. Lousto, J. Lange, and R. O’Shaughnessy, Phys. Rev. D 102, 124053 (2020), arXiv:2010.00108 [gr-qc].
[19] E. Payne, C. Talbot, and E. Thrane, Phys. Rev. D 100, 123017 (2019), arXiv:1905.05477 [astro-ph.IM].
[20] S. Husa, S. Khan, M. Hannam, M. Pürrer, F. Ohme, X. Jiménez Forteza, and A. Bohé, Phys. Rev. D 93, 044006 (2016), arXiv:1508.07250 [gr-qc].
[21] S. Khan, S. Husa, M. Hannam, F. Ohme, M. Pürrer, X. Jiménez Forteza, and A. Bohé, Phys. Rev. D 93, 044007 (2016), arXiv:1508.07253 [gr-qc].
[22] M. Hannam, P. Schmidt, A. Bohé, L. Haegel, S. Husa, et al., Phys. Rev. Lett. 113, 151101 (2014), arXiv:1308.3271 [gr-qc].
[23] A. Bohé, M. Hannam, S. Husa, F. Ohme, M. Puerrer, and P. Schmidt, PhenomPv2 - Technical Notes for LAL Implementation, Tech. Rep. LIGO-T1500602 (LIGO Project, 2016).
[24] L. London, S. Khan, E. Fauchon-Jones, C. García, M. Hannam, S. Husa, X. Jiménez-Forteza, C. Kalaghatgi, F. Ohme, and F. Pannarale, Phys. Rev. Lett. 120, 161102 (2018), arXiv:1708.00404 [gr-qc].
[25] S. Khan, K. Chatziioannou, M. Hannam, and F. Ohme, Phys. Rev. D 100, 024059 (2019), arXiv:1809.10113 [gr-qc].
[26] S. Khan, F. Ohme, K. Chatziioannou, and M. Hannam, Phys. Rev. D 101, 024056 (2020), arXiv:1911.06050 [gr-qc].
[27] R. J. E. Smith, G. Ashton, A. Vajpeyi, and C. Talbot, Monthly Notices of the Royal Astronomical Society 498, 4492–4502 (2020).
[28] H. S. Chia, S. Olsen, J. Roulet, L. Dai, T. Venumadhav, B. Zackay, and M. Zaldarriaga, arXiv e-prints (2021), arXiv:2105.06486 [astro-ph.HE].
[29] A. H. Nitz, C. D. Capano, S. Kumar, Y.-F. Wang, S. Kastha, M. Schäfer, R. Dhurkunde, and M. Cabero, arXiv e-prints (2021), arXiv:2105.09151 [astro-ph.HE].
[30] A. Taracchini et al., Phys. Rev. D 89, 061502 (2014), arXiv:1311.2544 [gr-qc].
[31] S. Babak, A. Taracchini, and A. Buonanno, Phys. Rev. D 95, 024010 (2017), arXiv:1607.05661 [gr-qc].
[32] A. Bohé et al., Phys. Rev. D 95, 044028 (2017), arXiv:1611.03703 [gr-qc].
[33] R. Cotesta, A. Buonanno, A. Bohé, A. Taracchini, I. Hinder, and S. Ossokine, Phys. Rev. D 98, 084028 (2018), arXiv:1803.10701 [gr-qc].
[34] R. Cotesta, S. Marsat, and M. Pürrer, Phys. Rev. D 101, 124040 (2020), arXiv:2003.12079 [gr-qc].
[35] S. Ossokine et al., Phys. Rev. D 102, 044055 (2020), arXiv:2004.09442 [gr-qc].
[36] Y. Pan, A. Buonanno, A. Taracchini, L. E. Kidder, A. H. Mroué, H. P. Pfeiffer, M. A. Scheel, and B. Szilágyi, Phys. Rev. D 89, 084006 (2014), arXiv:1307.6232 [gr-qc].
[37] P. A. R. Ade, N. Aghanim, M. Arnaud, M. Ashdown, J. Aumont, C. Baccigalupi, A. J. Banday, R. B. Barreiro, J. G. Bartlett, et al. (Planck Collaboration), A&A 594, A13 (2016), arXiv:1502.01589 [astro-ph.CO].
[38] P. Schmidt, F. Ohme, and M. Hannam, Phys. Rev. D 91, 024043 (2015), arXiv:1408.1810 [gr-qc].
[39] S. Akcay, S. Bernuzzi, F. Messina, A. Nagar, N. Ortiz, and P. Rettegno, Phys. Rev. D 99, 044051 (2019), arXiv:1812.02744 [gr-qc].
[40] T. Damour and A. Nagar, Lect. Notes Phys. 905, 273 (2016).
[41] T. Damour, P. Jaranowski, and G. Schäfer, Phys. Rev. D 91, 084024 (2015), arXiv:1502.07245 [gr-qc].
[42] T. Damour, A. Nagar, and S. Bernuzzi, Phys. Rev. D 87, 084035 (2013), arXiv:1212.4357 [gr-qc].
[43] A. Nagar, F. Messina, P. Rettegno, D. Bini, T. Damour, A. Geralico, S. Akcay, and S. Bernuzzi, Phys. Rev. D 99, 044007 (2019), arXiv:1812.07923 [gr-qc].
[44] A. Nagar et al., Phys. Rev. D 98, 104052 (2018), arXiv:1806.01772 [gr-qc].
[45] A. Nagar and P. Rettegno, Phys. Rev. D 99, 021501 (2019), arXiv:1805.03891 [gr-qc].
[46] A. Nagar, G. Riemenschneider, and G. Pratten, Phys. Rev. D 96, 084045 (2017), arXiv:1703.06814 [gr-qc].
[47] W. Del Pozzo and A. Nagar, Phys. Rev. D 95, 124034 (2017), arXiv:1606.03952 [gr-qc].
[48] A. Nagar, T. Damour, C. Reisswig, and D. Pollney, Phys. Rev. D 93, 044046 (2016), arXiv:1506.08457 [gr-qc].
[49] M. Pürrer, Phys. Rev. D 93, 064041 (2016), arXiv:1512.02248 [gr-qc].
[50] A. Taracchini, Y. Pan, A. Buonanno, E. Barausse, M. Boyle, T. Chu, G. Lovelace, H. P. Pfeiffer, and M. A. Scheel, Phys. Rev. D 86, 024011 (2012), arXiv:1202.0790 [gr-qc].
[51] T. Damour, Phys. Rev. D 64, 124013 (2001).
[52] P. Schmidt, M. Hannam, S. Husa, and P. Ajith, Phys. Rev. D 84, 024046 (2011), arXiv:1012.2879 [gr-qc].
[53] P. Schmidt, M. Hannam, and S. Husa, Phys. Rev. D 86, 104063 (2012), arXiv:1207.3088 [gr-qc].
[54] L. Blanchet, A. Buonanno, and G. Faye, Phys. Rev. D 84, 064041 (2011), arXiv:1104.5659 [gr-qc].
[55] S. Marsat, L. Blanchet, A. Bohe, and G. Faye (2013), arXiv:1312.5375 [gr-qc].
[56] K. Chatziioannou, A. Klein, N. Yunes, and N. Cornish, Phys. Rev. D 88, 063011 (2013), arXiv:1307.4418 [gr-qc].
[57] R. O’Shaughnessy, L. London, J. Healy, and D. Shoemaker, Phys. Rev. D 87, 044038 (2013).
[58] S. Marsat and J. G. Baker, arXiv e-prints (2018), arXiv:1806.10734 [gr-qc].
[59] H. Estellés, S. Husa, M. Colleoni, M. Mateu-Lucena, M. L. Planas, C. García-Quirós, D. Keitel, A. Ramos-Buades, A. Kumar Mehta, A. Buonanno, and S. Ossokine, arXiv e-prints (2021), arXiv:2105.06360 [gr-qc].
[60] L. Blanchet, Living Reviews in Relativity 9, 4 (2006).
[61] A. Buonanno, B. Iyer, E. Ochsner, Y. Pan, and B. S. Sathyaprakash, Phys. Rev. D 80, 084043 (2009), arXiv:0907.0700 [gr-qc].
[62] S. Vinciguerra, J. Veitch, and I. Mandel, Class. Quant. Grav. 34, 115006 (2017), arXiv:1703.02062 [gr-qc].
[63] M. Vallisneri, J. Kanner, R. Williams, A. Weinstein, and B. Stephens, Proceedings, 10th International LISA Symposium: Gainesville, Florida, USA, May 18-23, 2014, J. Phys. Conf. Ser. 610, 012021 (2015), arXiv:1410.4839 [gr-qc].
[64] LIGO Scientific Collaboration, Virgo Collaboration, “Gravitational Wave Open Science Center,” https://www.gw-openscience.org (2019).
[65] LIGO Scientific Collaboration, Virgo Collaboration, “Power Spectral Densities (PSD) release for GWTC-1,” (2019), https://dcc.ligo.org/LIGO-P1900011/public.
[66] LIGO Scientific Collaboration, Virgo Collaboration, “Calibration uncertainty envelope release for GWTC-1,” (2019), https://dcc.ligo.org/LIGO-P1900040/public.
[67] N. J. Cornish and T. B. Littenberg, Class. Quant. Grav 32 (2015), 10.1088/0264-9381/32/13/135012, arXiv:1410.3835.
[68] LIGO Scientific Collaboration, Virgo Collaboration, “BayesWave software,” (2021), https://git.ligo.org/lscsoft/bayeswave.
[69] LIGO Scientific Collaboration, Virgo Collaboration, “O2 C02 Calibration Uncertainty Results Review/Update/Summary,” (2018), https://dcc.ligo.org/LIGO-G1800319/public.
[70] E. Thrane and C. Talbot, Publications of the Astronomical Society of Australia 36, e010 (2019).
[71] M. Mateu-Lucena, S. Husa, M. Colleoni, H. Estellés, C. García-Quirós, D. Keitel, M. L. Planas, and A. Ramos-Buades, “Data release for the paper "adding harmonics to the interpretation of the black hole mergers of gwtc-1",” (2021).
[72] A. H. Nitz and C. D. Capano, Astrophys. J. Lett. 907, L9 (2021), arXiv:2010.12558 [astro-ph.HE].
[73] A. B. Yoo, M. A. Jette, and M. Grondona, in Job Scheduling Strategies for Parallel Processing, edited by D. Feitelson, L. Rudolph, and U. Schwiegelshohn (Springer Berlin Heidelberg, Berlin, Heidelberg, 2003) pp. 44–60.
[74] C. Hoy and V. Raymond, arXiv e-prints (2020), arXiv:2006.06639 [astro-ph.IM].
[75] B. P. Abbott et al. (LIGO Scientific, Virgo), Phys. Rev. Lett. 119, 141101 (2017), arXiv:1709.09660 [gr-qc].
[76] B. P. Abbott et al. (LIGO Scientific, Virgo), Phys. Rev. Lett. 116, 061102 (2016), arXiv:1602.03837 [gr-qc].
[77] R. Abbott et al. (LIGO Scientific, Virgo), arXiv e-prints (2020), arXiv:2010.14527 [gr-qc].
[78] S. E. Woosley, Astrophys. J. 836, 244 (2017), arXiv:1608.08939 [astro-ph.HE].
[79] S. E. Woosley and A. Heger, arXiv e-prints (2021), arXiv:2103.07933 [astro-ph.SR].
[80] M. Colleoni, M. Mateu-Lucena, H. Estellés, C. García-Quirós, D. Keitel, G. Pratten, A. Ramos-Buades, and S. Husa, Phys. Rev. D 103, 024029 (2021), arXiv:2010.05830 [gr-qc].
[81] R. Abbott et al. (LIGO Scientific, Virgo), Phys. Rev. D 102, 043015 (2020), arXiv:2004.08342 [astro-ph.HE].
[82] R. Abbott et al. (LIGO Scientific, Virgo), Phys. Rev. Lett. 125, 101102 (2020), arXiv:2009.01075 [gr-qc].
[83] H. S. Chia, Private Communication (2021).
[84] T. Venumadhav, B. Zackay, J. Roulet, L. Dai, and M. Zaldarriaga, Phys. Rev. D 100, 023011 (2019), arXiv:1902.10341 [astro-ph.IM].
[85] LIGO Scientific Collaboration, “LIGO Algorithm Library - LALSuite,” free software (GPL), https://doi.org/10.7935/GT1W-FZ16 (2020).