
Inference with finite time series: Observing the gravitational Universe through windows

Colm Talbot,1,2,3,a Eric Thrane,4,5 Sylvia Biscoveanu,2 and Rory Smith4,5

1 LIGO Laboratory, California Institute of Technology, Pasadena, CA 91125, USA
2 LIGO Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
3 Kavli Institute for Astrophysics and Space Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
4 School of Physics and Astronomy, Monash University, VIC 3800, Australia
5 OzGrav: The ARC Centre of Excellence for Gravitational-Wave Discovery, Clayton, VIC 3800, Australia

Time series analysis is ubiquitous in many fields of science including gravitational-wave astronomy, where strain time series are analyzed to infer the nature of gravitational-wave sources, e.g., black holes and neutron stars. It is common in gravitational-wave transient studies to apply a tapered window to reduce the effects of spectral artifacts from the sharp edges of data segments. We show that the conventional analysis of tapered data fails to take into account covariance between frequency bins, which arises for all finite time series, no matter the choice of window function. We discuss the origin of this covariance and derive a framework that models the correlation induced by the window function. We demonstrate this solution using both simulated Gaussian noise and real Advanced LIGO/Advanced Virgo data. We show that the effect of these correlations is similar in scale to widely studied systematic errors, e.g., uncertainty in detector calibration and power spectral density estimation.

I. INTRODUCTION

Time-series analysis underpins recent advances in gravitational-wave astronomy. The vast majority of gravitational-wave data analysis relies on windowing, a procedure that multiplies the time-domain data segment by a window function that tapers off at the beginning and end of the segment. Analysts apply tapered windows to mitigate two effects: (1) spectral artifacts arising from the sharp edges of the data segment (Gibbs phenomena) and (2) correlations between neighboring frequency bins. While correlations between neighboring frequency bins can be reduced, they are never eliminated.

Choosing a suitable window requires balancing various considerations including the spectral leakage from instrumental lines and the low-frequency "seismic wall," effectiveness in mitigating the Gibbs phenomenon, and the loss of signal. For a systematic study of the properties of different windows, we refer the reader to, e.g., [1,2] for theoretical introductions, and [3] for a specific discussion in the context of gravitational-wave data analysis. Once the data are windowed, they are typically analyzed in the frequency domain where the noise is described by a power spectral density (PSD), and it is assumed that each frequency bin is statistically independent. However, this assumption is not true for a finite stretch of a longer noise process.

The assumptions underpinning the Whittle approximation have been thoroughly studied and many refinements have been proposed (e.g., [4–10]). In this paper, we derive from first principles a formalism which accounts for the correlations between neighboring frequencies introduced by the window function applied to obtain finite time series. We show how correlations between frequency bins arise from the fact that quasi-stationary Gaussian noise processes are fundamentally described in the frequency domain by continuous functions, which imply infinite-duration time series. We derive a simple expression for the "finite-duration" covariance matrix, which encodes the correlations naturally present in all finite time series, and identify our result as a specific basis for a Karhunen-Loève transformation (KLT) (see, e.g., [11]). We show that there are practical applications where the current conventional approach of windowing data incurs systematic errors, which, though small, produce invalid inferences when data are combined in large sets or when we analyze gravitational-wave events with high signal-to-noise ratio (SNR).

The remainder of this paper is organized as follows. In Section II, we present the formalism underlying our framework. We derive the finite-duration covariance matrix for the analysis of finite time series. In Section III, we perform a demonstration, applying our method to the binary black hole merger events GW150914 [12], GW170814 [13], and GW190521 [14], and contrast with results neglecting covariances. We show how the current conventional windowing procedure can lead to faulty inferences when many gravitational-wave measurements are combined. While our demonstration uses data from gravitational-wave astronomy, the framework we put forward is broadly applicable to all time-domain analysis. We show that the problem is fixed by using the finite-duration covariance matrix. We provide closing thoughts in Section IV.

a colm.talbot@.org

arXiv:2106.13785v2 [astro-ph.IM] 29 Sep 2021

II. FORMALISM

In this section, we derive a likelihood that enables us to analyze time-series data characterized by stationary, Gaussian noise in a way that correctly takes into account covariance between neighboring frequency bins that arises for all finite time series.

A. Basic notation

We consider time-series data $d(t)$ consisting of signal $s(t)$ and noise $n(t)$

$$d(t) = s(t) + n(t). \tag{1}$$

In gravitational-wave observatories like LIGO [15] and Virgo [16], $n(t)$ is a time series of dimensionless strain (change in length per unit length). Transient gravitational-wave signals from merging binaries are characterized by comparing the data to gravitational waveform templates $h(t)$.

B. Continuous, infinite-duration noise

To start, we focus on noise in the absence of signals. The noise can be expressed in the frequency domain as

$$\tilde{n}(f) = \int_{-\infty}^{\infty} dt\, e^{-2\pi i f t}\, n(t). \tag{2}$$

The noise is best described as continuous because it is defined for an arbitrary choice of frequency: with a sufficiently long measurement, it is possible in principle to achieve sufficient resolution to measure $\tilde{n}(f)$ for any value of $f$. If we assume the noise is Gaussian, the likelihood of observing a specific noise realization is characterized by a covariance matrix

$$\mathcal{C}_{\mu\nu} = \frac{1}{2} \left\langle \tilde{n}(f_\mu)\, \tilde{n}^*(f_\nu) \right\rangle \tag{3}$$

the diagonal of which is equal to the noise PSD $\mathcal{S}$

$$\mathcal{S}_\mu = \mathrm{diag}\left(\mathcal{C}_{\mu\nu}\right). \tag{4}$$

We refer to $\mathcal{C}_{\mu\nu}$ as the "infinite-duration" covariance matrix. It is defined continuously for arbitrary values of $f_\mu$ and $f_\nu$ and, in the time domain, it is defined for all times: $(-\infty, +\infty)$. Throughout, repeated indices are summed over unless otherwise specified. In the next subsection, we contrast $\mathcal{C}_{\mu\nu}$ (calligraphic script and greek indices) with the finite-duration covariance matrix, denoted $C_{ij}$ (no calligraphic script and roman indices), which is defined only for discrete frequency bins $f_i$ and $f_j$ (or equivalently, for a finite duration). If we further assume that the noise is stationary (the PSD does not vary in time), $\mathcal{C}_{\mu\nu}$ is diagonal.

We note that, even if the true covariance matrix is diagonal, a naive empirical estimate of this quantity necessarily has some statistical uncertainty and will not generically be diagonal, even for a stationary Gaussian process. We emphasize that in this section we seek to derive expressions for the true noise covariance matrix assuming $\mathcal{C}_{\mu\nu}$ is known, and neglect the impact of empirical estimates. In Appendix B, we describe how we estimate $\mathcal{C}_{\mu\nu}$ in practice.

C. Non-continuous, finite-duration noise

In practice, we only consider finite stretches of data. In this subsection, we derive the properties of finite stretches of continuous noise. Let us consider data measured with sampling rate $f_s$ over data segment duration $T$. There are

$$N = f_s T \tag{5}$$

independent measurements. We assume that the noise has no content above half the sampling rate and so we can probe every frequency without aliasing. In practice, applying an aggressive low-pass filter removes this high-frequency content.

These data can be represented either in the time domain as a real $N$-component time series with spacing $1/f_s$ or in the frequency domain as a complex frequency series $\tilde{d}_i$ with $-f_s/2 \leq f \leq f_s/2$ with spacing $1/T$, where the endpoints and zero-frequency component are required to be real. These two domains are related via the discrete Fourier transform [17]

$$\tilde{d}'_k = \frac{1}{f_s} \sum_{j=0}^{N-1} d'_j\, e^{-2\pi i jk/N}. \tag{6}$$

The frequency-domain covariance matrix for finite-duration, non-continuous noise is:

$$C_{ij} = \frac{1}{2} \left\langle \tilde{d}'_i\, \tilde{d}'^{*}_j \right\rangle. \tag{7}$$

Here, the angled brackets denote ensemble averages over noise realizations. The widely-used Whittle approximation assumes that data at each of the analyzed frequencies are independent, i.e., $C_{ij}$ is a diagonal matrix. This is generally a good approximation. However, as we show in this paper, the assumption of independence is not strictly valid when analyzing a finite stretch of data, especially when using a tapered window.

We begin by defining our window function $w$, which describes how we measure some segment of noise from what is, in theory, an infinite-duration noise process:

\begin{align}
\tilde{d}'_k &= \frac{1}{f_s} \sum_{\psi=-\infty}^{\infty} d_\psi w_\psi\, e^{-2\pi i \psi k/N} \tag{8} \\
&= \frac{1}{f_s} \sum_{j=0}^{N-1} d_j w_j\, e^{-2\pi i jk/N} \tag{9} \\
&= (\tilde{d} * \tilde{w})_k. \tag{10}
\end{align}

Here, $w_j$ is a time-domain window function and the frequency-domain noise is now the convolution of the original frequency-domain noise with the Fourier transformed window function. The prime denotes quantities associated with the windowed data.

We stress that this window function is always present in gravitational-wave data analysis problems and is defined for all times, not just the analysis segment. It is often ignored when it is a top hat function, i.e.,

$$w_\psi = \begin{cases} 1 & 0 \leq \psi < N \\ 0 & \text{else} \end{cases}. \tag{11}$$

The most commonly used window function in parameter estimation for compact binary coalescences is the Tukey window

$$w_\psi(\alpha) = \begin{cases}
\frac{1}{2}\left[1 - \cos\left(\frac{2\pi\psi}{\alpha N}\right)\right] & 0 < \psi < \frac{\alpha N}{2} \\
1 & \frac{\alpha N}{2} \leq \psi \leq N - \frac{\alpha N}{2} \\
\frac{1}{2}\left[1 - \cos\left(\frac{2\pi(N-\psi)}{\alpha N}\right)\right] & N - \frac{\alpha N}{2} < \psi < N \\
0 & \text{else}
\end{cases}. \tag{12}$$

Common limiting cases of the Tukey window are the rectangular window ($\alpha = 0$) and the Hann window ($\alpha = 1$). Throughout this paper, we use Tukey windows with $\alpha = 0.1$ unless otherwise specified, although the formalism described here holds for arbitrary window functions.

[Figure 1: two panels, (a) and (b), each showing frequency [Hz] against frequency [Hz] over 480–520 Hz, with a color bar for $\log_{10}|C_{ij}|$.]

FIG. 1. The estimated (top) and analytic (bottom) covariance matrix for simulated noise using a noise power spectral density estimated from LIGO Livingston data around the time of the binary black hole merger GW170814. The color bar indicates $\log_{10}$ power spectral density in units of Hz$^{-1}$. The estimate is obtained using approximately three months of simulated data. The off-diagonal behavior agrees well with the analytic expression. The window used is a Tukey window with $\alpha = 0.1$.

Since convolution is a linear operation, we can express the windowed frequency-domain data using standard linear algebra notation

$$\tilde{d}'_k = \tilde{W}_{k\mu} \tilde{d}_\mu, \tag{13}$$

where repeated indices denote summation. Here, $\tilde{W}_{k\mu}$ is a non-square subset of the circulant matrix

$$\tilde{W}_{\mu\nu} = \tilde{w}_{\mu-\nu} \tag{14}$$

that projects infinite-duration noise with frequency resolution $\delta f \rightarrow 0$ to finite-duration data with frequency resolution $1/T$. Here, $\tilde{w}$ is the discrete Fourier transform of the time-domain window function.

We can now write the covariance matrix for a finite-duration data stream with an arbitrary window function in terms of the frequency-domain covariance matrix of the infinite-duration process and the window function using Eq. 7 and Eq. 13

\begin{align}
C_{ij} &= \frac{1}{2} \left\langle \tilde{W}_{i\mu} \tilde{W}^*_{j\nu} \tilde{d}_\mu \tilde{d}^*_\nu \right\rangle \tag{15} \\
&= \frac{1}{2} \tilde{W}_{i\mu} \tilde{W}^*_{j\nu} \left\langle \tilde{d}_\mu \tilde{d}^*_\nu \right\rangle \nonumber \\
&= \tilde{W}_{i\mu} \tilde{W}^*_{j\nu} \mathcal{C}_{\mu\nu}. \tag{16}
\end{align}

If the underlying data are Gaussian and stationary, $\mathcal{C}_{\mu\nu}$ is diagonal and the finite-duration covariance matrix only depends on the window function and the infinite-duration PSD.
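The construction in Eqs. 12–16 can be sketched numerically for a stationary process. The snippet below is a minimal illustration, not the implementation used in this paper: the function names are hypothetical, the infinite-duration PSD is stood in for by a two-sided PSD on a finer (oversampled) frequency grid, and the overall normalization is an assumption chosen so that a rectangular window with a white PSD returns the PSD on the diagonal. It builds a dense matrix, so it is only practical for short segments.

```python
import numpy as np


def tukey_window(n_samples, alpha):
    """Tukey window of Eq. 12; alpha=0 is rectangular, alpha=1 is Hann."""
    psi = np.arange(n_samples)
    window = np.ones(n_samples)
    if alpha <= 0:
        return window
    rise = psi < alpha * n_samples / 2
    fall = psi > n_samples - alpha * n_samples / 2
    window[rise] = 0.5 * (1 - np.cos(2 * np.pi * psi[rise] / (alpha * n_samples)))
    window[fall] = 0.5 * (
        1 - np.cos(2 * np.pi * (n_samples - psi[fall]) / (alpha * n_samples))
    )
    return window


def finite_duration_covariance(window, psd_fine, oversample):
    """Eq. 16 for stationary noise: C_ij = sum_mu W_imu conj(W_jmu) S_mu.

    psd_fine is a two-sided PSD on the np.fft.fftfreq grid of length
    len(window) * oversample (a stand-in for the infinite-duration PSD).
    """
    n_samples = len(window)
    n_fine = n_samples * oversample
    psd_fine = np.asarray(psd_fine, dtype=float)
    if psd_fine.shape != (n_fine,):
        raise ValueError("psd_fine must have len(window) * oversample entries")
    # Zero-padding the window evaluates its Fourier transform on the fine grid.
    w_tilde = np.fft.fft(window, n=n_fine)
    coarse = oversample * np.arange(n_samples)  # fine-grid index of each 1/T bin
    fine = np.arange(n_fine)
    # Rows of the circulant matrix of Eq. 14 that land on the 1/T-spaced bins.
    projection = w_tilde[(coarse[:, None] - fine[None, :]) % n_fine]
    # Assumed normalization: rectangular window + white PSD gives S * identity.
    return (projection * psd_fine) @ projection.conj().T / (n_fine * n_samples)


n_samples, oversample = 64, 16
white_psd = np.full(n_samples * oversample, 2.0)
cov_rect = finite_duration_covariance(
    tukey_window(n_samples, 0.0), white_psd, oversample
)
cov_hann = finite_duration_covariance(
    tukey_window(n_samples, 1.0), white_psd, oversample
)
```

With a white PSD, `cov_rect` is diagonal, while `cov_hann` has non-zero elements between neighboring bins even though the underlying process is stationary.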
While we initially defined the roman indices as covering frequencies from $[-f_s/2, f_s/2]$, in practice we analyze a narrower (positive) frequency range from $[f_{\min}, f_{\max}]$. In this paper we will set $f_{\min} = 20$ Hz, $f_{\max} = 800$ Hz; this omits data that are affected by the bandpass filter we apply. From here, roman indices will refer to this frequency range only.

Formally, one must carry out matrix products over the infinite axes denoted by greek indices to obtain the finite-duration covariance matrix in Eq. 16. In practice, however, via numerical experiment (see Appendix B) we find that infinite-duration matrices can be adequately approximated using a frequency resolution 16 times that of the analysis segment. In other words, when analyzing a 4 s data segment (frequency resolution = 0.25 Hz), we may model infinite-duration noise with a 1/64 Hz (or higher-resolution) noise model. In practice, we use a 1/128 Hz resolution. The resolution of the noise model can be tuned to achieve the necessary precision for a given problem.

In Fig. 1 we show the empirical finite-duration covariance matrix in the neighborhood of the diagonal (top panel), obtained by averaging approximately three months of simulated Gaussian noise broken into $10^6$ segments of 4 s and using a Tukey window with $\alpha = 0.1$ (Eq. 7), compared with the exact finite-duration covariance matrix obtained with our analytic expressions (Eq. 16) (bottom panel). The underlying PSD ($\mathcal{S}_\mu$) is estimated using data from the LIGO Livingston interferometer at the time of the binary black hole merger GW170814 with a resolution of 1/128 Hz using the method described in Appendix B. The two panels agree well, demonstrating the correctness of this formalism. However, the averaging estimate converges unacceptably slowly for practical use, and so our analytic expression is essential for practical applications.

We are interested in the inverse of the covariance matrix $C^{-1}_{ij}$. We discuss the issue of inverting this matrix in Sec. II E. We also emphasize that the finite-duration PSD (the leading diagonal of $C_{ij}$) $S_i \neq \mathcal{S}_i$, i.e., the finite-duration PSD is not the infinite-duration PSD evaluated at the desired frequencies. Additional technical details about this formalism are provided in the Appendix. In Appendix A, we discuss how our formalism is related to "coarse-graining" procedures used for PSD estimation (e.g., [18]). In Appendix C and Algorithm 1, we describe in detail how we approximate $\mathcal{C}_{\mu\nu}$ and $C_{ij}$ from real data.

D. Quantifying spectral leakage

From Fig. 1, we see that windowing produces off-diagonal elements in the finite-duration covariance matrix in general. In this subsection, we illustrate how this covariance is compounded by sharp spectral features. The relationship between the choice of window and spectral leakage from sharp spectral features is well known in signal processing; see, e.g., [1]. In gravitational-wave detectors, there are a number of such features commonly referred to as "lines" [19] or the low-frequency seismic wall. The leading-order correction from the off-diagonal components of the covariance matrix is clearly seen using the following metric:

$$\Delta_i \equiv \max_{i \neq j} \frac{C_{ij}}{S_j}. \tag{17}$$

We consider two regimes that determine the limiting behaviors of $\Delta_i$: (i) where the power spectral density is slowly varying (locally white noise) and (ii) near a large spectral feature.

We first consider the case of white noise, i.e., $\mathcal{C}_{\mu\nu} = \mathcal{S}\delta_{\mu\nu}$. We write the finite-duration covariance matrix

$$C_{ij} = \mathcal{S}\, \tilde{W}_{i\mu} \tilde{W}^*_{j\mu} \tag{18}$$

and

$$\Delta_i = \max_{i \neq j} \frac{\tilde{W}_{i\mu} \tilde{W}^*_{j\mu}}{\tilde{W}_{i\mu} \tilde{W}^*_{i\mu}} = \max_{i \neq j} \frac{\sum_\mu \tilde{w}_{i-\mu} \tilde{w}^*_{j-\mu}}{\sum_\mu \tilde{w}_{i-\mu} \tilde{w}^*_{i-\mu}}. \tag{19}$$

We note that the quantity $\tilde{W}_{i\mu} \tilde{W}^*_{j\mu}$ is real and its amplitude monotonically decreases as $|i - j|$ increases. The contamination is therefore maximized when $i$ and $j$ are neighboring finite-duration frequency bins (we denote this as $j = i_\pm$ and emphasize that $i_\pm \neq i \pm 1$ due to the omitted interstitial frequencies in the finite-duration analysis). We find

$$\Delta_i = \frac{\sum_\mu \tilde{w}_{i-\mu} \tilde{w}^*_{i_\pm-\mu}}{\sum_\mu \tilde{w}_{i-\mu} \tilde{w}^*_{i-\mu}}. \tag{20}$$

The variable $\Delta_i$ is a monotonically increasing function of the Tukey parameter $\alpha$, with $\Delta_i^{\alpha=0} = 0$ for a rectangular window and $\Delta_i^{\alpha=1} = 2/3$ for a Hann window (see Appendix E for a derivation). While the power spectrum of gravitational-wave detectors is not white, it is slowly varying away from the large spectral features and so we expect this approximation to hold throughout much of the observing band.

The other limiting case we can analytically describe is the behavior near a spectral line with relative amplitude $L$ at frequency $f_{\mu_l}$ with an otherwise white spectrum. In this case we can approximate

$$\mathcal{C}_{\mu\nu} = \begin{cases} \mathcal{S} & \mu = \nu \neq \mu_l \\ L\mathcal{S} & \mu = \nu = \mu_l \\ 0 & \mu \neq \nu \end{cases}. \tag{21}$$

We can now write the row corresponding to the line in the finite-duration covariance matrix:

\begin{align}
C_{i\mu_l} &= \tilde{W}_{i\mu} \tilde{W}^*_{\mu_l\nu} \mathcal{C}_{\mu\nu} \tag{22} \\
&= \mathcal{S} \begin{cases} \tilde{w}_{i-\mu} \tilde{w}^*_{\mu_l-\mu} & \mu \neq \mu_l \\ L \tilde{w}_{i-\mu_l} \tilde{w}^*_0 & \mu = \mu_l \end{cases} \tag{23} \\
&= \mathcal{S} \left( \sum_{\mu \neq \mu_l} \tilde{w}_{i-\mu} \tilde{w}^*_{\mu_l-\mu} + L \tilde{w}_{i-\mu_l} \tilde{w}^*_0 \right) \tag{24} \\
&\approx \mathcal{S} L\, \tilde{w}_{i-\mu_l} \tilde{w}^*_0. \tag{25}
\end{align}
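Both limiting regimes can be checked numerically. The sketch below makes two assumptions that should be read as illustrative rather than as the procedure of this paper. First, for white noise the sum $\sum_\mu \tilde{w}_{i-\mu}\tilde{w}^*_{j-\mu}$ is proportional to the discrete Fourier transform of $w^2$ evaluated at the bin separation (convolution theorem), so the magnitude of Eq. 20 reduces to a ratio of two DFT coefficients. Second, the fall-off of the line row in Eq. 25 is probed by evaluating the window transform at a fractional offset, measured in units of $1/T$. The function names are hypothetical.

```python
import numpy as np


def white_noise_contamination(window):
    """Magnitude of Eq. 20 for white noise: |C_{i,i+-}| / C_{ii}."""
    # For white noise, sum_mu w~_{i-mu} conj(w~_{j-mu}) is proportional to the
    # DFT of w**2 evaluated at the bin separation i - j (convolution theorem).
    power = np.fft.fft(np.asarray(window, dtype=float) ** 2)
    return np.abs(power[1]) / np.abs(power[0])


def line_leakage(window, offset_bins):
    """|w~(offset)| / |w~(0)|, the fall-off implied by Eq. 25.

    offset_bins is the distance between a bin and the line in units of 1/T
    and may be fractional, since a line need not sit on a 1/T-spaced bin.
    """
    window = np.asarray(window, dtype=float)
    samples = np.arange(len(window))
    phase = np.exp(-2j * np.pi * offset_bins * samples / len(window))
    return np.abs(np.sum(window * phase)) / np.abs(np.sum(window))


n_samples = 1024
rectangular = np.ones(n_samples)
hann = 0.5 * (1 - np.cos(2 * np.pi * np.arange(n_samples) / n_samples))

print(white_noise_contamination(rectangular))  # numerically zero
print(white_noise_contamination(hann))  # 2/3 up to rounding
print(line_leakage(rectangular, 10.5), line_leakage(hann, 10.5))
```

For an offset of a whole number of bins the rectangular window leaks nothing, whereas for a half-bin offset it leaks far more than the Hann window, which is the trade-off between the two regimes.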

In the last step (Eq. 25), we assume $L \tilde{w}_{i-\mu_l} \gg 1$ and so the contamination will fall off with the same spectral shape as the window function:

$$\Delta_i \approx \left| L\, \tilde{w}_{i-\mu_l} \tilde{w}^*_0 \right| \propto \left| \tilde{w}_{i-\mu_l} \right|. \tag{26}$$

We emphasize at this stage that $\mu_l$ is not necessarily (and in fact almost guaranteed to not be) contained in the set of roman indices; i.e., the line is not exactly a delta function at one of the $1/T$ Hz spaced frequency bins. If the line were located at one of these frequencies, a rectangular window would have zero spectral leakage and be the optimal choice. However, when this is not the case, the rectangular window maximizes the contamination from lines. In practice, sharp spectral features in interferometer power spectra have finite width and therefore a rectangular window will never generically avoid leakage from lines.

[Figure 2: three panels, (a), (b), and (c), each showing $\Delta_i$ (logarithmic axis) against frequency [Hz] over 20–800 Hz; panel (c) carries the label V1.]

FIG. 2. Maximum contamination per frequency bin (Eq. 17) for PSDs estimated near the time of binary black hole merger GW170814 [13]. We see two competing effects. First, there is broadband contamination at $\sim 10\%$ when the PSD is slowly varying. The magnitude of this contamination rises with increasing Tukey $\alpha$ (see Appendix E). There is also large contamination near instrumental lines due to spectral leakage that is suppressed with increasing Tukey $\alpha$.

In Fig. 2, we show this quantity for the PSDs used in our analysis of the gravitational-wave signal GW170814 [13] (Sec. III A), which was observed in 2017 by the Advanced LIGO [15] and Virgo [16] observatories. For the two LIGO observatories [15], the magnitude of the off-diagonal terms is approximately constant throughout the observing band with exceptions for the known lines. The "violin modes" for the Livingston interferometer ($\sim 500$ Hz) are significantly larger than for the Hanford interferometer and the contamination near the lines is larger and more broadband. Given this behavior, one might think that we can neglect the impact of the off-diagonal terms if we remove from the analysis the frequency bins in the neighborhood of the lines. However, for the Virgo observatory [16], we see that the average correction is much larger across most of the band and is frequently $> 20\%$. This can be attributed to the Virgo PSD being less smoothly varying. A cut based on frequencies with unacceptably large contamination would

remove most of the observing band. However, data analysis with the finite-duration covariance matrix can be used to take into account covariance in Virgo noise.

[Figure 3: eigenvalue (logarithmic axis) against eigenmode number (0–3000) for H1, L1, and V1, with markers labeled "min(PSD) Truncation" and "Theoretical Truncation".]

FIG. 3. Eigenvalues of an estimated noise power covariance matrix for data close to GW170814 for a Tukey window with $\alpha = 0.1$. The large eigenvalues correspond to spectral lines and the low-frequency seismic wall; the slowly varying region corresponds to the smoothly varying observing band. The dashed vertical line denotes the points after which we discard the eigenmodes as determined by the window information loss. The dotted lines indicate the point at which the eigenvalue drops below the minimum of the finite-duration PSD. Both of these well-approximate the turnover after which the eigenvalues rapidly decline due to information loss from windowing. The difference is most pronounced for the Virgo data, which is consistent with the increased contamination between frequency bins (see Figure 14).

[Figure 4: eigenmode number (0–3000) against frequency [Hz] (20–770 Hz), with a color bar for $\log_{10}|U_{ij}|$.]

FIG. 4. Eigenmodes for the covariance matrix estimated from data from the LIGO Livingston interferometer close to GW170814. This matrix encodes the correlations between physical frequencies. The horizontal axis corresponds to the physical frequencies, while the vertical axis is the order of decreasing eigenvalue. The bottom-most eigenmodes are the ones that are discarded. These eigenmodes are associated with very broadband frequency content.

E. Regularized inversion

Having characterized the covariance between frequency bins due to windows, we turn to the inversion of the covariance matrix required to evaluate the likelihood function. Since tapered window functions go to zero at the edges by construction, the covariance matrix is not invertible [20]. To deal with this issue, we perform a regularized inversion using a singular value decomposition (SVD) and discard the smallest eigenvalues. The SVD of a Hermitian matrix can be written as

$$C_{ij} = U_{ik} \Lambda_{kl} U^{-1}_{lj}. \tag{27}$$

Here, $\Lambda_{kl} = \delta_{kl}\lambda_k$ (no summation) is a diagonal matrix with the eigenvalues $\lambda_k$ on the leading diagonal. Regularization simply involves removing eigenmodes corresponding to small eigenvalues. In practice, this is done by setting the eigenvalue to $\infty$:

$$\bar{\lambda}_i = \begin{cases} \lambda_i & i \leq \epsilon N \\ \infty & i > \epsilon N \end{cases} \tag{28}$$

where a fraction $\epsilon$ of the $N$ eigenmodes are retained. The regularized matrix and its inverse are

\begin{align}
\bar{C}_{ij} &= U_{ik} \bar{\Lambda}_{kl} U^{-1}_{lj} \tag{29} \\
\bar{C}^{-1}_{ij} &= U_{ik} \bar{\Lambda}^{-1}_{kl} U^{-1}_{lj}. \tag{30}
\end{align}

In Fig. 3, we show the eigenvalues in decreasing order for the covariance matrix estimated at the time of GW170814 and the corresponding eigenmode spectrum is shown in Fig. 4. We identify three regimes in the eigenvalue spectrum:

1. Large eigenvalues with a steep slope at low eigenmode number. These predominantly correspond to frequencies where $C_{ij}$ is large, e.g., near spectral lines and at low frequencies.

2. A slowly varying region encompassing the majority of the eigenvalues. This corresponds to the remainder of the frequencies where the PSD is smoothly varying.

3. A rapid drop to the smallest eigenvalues. This is a characteristic feature of ill-conditioned matrices and is the region we should remove when regularizing.

We consider two methods to determine how many eigenvalues to discard. In the first method we consider the power loss from the window function. The amount of information lost by the window is related to the time-averaged square of the window function. The effective number of independent time samples is

$$N_{\rm eff} = N \overline{w^2} = N \int dt\, w^2(t) \approx \sum_{i=0}^{N-1} w_i^2. \tag{31}$$

We choose the fraction of eigenvalues to retain based on the window function:

$$\epsilon = \frac{N_{\rm eff}}{N} = \overline{w^2}. \tag{32}$$

The vertical dashed line in Fig. 3 is at $N_{\rm eff}$ and shows the number of modes we omit to account for the information loss due to the tapered window; we note that this matches well the transition to the badly behaved modes. For the second method we set a threshold value corresponding to the minimum of the power spectral density over the analyzed frequency band, $\lambda_T = \min_\mu(\mathcal{S}_\mu)$. We show this threshold as the dotted line in Figure 3. The two threshold values agree well for small Tukey $\alpha$ although the former method systematically removes more eigenmodes. We find that the choice of regularization scheme has a negligible impact on the results of our analysis.

F. Finite-duration likelihood

We can now write our final result, the regularized likelihood with a finite-duration covariance matrix:

$$\bar{\mathcal{L}}(\tilde{d} \,|\, \theta, \bar{C}) = \frac{2}{T \det \bar{C}} \exp\left( -\frac{2}{T} \left\langle \tilde{d} - \tilde{h}, \tilde{d} - \tilde{h} \right\rangle_{\bar{C}} \right), \tag{33}$$

where $\det \bar{C}$ is the determinant of the finite-duration noise covariance matrix, $\tilde{h}$ is a template for the signal, and the inner product is defined as

$$\langle \tilde{x}, \tilde{x} \rangle_{\bar{C}} = \tilde{x}_i \bar{C}^{-1}_{ij} \tilde{x}^*_j. \tag{34}$$

As the exponent in the likelihood can still be written as weighted inner products between data and template, we can analytically marginalize over extrinsic parameters in the same way as for the diagonal likelihood [21].

G. Relation to the Karhunen-Loève transform

The Karhunen-Loève theorem states that for any stochastic process there exists a basis in which the noise covariance matrix is diagonal, and the transformation into this basis is referred to as the KLT. For a colored stationary Gaussian process that is periodic with period $T$, this basis is the discrete Fourier transform with spacing $1/T$ Hz. The Whittle likelihood approximation specifically assumes that this basis also diagonalizes the covariance matrix for a subset of a longer Gaussian process that is not periodic with period $T$. As we have demonstrated, the covariance matrix in the Fourier basis is not diagonal in this case.

We note that the KLT is closely related to the SVD and therefore identify that the basis for the KLT of a finite subset of a longer Gaussian process is the basis found in Section II E. This provides a second, equivalent, interpretation of the inner product in Eq. 34:

$$\langle \tilde{x}, \tilde{x} \rangle_{\bar{C}} = \tilde{x}_i U_{ik} \bar{\Lambda}^{-1}_{kl} U^{-1}_{lj} \tilde{x}^*_j = \sum_i \frac{|\bar{x}_i|^2}{\bar{\lambda}_i}, \tag{35}$$

where we have defined $\bar{x}_k \equiv \tilde{x}_i U_{ik}$. The likelihood is now

$$\ln \bar{\mathcal{L}} = -\sum_i \left[ \frac{|\bar{d}_i - \bar{h}_i|^2}{\lambda_i} + \ln(\lambda_i) \right] + \text{const}. \tag{36}$$

We identify that this has the usual form of the likelihood except that all of the quantities are described in the eigenbasis of the KLT, rather than the Fourier basis.

III. DEMONSTRATION

To demonstrate our formalism we perform two tests. First, we analyze three binary black hole mergers to demonstrate that the effect of the off-diagonal corrections is small but noticeable for confidently detected signals. Second, we demonstrate that, although this effect produces a minor correction to modest-SNR events, neglecting the impact of off-diagonal terms in the noise covariance matrix biases precision estimates, such as evidence calculations required for searches for a population of weak, sub-threshold signals as in [22].

A. Single events

We compare the posterior distributions obtained using both the conventional and finite-duration covariance matrices for three of the observed binary black hole mergers: GW150914 [12], GW170814 [13], and GW190521 [14]. We choose these as they are relatively high-mass systems (with detector-frame primary and secondary masses of $m_1 \approx m_2 \approx 30-40\,M_\odot$ for GW150914 and GW170814 and $m_1 \approx m_2 \approx 150\,M_\odot$ for GW190521) for which we expect the tapered window to have a comparatively large impact on the data as they use a relatively short stretch of analysis data. They also span a large range in time, with one event from each of the first three observing runs of the advanced detector network, leading to significantly different PSDs.

For each event, we analyze 4 s of data centered at the trigger time for the event and estimate the PSD using a 512 s stretch of data ending 2 s before the trigger. We apply a bandpass filter between 16 and 1024 Hz and resample the data to a new Nyquist frequency of 2048 Hz using GWpy [23] to mitigate spectral leakage from low and high frequencies. We use a Tukey window with $\alpha = 0.1$ for the analysis segment. The details of the covariance matrix calculation are described in Algorithm 1. We employ the IMRPhenomXPHM waveform approximant [24–26]

in the frequency range 20–800 Hz; for GW190521 we set the upper frequency limit as 300 Hz. We neglect the impact of calibration uncertainty or uncertainty in our estimate of the PSD. For GW150914, we analyze data from the two LIGO interferometers; for the other two events, we analyze data from the two LIGO interferometers and Advanced Virgo.

[Figures 5 and 6: corner plots of $m_1$ [$M_\odot$] and $m_2$ [$M_\odot$]; legend: Finite Duration, Diagonal.]

FIG. 5. Intrinsic parameters for the binary black hole merger GW150914 [12] using our new finite-duration likelihood (blue) and the diagonal likelihood (orange). The primary and secondary mass ($m_1$, $m_2$) refer, respectively, to the more-massive and less-massive component masses. Including covariance between neighboring bins has no observable impact on the inferred posterior.

FIG. 6. Intrinsic parameters for the binary black hole merger GW170814 [13] using our new finite-duration likelihood (blue) and the diagonal likelihood (orange). The primary and secondary mass ($m_1$, $m_2$) refer, respectively, to the more-massive and less-massive component masses. Including covariance between neighboring frequency bins slightly shifts the inferred posterior distribution for the mass ratio.

We show the posterior distribution for two of the intrinsic binary parameters when assuming $C_{ij}$ is diagonal (blue) and using the full covariance matrix (orange) for GW150914, GW170814, and GW190521 in Figs. 5, 6, and 7 respectively. We assume the same prior for both analyses. The primary and secondary mass refer, respectively, to the more-massive and less-massive black-hole masses. The largest difference we see is in the component masses for GW170814, primarily driven by a change in the inferred mass ratio. There is no visible difference between the posteriors for the other events. The change in the posterior distributions is at a similar level to the changes due to marginalizing over uncertainty in the detector calibration [27, 28] or PSD estimate [9, 29, 30], but less than the difference due to using different PSD estimation methods [9]. Errors of this magnitude become important when we combine many events together for population studies and/or precision tests of (see, e.g., [31, 32]).

B. Combining data segments

By combining large numbers of time-series data segments it is sometimes possible to extract weak signals not visible in individual segments, for example, to measure the population of gravitational waves from unresolved compact binaries [22, 33–35] and to detect gravitational-wave memory [36]. Combining data segments can also have the effect of magnifying systematic errors that are small enough to ignore when considering just a single segment in isolation. For example, failing to take into account uncertainty in estimates of the noise PSD leads to low-level excess power, which can be mistaken for a population of sub-threshold gravitational-wave signals [9, 29]. Here, we show that the correlations between neighboring frequency bins induced by all windows must be taken into account to avoid systematic error in studies that rely on precision measurements combining many segments.

We illustrate this point using simulated data to carry out a mock search for a population of sub-threshold simulated signals in simulated Gaussian noise with a known PSD. We employ the formalism from [22] to estimate the fraction of data segments that contain a signal, using $M = 160000$ segments of which 15000 contain a simulated signal.

Our likelihood is a mixture model, which allows for

each segment to consist of either signal $\mathcal{S}$ or noise $\mathcal{N}$:

$$\mathcal{L}(\{d\} \,|\, \xi) = \prod_i^M \left[ \xi\, \mathcal{L}(d_i | \mathcal{S}) + (1 - \xi)\, \mathcal{L}(d_i | \mathcal{N}) \right]. \tag{37}$$

Here, $\mathcal{L}(d_i | \mathcal{S})$ is the likelihood of data segment $i$ given the signal hypothesis while $\mathcal{L}(d_i | \mathcal{N})$ is the likelihood given the noise hypothesis and the parameter $\xi$ is the fraction of segments that contain a signal.

We simulate data in 128 s chunks and break up the data into 4 s segments. We compute the finite-duration PSD matrix using the known PSD used to simulate the data. The known PSD is as estimated for the LIGO Livingston detector in our analysis of GW170814. We do not re-estimate the PSD from the simulated data in order to avoid uncertainty intrinsic to the PSD estimation method.

[Figure 7: corner plot of $m_1$ [$M_\odot$] and $m_2$ [$M_\odot$]; legend: Finite Duration, Diagonal.]

FIG. 7. Intrinsic parameters for the binary black hole merger GW190521 [14] using our new finite-duration likelihood (blue) and the diagonal likelihood (orange). The primary and secondary mass ($m_1$, $m_2$) refer, respectively, to the more-massive and less-massive component masses. Including covariance between neighboring bins has no observable impact on the inferred posterior.

[Figure 8: two panels, (a) 20–100 Hz and (b) 100–800 Hz, each showing amplitude spectral density [Hz$^{-1/2}$] against frequency [Hz].]

FIG. 8. The signals considered for our population test. The four signals we consider are: a Gaussian burst centered at 50 Hz with standard deviation 10 Hz (blue), a Gaussian burst centered at 500 Hz with standard deviation 10 Hz (orange), a $60\,M_\odot$ binary black hole merger (green), and a $300\,M_\odot$ binary black hole merger (red). In purple, we show the finite-duration amplitude spectral density for the LIGO Livingston interferometer at the time of binary black hole merger GW170814. The Gaussian bursts isolate the impact of specific spectral features, e.g., the large lines around 500 Hz. The binary black hole mergers are broadband and accumulate most of their signal-to-noise ratio at low frequencies, near the sharp rise due to seismic noise.

For the simple example considered here we assume that the signal in each segment containing a signal is the same [37]. We consider four choices of signal: (i) a Gaussian burst centered at 50 Hz with standard deviation 10 Hz, (ii) a Gaussian burst centered at 500 Hz with standard deviation 10 Hz, (iii) a $60\,M_\odot$ total mass binary black hole merger waveform, and (iv) a $300\,M_\odot$ total mass binary black hole merger waveform. The first two are well-localized signals in frequency with random per-frequency phases. The lower-frequency burst does not significantly overlap with large spectral features, while the higher-frequency burst overlaps with the largest spectral lines. The second two are representative of the gravitational-wave signals observed so far and considered in the search proposed in [22].
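Whichever signal is chosen, Eq. 37 is a product over all $M$ segments, so in practice it is convenient to evaluate it in log space. The following is a minimal sketch under stated assumptions: the per-segment log-likelihoods under the signal and noise hypotheses are taken as precomputed inputs, the prior on $\xi$ is assumed flat, and the function names are hypothetical rather than taken from [22] or from any particular package.

```python
import numpy as np


def log_mixture_likelihood(ln_l_signal, ln_l_noise, xi):
    """Natural log of Eq. 37 for each value of xi."""
    ln_l_signal = np.asarray(ln_l_signal, dtype=float)
    ln_l_noise = np.asarray(ln_l_noise, dtype=float)
    xi = np.atleast_1d(np.asarray(xi, dtype=float))[:, None]
    with np.errstate(divide="ignore"):  # xi = 0 or 1 gives log(0) = -inf
        per_segment = np.logaddexp(
            np.log(xi) + ln_l_signal, np.log1p(-xi) + ln_l_noise
        )
    return per_segment.sum(axis=1)


def posterior_on_grid(ln_l_signal, ln_l_noise, xi_grid):
    """Normalized p(xi | {d}) on a uniform grid, assuming a flat prior on xi."""
    ln_like = log_mixture_likelihood(ln_l_signal, ln_l_noise, xi_grid)
    weights = np.exp(ln_like - ln_like.max())  # subtract the peak for stability
    return weights / (weights.sum() * (xi_grid[1] - xi_grid[0]))
```

For the configuration described above, the value of $\xi$ used to generate the data is $15000/160000 \approx 0.094$, so a posterior computed this way can be compared directly against that value.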
These signals are relatively broadband intrinsic to the PSD estimation method. in frequency with well-defined frequency evolution. The For the simple example considered here we assume 60M waveform is chosen to be representative of the that the signal in each segment containing a signal is most commonly observed systems and the 300M wave- the same [37]. We consider four choices of signal: (i) form is chosen based on the largest observed system [38]. a Gaussian burst centered at 50 Hz and standard devia- As in the previous section, for the binary black hole wave- tion 10 Hz, (ii) a Gaussian burst centered at 500 Hz and forms we use the IMRPhenomXPHM waveforms. In Figure8, standard deviation 10 Hz, (iii) a 60M total mass binary we show the amplitude spectral density of each of the black hole merger waveform, and (iv) a 300M total mass signals along with the diagonal of the finite-duration co- binary black hole merger waveform. The first two are variance matrix (Si). two well-localized signals in frequency with random per- For each iteration, we calculate the posterior for ξ two frequency phases. The lower-frequency burst does not ways, once using the standard diagonal likelihood and significantly overlap with large spectral features, while once using the finite-duration likelihood. The results are 10

shown in Fig. 9 for a Tukey window with $\alpha = 0.1$. For all considered signals, the diagonal method produces a visibly biased result. When the Gaussian signal is close to large spectral lines (b) the bias is most significant. The bias for the binary black hole signals (c-d) is larger than for the low-frequency Gaussian (a). We attribute this to two effects. The binary black hole signals accumulate significant SNR at low frequencies where the noise is dominated by leakage from seismic noise. This is somewhat mitigated by the application of a high-pass filter to suppress content below 16 Hz. They also have a characteristic phase evolution, which contributes significant resolving power to the likelihood function. This means that the phase coherence between the neighboring frequencies is important to the evaluated likelihoods.

FIG. 9. The posterior distribution for the fraction of segments containing a signal in our toy example (Section III B). We analyze $\sim 7.5$ days of simulated Gaussian noise divided into 4 s segments ($1.6 \times 10^5$ segments), and 15000 of these segments contain a signal. We consider four different signals: (i) a Gaussian pulse with standard deviation 10 Hz and central frequency 50 Hz, (ii) a Gaussian pulse with central frequency 500 Hz, (iii) an equal-mass binary black hole merger signal with total mass $60\,M_\odot$, and (iv) an equal-mass binary black hole merger signal with total mass $300\,M_\odot$. The amplitude of the signals and the PSD are shown in Figure 8. In blue we show the posterior distribution obtained using the standard likelihood that ignores correlations between neighboring frequency bins induced by the window function. In orange, we show the posterior distribution using a likelihood that accounts for these correlations. We see that the diagonal method gives a biased result when the signal is centered at 500 Hz and when analyzing simulated binary black hole systems. [Panels (a)-(d): $p(\xi)$ versus $\xi$; legend: Diagonal, Finite Duration, True Value.]

IV. DISCUSSION

As gravitational-wave astronomy matures, the growing catalog of events enables exciting new science. However, as we probe increasingly higher SNRs, and as we combine larger ensembles of data segments, our analyses are increasingly susceptible to systematic error from approximations in our models and data analysis. Many sources of systematic error in our modeling have been considered in recent years including calibration uncertainty [27, 28], waveform systematics [31, 39–41], and noise estimation [9, 29, 30]. In this paper, we examine how correlations between frequency bins are inevitably introduced by windowing in time-domain analysis and find corrections at the same level as those due to calibration and PSD uncertainty. We show how these correlations can be modeled using a frequency-domain noise covariance matrix, thereby avoiding bias. By performing a singular value decomposition, we identify that the basis that diagonalizes the covariance matrix is not the Fourier basis as usually assumed, but depends on both the PSD and the choice of window function.

We demonstrate that, while the impact of the off-diagonal components in the noise covariance matrix is small for individually resolved events, these components are important for precision estimates of the Bayesian evidence. These precision estimates are crucial when attempting to use Bayesian evidence estimates as a detection statistic, e.g., [22, 42].

A natural extension of the framework provided here is to incorporate marginalization over sources of systematic uncertainty. Marginalization over waveform or detector calibration uncertainty can be trivially combined with this method. Marginalizing over uncertainty in the PSD estimate would require either modifying the BayesLine algorithm [43] to include modeling the full noise covariance matrix or an analytic method such as in [6, 9, 44]. Even after considering all these forms of systematic uncertainty, we still need to develop methods to deal with the non-Gaussianity and non-stationarity of real data. This is an active area of development [45–52]; however, establishing a unified treatment is left to future studies.

While this work has focused on the analysis of short-duration transients, windows are also used in searches for longer duration gravitational-wave signals [53, 54]. Typically, long-duration searches use much longer segment durations than those described here with Hann windows to mitigate leakage from lines. As we showed in this paper, this may lead to correlations between neighboring frequency bins for these analyses. In searches for the stochastic gravitational-wave background, a coarse-grained PSD estimate is typically used (see Appendix A) which may reduce the impact of these correlations. Additionally, window functions are used to excise short duration non-Gaussian “glitches” from analysis [46, 55–57] (typically this is referred to as “gating”). These gates have very short rise times (small Tukey $\alpha$) and so introduce significant contamination around spectral lines.

ACKNOWLEDGEMENTS

We are grateful to Sharan Banagiri, Katerina Chatziioannou, Joe Romano, and Alan Weinstein for many fruitful discussions and the anonymous referee for a thoughtful and detailed review. CT acknowledges the support of the National Science Foundation, and the LIGO Laboratory. This work is supported through the Australian Research Council (ARC) Centre of Excellence CE170100004. SB is also supported by the Paul and Daisy Soros Fellowship for New Americans, the Australian-American Fulbright Commission, and the NSF Graduate Research Fellowship under Grant No. DGE-1122374. This research has made use of data, software, and/or web tools obtained from the Gravitational Wave Open Science Center [58, 59] (https://www.gw-openscience.org), a service of LIGO Laboratory, the LIGO Scientific Collaboration and the Virgo Collaboration. LIGO is funded by the U.S. National Science Foundation. Virgo is funded by the French Centre National de Recherche Scientifique (CNRS), the Italian Istituto Nazionale della Fisica Nucleare (INFN), and the Dutch Nikhef, with contributions by Polish and Hungarian institutes. The authors are grateful for computational resources provided by the LIGO Lab and supported by National Science Foundation Grants PHY-0757058 and PHY-0823459. This is document LIGO-P2100090.

Cython and CUDA implementations and Python wrappers of the PSD matrix calculation are available at github.com/ColmTalbot/psd-covariance-matrices. We also provide example scripts demonstrating how to produce the results in this paper and some supplementary results in the same location.

Appendix A: Connection to coarse-grained PSD estimation

While time-averaging of short segments to estimate power spectral densities (e.g., Welch averaging) is common in gravitational-wave data analysis, an alternative “coarse-graining” method is used in some areas, especially searches for the stochastic gravitational-wave background; see, e.g., [18]. In this appendix, we demonstrate that coarse graining is a special case of the projection method we use in this paper.

The coarse-grained PSD is defined for a frequency resolution $\delta f$ as

\begin{align}
S_i &= \frac{1}{\delta f} \int_{f_i - \delta f/2}^{f_i + \delta f/2} df\, \mathcal{S}(f) \tag{A1} \\
&= \frac{1}{\delta f} \int_{-\infty}^{\infty} df\, \Pi(f_i - \delta f/2, f_i + \delta f/2)\, \mathcal{S}(f) \tag{A2} \\
&= \frac{1}{\alpha} \sum_\mu \mathcal{S}_\mu |\tilde{w}_{\mu - \alpha i}|^2, \quad \alpha \in \mathbb{Z} \tag{A3}
\end{align}

$$\tilde{w}_\mu = \begin{cases} 1 & -\frac{\alpha}{2} < \mu < \frac{\alpha}{2} \\ 0.5 & |\mu| = \frac{\alpha}{2} \text{ if } \frac{\alpha}{2} \in \mathbb{Z} \\ 0 & \text{else} \end{cases} \tag{A4}$$

In the second line, $\Pi$ is the unit boxcar function. As for the window operator in this paper, we can express this as a circulant matrix with nonzero entries only in a small region. The corresponding time-domain window is the sinc function. We note that the coarsened frequency-domain covariance matrix is diagonal by construction in this case.

Appendix B: Estimating the infinite-duration PSD

Computing $C_{ij}$ using Equation 16 requires an expression for the infinite-duration covariance matrix $\mathcal{C}_{\mu\nu}$. In this Appendix, we describe our method for estimating $\mathcal{C}_{\mu\nu}$ from the data. This represents stages 1–4 of Algorithm 1.

We assume the data are stationary and Gaussian and therefore the only nonzero elements of $\mathcal{C}_{\mu\nu}$ are the leading diagonal $\mathcal{S}_\mu$. We approximate this infinite-duration PSD using segments longer than our final analysis segment. In order to avoid long-term drift of $\mathcal{S}_\mu$ in real interferometer data, we restrict ourselves to 512 s of data. We then subdivide this into non-overlapping segments with duration $D$ and compute a median average PSD using a Hann window as our representation of $\mathcal{S}_\mu$.

To assess the convergence of this method, we compute $S_i$ using a range of values of $D$. In Fig. 10, we show the ratio of the finite-duration PSDs to the finite-duration PSD obtained when using $D = 128$ s (the longest duration we consider). The estimate quickly converges with increasing $D$ away from the spectral lines. Close to the forest of large lines around 500 Hz, the difference between the $D = 64$ s and $D = 128$ s estimates is $\approx 20\%$. We therefore infer that $D = 64$ s is sufficiently converged. The key quantity for ensuring adequate convergence is the ratio between $D$ and the original segment duration. In this case, $\mathcal{S}_\mu$ has a resolution $16\times$ as fine as $S_i$. We use this procedure when analyzing real data throughout this paper.
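As a rough illustration of the estimate described in Appendix B, the sketch below splits a stretch of data into non-overlapping chunks of duration $D$, applies a Hann window to each, and takes the per-frequency median of the chunk periodograms. The function name `median_psd`, the one-sided normalization, and the omission of any median-bias correction are assumptions made for this example; they are not taken from the paper's code, and the simulated white noise stands in for real detector data.

```python
import numpy as np


def median_psd(data, sample_rate, chunk_duration):
    """Median-averaged, Hann-windowed one-sided PSD from non-overlapping chunks."""
    n_chunk = int(round(chunk_duration * sample_rate))
    n_chunks = len(data) // n_chunk
    if n_chunks < 1:
        raise ValueError("data are shorter than one chunk")
    chunks = np.reshape(data[: n_chunks * n_chunk], (n_chunks, n_chunk))
    window = np.hanning(n_chunk)
    # Dividing by the window power keeps the level of white noise unchanged.
    norm = sample_rate * np.sum(window**2)
    spectra = np.fft.rfft(chunks * window, axis=1)
    # Factor 2 folds negative frequencies into a one-sided PSD (assumed convention).
    periodograms = 2.0 * np.abs(spectra) ** 2 / norm
    # The median is robust to outliers but sits below the mean for Gaussian
    # noise; the corresponding bias correction is left out of this sketch.
    psd = np.median(periodograms, axis=0)
    frequencies = np.fft.rfftfreq(n_chunk, d=1.0 / sample_rate)
    return frequencies, psd


# Toy stand-in for 512 s of data: unit-variance white noise at 256 Hz.
rng = np.random.default_rng(0)
sample_rate = 256.0
data = rng.normal(size=int(512 * sample_rate))
frequencies, psd = median_psd(data, sample_rate, chunk_duration=128)
```

Repeating the call with `chunk_duration` set to 8, 16, 32, and 64 gives the family of estimates whose convergence is discussed above; the frequency resolution of each estimate is $1/D$.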

Algorithm 1: Computing the regularized inverse PSD matrix from real data

Result: regularized inverse PSD matrix (Equation 30)

1. load 512 s of data ending 2 s before the analysis segment begins;
2. divide into 4 × 128 s chunks;
3. apply a Hann window (Tukey $\alpha = 1$) to each chunk and FFT;
4. take a median average of power in each chunk to generate the “infinite”-duration PSD ($\mathcal{S}_\mu$);
5. define the infinite-duration window ($w_\psi$) as a 128 s time series according to Equation 12;
6. compute the finite-duration covariance matrix ($C_{ij}$) using Equation 16;
7. compute the SVD (Equation 27) and regularized inverse ($\bar{C}_{ij}$) as outlined in Section II E.
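Steps 5 to 7 can be sketched at toy scale as follows. This is a dense illustration under stated assumptions, not the Cython/CUDA implementation mentioned in the Acknowledgements: it assumes Equation 16 has the form $C_{ij} = \sum_\mu \tilde{w}_{i-\mu} \mathcal{S}_\mu \tilde{w}^*_{j-\mu}$ (the form suggested by Equation E1), it picks a normalization in which a rectangular window and unit PSD return the identity, and it assumes the regularized inverse is a truncated-SVD pseudo-inverse with a relative cutoff. The function names, the cutoff value, and the toy PSD are hypothetical.

```python
import numpy as np


def finite_duration_covariance(psd_fine, window, ratio):
    """Dense toy evaluation of C_ij = sum_mu w~[i*ratio - mu] S[mu] conj(w~[j*ratio - mu])."""
    n_seg = len(window)
    n_fine = len(psd_fine)
    if n_fine != ratio * n_seg:
        raise ValueError("psd_fine must have ratio * len(window) entries")
    # Step 5: zero-pad the analysis window to the long ("infinite") duration.
    padded = np.zeros(n_fine)
    padded[:n_seg] = window
    # Assumed normalization: rectangular window and unit PSD give the identity.
    w_tilde = np.fft.fft(padded) / np.sqrt(n_fine * n_seg)
    # Step 6: the window operator is circulant, so every row is a shifted copy
    # of the single vector w_tilde; indices wrap modulo n_fine.
    mu = np.arange(n_fine)
    coarse = ratio * np.arange(n_seg)
    rows = w_tilde[(coarse[:, None] - mu[None, :]) % n_fine]
    return (rows * psd_fine[None, :]) @ rows.conj().T


def regularized_inverse(cov, rel_cutoff=1e-10):
    """Step 7 (assumed form): pseudo-inverse keeping singular values above a cutoff."""
    u, s, vh = np.linalg.svd(cov)
    keep = s > rel_cutoff * s[0]
    inv_s = np.zeros_like(s)
    inv_s[keep] = 1.0 / s[keep]
    return (vh.conj().T * inv_s) @ u.conj().T, int(keep.sum())


n_seg, ratio = 64, 4
n_fine = n_seg * ratio
freqs = np.fft.fftfreq(n_fine)  # two-sided grid, in units of the sample rate
# Toy two-sided PSD: white noise plus one narrow spectral line.
psd_fine = 1.0 + 50.0 * np.exp(-0.5 * ((np.abs(freqs) - 0.2) / 0.004) ** 2)
cov = finite_duration_covariance(psd_fine, np.hanning(n_seg), ratio)
cov_inv, n_kept = regularized_inverse(cov)
```

In this toy setup the Hann window zeroes the two end points of the segment, so the covariance matrix is rank deficient and the pseudo-inverse discards two modes. The dense intermediate `rows` array is only practical for small problems; the point of Appendix C is that the real calculation avoids forming such matrices.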

In Algorithm 1, we describe the process used to compute the regularized inverse covariance matrices used for our binary black hole analyses. We note that the method presented here requires an estimate for the true underlying power spectral density. In practice, we estimate this from the data by taking a median average of the PSD in several longer segments.

FIG. 10. The ratio of the inferred finite duration PSD ($S_i$) with a 1/4 Hz resolution when using different longer segment durations ($D$) to estimate the infinite-duration PSD ($\mathcal{S}_\mu$). [Axes: $S_i / S_i^{D = 128\,\mathrm{s}}$ versus frequency in Hz; curves for $D$ = 8 s, 16 s, 32 s, and 64 s.]

Appendix C: Computing the finite-duration covariance matrix

For a typical 4 s data segment, with sampling frequency 2048 Hz, analyzed with a 1/128 Hz noise model, we must perform the matrix operations in Equation 16 with $\mathcal{O}(2^{18} \times 2^{18})$ elements. A naive implementation at double precision would require a prohibitive amount of computational resources. Fortunately, the computation can be performed much more efficiently.

The first thing we note is that $\mathcal{C}_{\mu\nu}$ is diagonal and $\tilde{W}_{\mu\nu}$ is a circulant matrix, i.e., it is fully specified by a single row/column ($\tilde{w}_\mu$). Since each matrix can be represented using a single vector, we do not need to form any matrices with the 1/128 Hz resolution, dramatically reducing memory requirements. We also identify that $C_{ij}$ is a Hermitian matrix, reducing the computational cost by a factor of two.

We provide functions to compute the coarsened PSD matrix from a frequency-domain window and PSD implemented in Cython and cupy compatible CUDA. The former runs in $\mathcal{O}(N^3)$ time and the latter in $\mathcal{O}(N)$ wall time. The latter is used to produce the results in this paper and is less computationally expensive than the SVD performed on the coarsened data.

Appendix D: Additional figures

In this Appendix we show the PSD matrix (top left), regularized PSD matrix (top right), SVD eigenmatrix (bottom left), and regularized inverse PSD matrix (bottom right) for our analysis of GW170814 for LIGO Hanford (Fig. 11), LIGO Livingston (Fig. 12), and Virgo (Fig. 13). We note that the data for all three interferometers show the same qualitative features and quantitative differences determined by the specific sensitivity of each interferometer.

We see that the PSD matrix is dominated by the leading diagonal and nearby frequencies and correlations at frequencies corresponding to spectral lines. The correlations from the spectral lines are less pronounced in the regularized PSD matrix, however, there is more broadband correlation between frequencies above/below the most sensitive frequency. The divide between frequencies above and below the most sensitive frequency can also be seen in the SVD and regularized inverse PSD matrices.

Appendix E: Window overlaps

In order to quantify spectral leakage for white noise we use Equation 20:

$$\Delta_i = \frac{\max_{i \neq j} \sum_\mu \tilde{w}_{i-\mu} \tilde{w}^*_{j-\mu}}{\sum_\mu \tilde{w}_{i-\mu} \tilde{w}^*_{i-\mu}}. \tag{E1}$$

The denominator in this expression is simply the total window power $w^2$. In the continuum limit, the numerator

can be written

$$\Xi(n) = \int_{-\infty}^{\infty} df\, \tilde{w}(f)\, \tilde{w}^*(f + n/T) \quad (n \in \mathbb{Z}). \tag{E2}$$

Here $n = i - j$ and the denominator of Equation 20 is $\Xi(n = 0)$.

For the rectangular and Hann windows, the frequency-domain representations of the windows are

\begin{align}
\tilde{w}(f; \alpha = 0) &= \frac{\sin(\pi f T)}{\pi f} \tag{E3} \\
\tilde{w}(f; \alpha = 1) &= \frac{\sin(\pi f T)}{2 \pi f (1 - T^2 f^2)}, \tag{E4}
\end{align}

and we find

\begin{align}
\Xi(n; \alpha = 0) &= \frac{T \sin(\pi n)}{\pi n} \tag{E5} \\
&= \begin{cases} T & n = 0 \\ 0 & \text{else} \end{cases} \tag{E6} \\
\Xi(n; \alpha = 1) &= \frac{3 T \sin(\pi n)}{2 \pi n (n^2 - 1)(n^2 - 4)} \tag{E7} \\
&= \begin{cases} \frac{3T}{8} & n = 0 \\ \frac{T}{4} & |n| = 1 \\ \frac{T}{16} & |n| = 2 \\ 0 & \text{else} \end{cases}. \tag{E8}
\end{align}

This trivially shows us that for a rectangular window $\Delta_i(\alpha = 0) = 0$ and $\Delta_i(\alpha = 1) = 2/3$ in the continuum limit. For finite duration window functions, the rectangular window still gives $\Delta_i(\alpha = 0) = 0$ and the Hann window still has a maximum for $|n| = 1$. We note that an equivalent result is shown in [2, 60] in the discussion of asymptotic independence.

The generic Tukey window does not have an analytic Fourier transform, however, numerical experiments confirm that for all other values of the $\alpha$ parameter, neighboring frequency bins are not independent and $\Delta_i$ is a monotonically increasing function of $\alpha$. In Figure 14, we show $\Delta_i$ as a function of $\alpha$.

FIG. 11. PSD matrix (top left), regularized PSD matrix (top right), SVD eigenmatrix (bottom left), and regularized inverse PSD matrix (bottom right) for the LIGO Hanford observatory at the time of GW170814.

FIG. 12. PSD matrix (top left), regularized PSD matrix (top right), SVD eigenmatrix (bottom left), and regularized inverse PSD matrix (bottom right) for the LIGO Livingston observatory at the time of GW170814.
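The window-overlap statistic of Appendix E can be checked numerically with a short sketch. It relies on the correlation theorem: the overlap integral in Equation E2 is the Fourier transform of $w(t)^2$ evaluated at $n/T$, so on a discrete grid $\Xi(n)$ is bin $n$ of the DFT of the squared window. The helper names `periodic_tukey` and `fractional_contamination`, the periodic sampling convention, and the use of $|\Xi(n)|$ in the numerator are assumptions made for this example.

```python
import numpy as np


def periodic_tukey(n, alpha):
    """Tukey window on n samples; alpha=0 is rectangular, alpha=1 is a periodic Hann."""
    t = np.arange(n) / n
    window = np.ones(n)
    if alpha > 0:
        rise = t < alpha / 2
        fall = t >= 1 - alpha / 2
        window[rise] = 0.5 * (1 - np.cos(2 * np.pi * t[rise] / alpha))
        window[fall] = 0.5 * (1 - np.cos(2 * np.pi * (1 - t[fall]) / alpha))
    return window


def fractional_contamination(window):
    """max over n != 0 of |Xi(n)| / Xi(0), with Xi(n) the DFT of the squared window."""
    xi = np.abs(np.fft.fft(window**2))
    return xi[1:].max() / xi[0]


for alpha in (0.0, 0.1, 0.5, 1.0):
    delta = fractional_contamination(periodic_tukey(4096, alpha))
    print(f"alpha = {alpha:.1f}: Delta = {delta:.4f}")
```

With this sampling convention the two analytic end points are recovered to numerical precision: the rectangular window gives $\Delta_i = 0$ and the Hann window gives $\Delta_i = 2/3$, with intermediate values of $\alpha$ falling in between.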

FIG. 13. PSD matrix (top left), regularized PSD matrix (top right), SVD eigenmatrix (bottom left), and regularized inverse PSD matrix (bottom right) for the Virgo observatory at the time of GW170814.

FIG. 14. Fractional contamination for white noise (Equation 20) as a function of Tukey $\alpha$ parameter. We note that the bias is a monotonic function of $\alpha$ ranging from $\Delta_i = 0$ for $\alpha = 0$ to $\Delta_i \approx 2/3$ for $\alpha = 1$. [Axes: $\Delta_i(\alpha)$ versus $\alpha$.]

[1] Fredric J. Harris, “On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform,” IEEE Proceedings 66, 51–83 (1978).
[2] D. R. Brillinger, Time Series: Data Analysis and Theory (SIAM, 2001).
[3] B. P. Abbott et al., “A guide to LIGO-Virgo detector noise and extraction of transient gravitational-wave signals,” Classical and Quantum Gravity 37, 055002 (2020), arXiv:1908.11170 [gr-qc].
[4] Rainer Dahlhaus, “Small sample effects in time series analysis: A new asymptotic theory and a new estimate,” The Annals of Statistics 16, 808–841 (1988).
[5] Nidhan Choudhuri, Subhashis Ghosal, and Anindya Roy, “Contiguity of the Whittle measure for a Gaussian time series,” Biometrika 91, 211–218 (2004), https://academic.oup.com/biomet/article-pdf/91/1/211/582804/910211.pdf.
[6] Christian Röver, “Student-t based filter for robust signal detection,” Phys. Rev. D 84, 122004 (2011), arXiv:1109.0442.
[7] Adam M. Sykulski, Sofia C. Olhede, Arthur P. Guillaumin, Jonathan M. Lilly, and Jeffrey J. Early, “The De-Biased Whittle Likelihood,” (2016), arXiv:1605.06718 [stat.ME].
[8] Claudia Kirch, Matthew C. Edwards, Alexander Meier, and Renate Meyer, “Beyond Whittle: Nonparametric correction of a parametric likelihood with a focus on Bayesian time series analysis,” (2017), arXiv:1701.04846 [stat.ME].
[9] Colm Talbot and Eric Thrane, “Gravitational-wave astronomy with an uncertain noise power spectral density,” Physical Review Research 2, 043298 (2020), arXiv:2006.05292 [astro-ph.IM].
[10] Suhasini Subba Rao and Junho Yang, “Reconciling the Gaussian and Whittle Likelihood with an application to estimation in the frequency domain,” arXiv e-prints (2020), arXiv:2001.06966 [math.ST].
[11] “A simple introduction to the KLT (Karhunen-Loève transform),” in Deep Space Flight and Communications: Exploiting the Sun as a Gravitational Lens (Springer Berlin Heidelberg, Berlin, Heidelberg, 2009) pp. 151–179.
[12] B. P. Abbott et al., “Observation of Gravitational Waves from a Binary Black Hole Merger,” Phys. Rev. Lett. 116, 061102 (2016), arXiv:1602.03837 [gr-qc].
[13] B. P. Abbott et al., “GW170814: A Three-Detector Observation of Gravitational Waves from a Binary Black Hole Coalescence,” Phys. Rev. Lett. 119, 141101 (2017), arXiv:1709.09660 [gr-qc].
[14] R. Abbott et al., “GW190521: A Binary Black Hole Merger with a Total Mass of 150 $M_\odot$,” Phys. Rev. Lett. 125, 101102 (2020), arXiv:2009.01075 [gr-qc].
[15] J. Aasi et al., “Advanced LIGO,” Classical and Quantum Gravity 32, 074001 (2015), arXiv:1411.4547 [gr-qc].
[16] F. Acernese et al., “Advanced Virgo: a second-generation interferometric detector,” Classical and Quantum Gravity 32, 024001 (2015), arXiv:1408.3978 [gr-qc].
[17] We note that here we use two-sided discrete Fourier transforms rather than the one-sided version widely used in gravitational-wave data analysis.
[18] J. Aasi et al., “Improved Upper Limits on the Stochastic Gravitational-Wave Background from 2009-2010 LIGO and Virgo Data,” Phys. Rev. Lett. 113, 231101 (2014), arXiv:1406.4556 [gr-qc].
[19] P. B. Covas, A. Effler, E. Goetz, P. M. Meyers, A. Neunzert, M. Oliver, B. L. Pearlstone, V. J. Roma, R. M. S. Schofield, V. B. Adya, et al., “Identification and mitigation of narrow spectral artifacts that degrade searches for persistent gravitational waves in the first two observing runs of Advanced LIGO,” Phys. Rev. D 97, 082002 (2018), arXiv:1801.07204 [astro-ph.IM].
[20] The covariance matrix is non-invertible as the window application is not reversible. There is no way to recover the value of the time series for points at which the data have been zeroed, e.g., the end points.
[21] Eric Thrane and Colm Talbot, “An introduction to Bayesian inference in gravitational-wave astronomy: Parameter estimation, model selection, and hierarchical models,” Pub. Astron. Soc. Aust. 36, e010 (2019), arXiv:1809.02293 [astro-ph.IM].
[22] Rory Smith and Eric Thrane, “Optimal Search for an Astrophysical Gravitational-Wave Background,” Physical Review X 8, 021019 (2018), arXiv:1712.00688 [gr-qc].
[23] Duncan Macleod, Alex L. Urban, Scott Coughlin, Thomas Massinger, Matt Pitkin, rngeorge, paulaltin, Joseph Areeda, Leo Singer, Eric Quintero, Katrin Leinweber, and The Gitter Badger, “gwpy/gwpy: 2.0.4,” (2021).
[24] Geraint Pratten, Sascha Husa, Cecilio Garcia-Quiros, Marta Colleoni, Antoni Ramos-Buades, Hector Estelles, and Rafel Jaume, “Setting the cornerstone for a family of models for gravitational waves from compact binaries: The dominant harmonic for nonprecessing quasicircular black holes,” Phys. Rev. D 102, 064001 (2020), arXiv:2001.11412 [gr-qc].
[25] Cecilio García-Quirós, Marta Colleoni, Sascha Husa, Héctor Estellés, Geraint Pratten, Antoni Ramos-Buades, Maite Mateu-Lucena, and Rafel Jaume, “Multimode frequency-domain model for the gravitational wave signal from nonprecessing black-hole binaries,” Phys. Rev. D 102, 064002 (2020), arXiv:2001.10914 [gr-qc].
[26] Geraint Pratten, Cecilio García-Quirós, Marta Colleoni, Antoni Ramos-Buades, Héctor Estellés, Maite Mateu-Lucena, Rafel Jaume, Maria Haney, David Keitel, Jonathan E. Thompson, and Sascha Husa, “Computationally efficient models for the dominant and subdominant harmonic modes of precessing binary black holes,” Phys. Rev. D 103, 104056 (2021), arXiv:2004.06503 [gr-qc].
[27] Ethan Payne, Colm Talbot, Paul D. Lasky, Eric Thrane, and Jeffrey S. Kissel, “Gravitational-wave astronomy with a physical calibration model,” Phys. Rev. D 102, 122004 (2020), arXiv:2009.10193 [astro-ph.IM].
[28] Salvatore Vitale, Carl-Johan Haster, Ling Sun, Ben Farr, Evan Goetz, Jeff Kissel, and Craig Cahillane, “Physical approach to the marginalization of LIGO calibration uncertainties,” Phys. Rev. D 103, 063016 (2021), arXiv:2009.10192 [gr-qc].
[29] Sylvia Biscoveanu, Carl-Johan Haster, Salvatore Vitale, and Jonathan Davies, “Quantifying the effect of power spectral density uncertainty on gravitational-wave parameter estimation for compact binary sources,” Phys. Rev. D 102, 023008 (2020), arXiv:2004.05149 [astro-ph.HE].
[30] Katerina Chatziioannou, Carl-Johan Haster, Tyson B. Littenberg, Will M. Farr, Sudarshan Ghonge, Margaret Millhouse, James A. Clark, and Neil Cornish, “Noise spectral estimation methods and their impact on gravitational wave measurement of compact binary mergers,” Phys. Rev. D 100, 104004 (2019), arXiv:1907.06540 [gr-qc].
[31] Michael Pürrer and Carl-Johan Haster, “Gravitational waveform accuracy requirements for future ground-based detectors,” Physical Review Research 2, 023151 (2020), arXiv:1912.10055 [gr-qc].
[32] Christopher J. Moore, Eliot Finch, Riccardo Buscicchio, and Davide Gerosa, “Testing general relativity with gravitational-wave catalogs: The insidious nature of waveform systematics,” iScience 24, 102577 (2021), arXiv:2103.16486 [gr-qc].
[33] Sebastian M. Gaebel, John Veitch, Thomas Dent, and Will M. Farr, “Digging the population of compact binary mergers out of the noise,” Mon. Not. R. Ast. Soc. 484, 4008–4023 (2019), arXiv:1809.03815 [astro-ph.IM].
[34] Rory J. E. Smith, Colm Talbot, Francisco Hernandez Vivanco, and Eric Thrane, “Inferring the population properties of binary black holes from unresolved gravitational waves,” Mon. Not. R. Ast. Soc. 496, 3281–3290 (2020), arXiv:2004.09700 [astro-ph.HE].
[35] Sharan Banagiri, Vuk Mandic, Claudia Scarlata, and Kate Z. Yang, “Measuring angular N-point correlations of binary black hole merger gravitational-wave events with hierarchical Bayesian inference,” Phys. Rev. D 102, 063007 (2020), arXiv:2006.00633 [astro-ph.CO].
[36] Paul D. Lasky, Eric Thrane, Yuri Levin, Jonathan Blackman, and Yanbei Chen, “Detecting Gravitational-Wave Memory with LIGO: Implications of GW150914,” Phys. Rev. Lett. 117, 061102 (2016), arXiv:1605.01415 [astro-ph.HE].
[37] We emphasize that this method trivially extends to the more generic case of an unknown population of signals, however, we make this assumption to isolate the impact of the covariance matrix.
[38] The masses presented here are “detector-frame” quantities which are larger than the source-frame masses due to cosmological redshift.
[39] Gregory Ashton and Sebastian Khan, “Multiwaveform inference of gravitational waves,” Phys. Rev. D 101, 064037 (2020), arXiv:1910.09138 [gr-qc].
[40] A. Z. Jan, A. B. Yelikar, J. Lange, and R. O’Shaughnessy, “Assessing and marginalizing over compact binary coalescence waveform systematics with RIFT,” Phys. Rev. D 102, 124069 (2020), arXiv:2011.03571 [gr-qc].
[41] Héctor Estellés, Sascha Husa, Marta Colleoni, Maite Mateu-Lucena, Maria de Lluc Planas, Cecilio García-Quirós, David Keitel, Antoni Ramos-Buades, Ajit Kumar Mehta, Alessandra Buonanno, and Serguei Ossokine, “A detailed analysis of GW190521 with phenomenological waveform models,” arXiv e-prints (2021), arXiv:2105.06360 [gr-qc].
[42] Gregory Ashton, Eric Thrane, and R. J. E. Smith, “Gravitational wave detection without boot straps: a Bayesian approach,” Phys. Rev. D 100, 123018 (2019).
[43] Tyson B. Littenberg and Neil J. Cornish, “Bayesian inference for spectral estimation of gravitational wave detector noise,” Phys. Rev. D 91, 084034 (2015), arXiv:1410.3852 [gr-qc].
[44] Sharan Banagiri, Michael W. Coughlin, James Clark, Paul D. Lasky, M. A. Bizouard, Colm Talbot, Eric Thrane, and Vuk Mandic, “Constraining the gravitational-wave afterglow from a binary neutron star coalescence,” Mon. Not. R. Ast. Soc. 492, 4945–4951 (2020), arXiv:1909.01934 [astro-ph.IM].
[45] V. Tiwari, M. Drago, V. Frolov, S. Klimenko, G. Mitselmakher, V. Necula, G. Prodi, V. Re, F. Salemi, G. Vedovato, and I. Yakushin, “Regression of environmental noise in LIGO data,” Classical and Quantum Gravity 32, 165014 (2015), arXiv:1503.07476 [gr-qc].
[46] Chris Pankow, Katerina Chatziioannou, Eve A. Chase, Tyson B. Littenberg, Matthew Evans, Jessica McIver, Neil J. Cornish, Carl-Johan Haster, Jonah Kanner, Vivien Raymond, Salvatore Vitale, and Aaron Zimmerman, “Mitigation of the instrumental noise transient in gravitational-wave data surrounding GW170817,” Phys. Rev. D 98, 084016 (2018), arXiv:1808.03619 [gr-qc].
[47] Barak Zackay, Tejaswi Venumadhav, Javier Roulet, Liang Dai, and Matias Zaldarriaga, “Detecting Gravitational Waves in Data with Non-Gaussian Noise,” (2019), arXiv:1908.05644 [astro-ph.IM].
[48] Neil J. Cornish, “Time-frequency analysis of gravitational wave data,” Phys. Rev. D 102, 124038 (2020), arXiv:2009.00043 [gr-qc].
[49] Rich Ormiston, Tri Nguyen, Michael Coughlin, Rana X. Adhikari, and Erik Katsavounidis, “Noise reduction in gravitational-wave data via deep learning,” Physical Review Research 2, 033066 (2020), arXiv:2005.06534 [astro-ph.IM].
[50] Katerina Chatziioannou, Neil Cornish, Marcella Wijngaarden, and Tyson B. Littenberg, “Modeling compact binary signals and instrumental glitches in gravitational wave data,” Phys. Rev. D 103, 044013 (2021), arXiv:2101.01200 [gr-qc].
[51] O. Edy, A. Lundgren, and L. K. Nuttall, “The Issues of Mismodelling Gravitational-Wave Data for Parameter Estimation,” (2021), arXiv:2101.07743 [astro-ph.IM].
[52] Kentaro Mogushi, “Reduction of transient noise artifacts in gravitational-wave data using deep learning,” arXiv e-prints (2021), arXiv:2105.10522 [gr-qc].
[53] Joseph D. Romano and Neil J. Cornish, “Detection methods for stochastic gravitational-wave backgrounds: a unified treatment,” Living Reviews in Relativity 20, 2 (2017), arXiv:1608.06889 [gr-qc].
[54] Magdalena Sieniawska and Michał Bejger, “Continuous Gravitational Waves from Neutron Stars: Current Status and Prospects,” Universe 5, 217 (2019), arXiv:1909.12600 [astro-ph.HE].
[55] B. P. Abbott et al., “GW170817: Observation of Gravitational Waves from a Binary Neutron Star Inspiral,” Phys. Rev. Lett. 119, 161101 (2017), arXiv:1710.05832 [gr-qc].
[56] R. Abbott et al., “Upper limits on the isotropic gravitational-wave background from Advanced LIGO and Advanced Virgo’s third observing run,” Phys. Rev. D 104, 022004 (2021), arXiv:2101.12130 [gr-qc].
[57] J. Zweizig and K. Riles, “Information on self-gating of h(t) used in O3 continuous-wave and stochastic searches,” (2020).
[58] Michele Vallisneri, Jonah Kanner, Roy Williams, Alan Weinstein, and Branson Stephens, “The LIGO Open Science Center,” J. Phys. Conf. Ser. 610, 012021 (2015), arXiv:1410.4839.
[59] The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, et al., “Open data from the first and second observing runs of Advanced LIGO and Advanced Virgo,” (2019), arXiv:1912.11716 [gr-qc].
[60] S. N. Lahiri, “A necessary and sufficient condition for asymptotic independence of discrete Fourier transforms under short- and long-range dependence,” The Annals of Statistics 31, 613–641 (2003).