arXiv:2106.13785v1 [astro-ph.IM] 25 Jun 2021
Inference with finite time series: Observing the gravitational Universe through windows

Colm Talbot,1,2,3,a Eric Thrane,4,5 Sylvia Biscoveanu,2 and Rory Smith4,5

1 LIGO Laboratory, California Institute of Technology, Pasadena, CA 91125, USA
2 LIGO Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
3 Kavli Institute for Astrophysics and Space Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
4 School of Physics and Astronomy, Monash University, VIC 3800, Australia
5 OzGrav: The ARC Centre of Excellence for Gravitational-Wave Discovery, Clayton, VIC 3800, Australia
a [email protected]

arXiv:2106.13785v2 [astro-ph.IM] 29 Sep 2021

Time series analysis is ubiquitous in many fields of science including gravitational-wave astronomy, where strain time series are analyzed to infer the nature of gravitational-wave sources, e.g., black holes and neutron stars. It is common in gravitational-wave transient studies to apply a tapered window function to reduce the effects of spectral artifacts from the sharp edges of data segments. We show that the conventional analysis of tapered data fails to take into account covariance between frequency bins, which arises for all finite time series, no matter the choice of window function. We discuss the origin of this covariance and derive a framework that models the correlation induced by the window function. We demonstrate this solution using both simulated Gaussian noise and real Advanced LIGO/Advanced Virgo data. We show that the effect of these correlations is similar in scale to widely studied systematic errors, e.g., uncertainty in detector calibration and power spectral density estimation.

I. INTRODUCTION

Time-series analysis underpins recent advances in gravitational-wave astronomy. The vast majority of gravitational-wave data analysis relies on windowing, a procedure that multiplies the time-domain data segment by a window function that tapers off at the beginning and end of the segment. Analysts apply tapered windows to mitigate two effects: (1) spectral artifacts arising from the Fourier transform of the data segment edges (Gibbs phenomena) and (2) correlations between neighboring frequency bins. While correlations between neighboring frequency bins can be reduced, they are never eliminated.

Choosing a suitable window requires balancing various considerations including the spectral leakage from instrumental lines and the low-frequency "seismic wall," effectiveness in mitigating the Gibbs phenomenon, and the loss of signal. For a systematic study of the properties of different windows, we refer the reader to, e.g., [1, 2] for theoretical introductions, and [3] for a specific discussion in the context of gravitational-wave data analysis. Once the data are windowed, they are typically analyzed in the frequency domain where the noise is described by a power spectral density (PSD), and it is assumed that each frequency bin is statistically independent. However, this assumption is not true for a finite stretch of a longer noise process.

The assumptions underpinning the Whittle approximation have been thoroughly studied and many refinements have been proposed (e.g., [4-10]). In this paper, we derive from first principles a formalism which accounts for the correlations between neighboring frequencies introduced by the window function applied to obtain finite time series. We show how correlations between frequency bins arise from the fact that quasi-stationary Gaussian noise processes are fundamentally described in the frequency domain by continuous functions, which imply infinite-duration time series. We derive a simple expression for the "finite-duration" covariance matrix, which encodes the correlations naturally present in all finite time series, and identify our result as a specific basis for a Karhunen-Loève transformation (KLT) (see, e.g., [11]). We show that there are practical applications where the current conventional approach of windowing data incurs systematic errors, which, though small, produce invalid inferences when data are combined in large sets or when we analyze gravitational-wave events with high signal-to-noise ratio (SNR).

The remainder of this paper is organized as follows. In Section II, we present the formalism underlying our framework. We derive the finite-duration covariance matrix for the analysis of finite time series. In Section III, we perform a demonstration, applying our method to the binary black hole merger events GW150914 [12], GW170814 [13], and GW190521 [14], and contrast with results neglecting covariances. We show how the current conventional windowing procedure can lead to faulty inferences when many gravitational-wave measurements are combined. While our demonstration uses data from gravitational-wave astronomy, the framework we put forward is broadly applicable to all time-domain analysis. We show that the problem is fixed by using the finite-duration covariance matrix. We provide closing thoughts in Section IV.

II. FORMALISM

In this section, we derive a likelihood that enables us to analyze time-series data characterized by stationary, Gaussian noise in a way that correctly takes into account covariance between neighboring frequency bins that arises for all finite time series.

A. Basic notation

We consider time-series data $d(t)$ consisting of signal $s(t)$ and noise $n(t)$

\begin{equation}
d(t) = s(t) + n(t). \tag{1}
\end{equation}

In gravitational-wave observatories like LIGO [15] and Virgo [16], $n(t)$ is a time series of dimensionless strain (change in length per unit length). Transient gravitational-wave signals from merging binaries are characterized by comparing the data to gravitational waveform templates $h(t)$.

B. Continuous, infinite-duration noise

To start, we focus on noise in the absence of signals. The noise can be expressed in the frequency domain as

\begin{equation}
\tilde{n}(f) = \int_{-\infty}^{\infty} dt \, e^{-2\pi i f t} n(t). \tag{2}
\end{equation}

The noise is best described as continuous because it is defined for an arbitrary choice of frequency: with a sufficiently long measurement, it is possible in principle to achieve sufficient resolution to measure $\tilde{n}(f)$ for any value of $f$. If we assume the noise is Gaussian, the likelihood of observing a specific noise realization is characterized by a covariance matrix

\begin{equation}
\mathcal{C}_{\mu\nu} = \frac{1}{2} \left\langle \tilde{n}(f_\mu) \, \tilde{n}^*(f_\nu) \right\rangle, \tag{3}
\end{equation}

the diagonal of which is equal to the noise PSD

\begin{equation}
S_\mu = \mathrm{diag}\left(\mathcal{C}_{\mu\nu}\right). \tag{4}
\end{equation}

We refer to $\mathcal{C}_{\mu\nu}$ as the "infinite-duration" covariance matrix. It is defined continuously for arbitrary values of $f_\mu$ and $f_\nu$ and, in the time domain, it is defined for all times: $(-\infty, +\infty)$. Throughout, repeated indices are summed over unless otherwise specified. In the next subsection, we contrast $\mathcal{C}_{\mu\nu}$ (calligraphic script and Greek indices) with the finite-duration covariance matrix, denoted $C_{ij}$ (no calligraphic script and Roman indices), which is defined only for discrete frequency bins $f_i$ and $f_j$ (or equivalently, for a finite duration). If we further assume that the noise is stationary (the PSD does not vary in time), $\mathcal{C}_{\mu\nu}$ is diagonal.

We note that, even if the true covariance matrix is diagonal, a naive empirical estimate of this quantity necessarily has some statistical uncertainty and will not generically be diagonal, even for a stationary Gaussian process. We emphasize that in this section we seek to derive expressions for the true noise covariance matrix assuming $\mathcal{C}_{\mu\nu}$ is known and neglect the impact of empirical estimates. In Appendix B, we describe how we estimate $\mathcal{C}_{\mu\nu}$ in practice.

C. Non-continuous, finite-duration noise

In practice, we only consider finite stretches of data. In this subsection, we derive the properties of finite stretches of continuous noise. Let us consider data measured with sampling rate $f_s$ over data segment duration $T$. There are

\begin{equation}
N = f_s T \tag{5}
\end{equation}

independent measurements. We assume that the noise has no content above half the sampling rate and so we can probe every frequency without aliasing. In practice, applying an aggressive low-pass filter removes this high-frequency content.

These data can be represented either in the time domain as a real $N$-component time series with spacing $1/f_s$ or in the frequency domain as a complex frequency series $\tilde{d}_i$ with $-f_s/2 \leq f \leq f_s/2$ and spacing $1/T$, where the endpoints and zero-frequency component are required to be real. These two domains are related via the discrete Fourier transform [17]

\begin{equation}
\tilde{d}'_k = \frac{1}{f_s} \sum_{j=0}^{N-1} d'_j \, e^{-2\pi i jk/N}. \tag{6}
\end{equation}

The frequency-domain covariance matrix for finite-duration, non-continuous noise is:

\begin{equation}
C_{ij} = \frac{1}{2} \left\langle \tilde{d}'_i \, \tilde{d}'^{*}_j \right\rangle. \tag{7}
\end{equation}

Here, the angled brackets denote ensemble averages over noise realizations. The widely used Whittle approximation assumes that data at each of the analyzed frequencies are independent, i.e., $C_{ij}$ is a diagonal matrix. This is generally a good approximation. However, as we show in this paper, the assumption of independence is not strictly valid when analyzing a finite stretch of data, especially when using a tapered window.

We begin by defining our window function $w$, which
describes how we measure some segment of noise from what is, in theory, an infinite-duration noise process:

\begin{align}
\tilde{d}'_k &= \frac{1}{f_s} \sum_{j=-\infty}^{\infty} d_j w_j \, e^{-2\pi i jk/N} \tag{8} \\
&= \frac{1}{f_s} \sum_{j=0}^{N-1} d_j w_j \, e^{-2\pi i jk/N} \tag{9} \\
&= (\tilde{d} * \tilde{w})_k. \tag{10}
\end{align}

Here, $w_j$ is a time-domain window function and the frequency-domain noise is now the convolution of the original frequency-domain noise with the Fourier transformed window function. The prime denotes quantities associated with the windowed data.

We stress that this window function is always present in gravitational-wave data analysis problems and is defined for all times, not just the analysis segment.

[Figure, panel (a): $\log_{10} |C_{ij}|$ as a function of frequency (Hz) on both axes, 480 to 520 Hz.]
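As a rough numerical illustration of Eqs. (7) and (9), the sketch below builds the finite-duration covariance matrix directly from a time-domain noise covariance and a window. It is not the procedure used in the paper: the function names, the 64-sample segment, the periodic Hann taper, and the white and AR(1) noise models are all assumptions chosen only to keep the example small. Because the windowed transform is linear in the data, the ensemble average in Eq. (7) can be evaluated exactly as a matrix product rather than by averaging over simulated noise realizations.

```python
import numpy as np


def finite_duration_covariance(time_cov, window, fs):
    """Return C_ij = 1/2 <d'_i d'_j^*> for windowed data (Eqs. 7 and 9).

    time_cov is the N x N time-domain covariance <d_j d_l> of the noise.
    """
    n = len(window)
    idx = np.arange(n)
    # DFT matrix with the 1/fs normalisation: F[k, j] = exp(-2 pi i j k / N) / fs
    dft = np.exp(-2j * np.pi * np.outer(idx, idx) / n) / fs
    # Windowing scales sample j by w_j, so the windowed transform is F diag(w)
    transform = dft * window[np.newaxis, :]
    return 0.5 * transform @ time_cov @ transform.conj().T


def correlation(cov):
    """Normalise a covariance matrix so that its diagonal is one."""
    sigma = np.sqrt(np.real(np.diag(cov)))
    return cov / np.outer(sigma, sigma)


n, fs = 64, 64.0  # hypothetical segment: 64 samples at 64 Hz (T = 1 s)
samples = np.arange(n)

rectangular = np.ones(n)                            # no taper
hann = 0.5 * (1 - np.cos(2 * np.pi * samples / n))  # periodic Hann taper

white = np.eye(n)  # assumed unit-variance white noise: <d_j d_l> = delta_jl
lags = np.abs(samples[:, None] - samples[None, :])
ar1 = 0.9 ** lags  # assumed coloured noise: AR(1) autocovariance

rho_white_rect = correlation(finite_duration_covariance(white, rectangular, fs))
rho_white_hann = correlation(finite_duration_covariance(white, hann, fs))
rho_ar1_rect = correlation(finite_duration_covariance(ar1, rectangular, fs))

# Correlation between two neighbouring frequency bins in each case
print("white, rectangular:", abs(rho_white_rect[10, 11]))
print("white, Hann:       ", rho_white_hann[10, 11].real)
print("AR(1), rectangular:", abs(rho_ar1_rect[10, 11]))
```

In this toy setup, white noise with a rectangular window is a special case in which neighbouring bins are uncorrelated to numerical precision, while the Hann taper gives a neighbouring-bin correlation of -2/3. The AR(1) case shows a non-zero correlation even without a taper, in line with the statement that the covariance comes from the finite duration of the segment and not only from the choice of taper.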