A&A 496, 577–584 (2009) Astronomy DOI: 10.1051/0004-6361:200811296 & c ESO 2009 Astrophysics

The generalised Lomb-Scargle periodogram A new formalism for the floating- and Keplerian periodograms M. Zechmeister and M. Kürster

Max-Planck-Institut für Astronomie, Königstuhl 17, 69117 Heidelberg, Germany e-mail: [email protected] Received 5 November 2008 / Accepted 19 December 2008

ABSTRACT

The Lomb-Scargle periodogram is a common tool in the frequency analysis of unequally spaced data equivalent to least-squares fitting of sine waves. We give an analytic solution for the generalisation to a full sine wave fit, including an offset and weights (χ2 fitting). Compared to the Lomb-Scargle periodogram, the generalisation is superior as it provides more accurate frequencies, is less susceptible to aliasing, and gives a much better determination of the spectral intensity. Only a few modifications are required for the computation and the computational effort is similar. Our approach brings together several related methods that can be found in the literature, viz. the date-compensated discrete Fourier transform, the floating-mean periodogram, and the “spectral significance” estimator used in the SigSpec program, for which we point out some equivalences. Furthermore, we present an algorithm that implements this generalisation for the evaluation of the Keplerian periodogram that searches for the period of the best-fitting Keplerian orbit to radial velocity data. The systematic and non-random algorithm is capable of detecting eccentric orbits, which is demonstrated by two examples and can be a useful tool in searches for the orbital periods of exoplanets. Key words. methods: data analysis – methods: analytical – methods: statistical – techniques: radial velocities

1. Introduction the mean of the data was subtracted, which assumes that the mean of the data and the mean of the fitted sine function are the The Lomb-Scargle periodogram (Scargle 1982) is a widely used same. One can overcome this assumption with the introduction tool in period searches and frequency analysis of . It is ff y = ω + ω of an o set c, resulting in a further generalisation of this peri- equivalent to fitting sine waves of the form a cos t b sin t. odogram to the equivalent of weighted full sine wave fitting; i.e., While standard fitting procedures require the solution of a set of y = a cos ωt + b sin ωt + c. Cumming et al. (1999), who called linear equations for each sampled frequency, the Lomb-Scargle this generalisation “floating-mean periodogram”, argue that this method provides an analytic solution and is therefore both con- ffi approach is superior: “... the Lomb-Scargle periodogram fails to venient to use and e cient. The equation for the periodogram account for statistical fluctuations in the mean of a sampled si- was given by Barning (1963), and also Lomb (1976)andScargle nusoid, making it non-robust when the number of observations (1982), who furthermore investigated its statistical behaviour, is small, the sampling is uneven, or for periods comparable to especially the statistical significance of the detection of a signal. y y = or greater than the duration of the observations”. These authors For a time series (ti, i) with zero mean ( 0), the Lomb- provided a formal definition and also a sophisticated statistical Scargle periodogram is defined as (normalisation from Lomb treatment, but do not use an analytical solution for the computa- 1976): tion of this periodogram. ⎡ ⎤ ⎢ 2 2 ⎥ Basically, analytical formulae for a full sine, least-squares 1 ⎢ YCˆ τ YSˆ τ ⎥ pˆ(ω) = ⎣⎢ ˆ + ˆ ⎦⎥ (1) spectrum have already been given by Ferraz-Mello (1981), call- YYˆ CCˆ τˆ SSˆ τˆ ing this date-compensated discrete Fourier transform (DCDFT). ⎧     ⎫ ⎪ 2 2 ⎪ We prefer to adopt a notation closely related to the Lomb- 1 ⎨ i yi cos ω(ti − τˆ) i yi sin ω(ti − τˆ) ⎬ =  ⎪  +  ⎪(2) Scargle periodogram calling it the generalised Lomb-Scargle pe- y2 ⎩ 2 ω − τ 2 ω − τ ⎭ i i i cos (ti ˆ) i sin (ti ˆ) riodogram (GLS). Shrager (2001) tries for such an approach but did not generalise the parameterτ ˆ in Eq. (3). Moreover, our gen- where the hats are used in this paper to symbolise the classical eralised equations, which are derived in the following (Sect. 2), expressions. The parameterτ ˆ is calculated via have a comparable symmetry to the classical ones and also allow  us to point out equivalences to the “spectral significance” estima- sin 2ωt tan 2ωτˆ = i i · (3) tor used in the SigSpec program by Reegen (2007) (Sect. 4). i cos 2ωti However, there are two shortcomings. First, the Lomb- 2. The generalised Lomb-Scargle periodogram Scargle periodogram does not take the measurement errors into (GLS) account. This was solved by introducing weighted sums by Gilliland & Baliunas (1987)andIrwin et al. (1989) (equiva- The analytic solution for the generalised Lomb-Scargle peri- lent to the generalisation to a χ2 fit). Second, for the analysis odogram can be obtained in a straightforward manner in the

Article published by EDP Sciences 578 M. Zechmeister and M. Kürster: The generalised Lomb-Scargle periodogram same way as outlined in Lomb (1976). Let yi be the N mea- So the sums YC and YS use the weighted mean subtracted data surements of a time series at time ti and with errors σi. Fitting a and are calculated in the same way as for the Lomb-Scargle pe- full sine function (i.e. including an offset c): riodogram (but with weights). ω y(t) = a cos ωt + b sin ωt + c The generalised Lomb-Scargle periodogram p( )inEq.(4) is normalised to unity and therefore in the range of 0 ≤ p ≤ 1, 2π at given frequency ω (or period P = ω ) to minimise with p = 0 indicating no improvement of the fit and p = 1a 2 2 the squared difference between the data yi and the model func- “perfect” fit (100% reduction of χ or χ = 0). tion y(t): As the full sine fit is time-translation invariant, there is also the possibility to introduce an arbitrary time reference N y − y 2   2 [ i (ti)] 2 τ t → t − τ CC = w 2 ω t − τ − χ = = W wi[yi − y(ti)] point ( i i ;now,e.g. i cos ( i ) σ2 w ω − τ 2 ff χ2 i=1 i ( i cos (ti )) ), which will not a ect the of the fit. If this parameter τ is chosen as where   1 1   w = = 1 w = ωτ = 2CS i W σ2 i 1 tan 2 σ2 i − W i CC SS   w sin 2ωt −2 w cos ωt w sin ωt are the normalised weights1. Minimisation leads to a system of =  i i  i i  i i  (19) 2 2 (three) linear equations whose solution is derived in detail in wi cos 2ωti − ( wi cos ωti) −( wi sin ωti) Appendix A.1. Furthermore, it is shown in Appendix A.1 that  2 = w ω − the relative χ -reduction p(ω) as a function of frequency ω and the interaction term in Eq. (5) disappears, CS τ i cos (ti χ2 χ2 τ)sinω(ti − τ) − wi cos ω(ti − τ) wi sin ω(ti − τ) = 0 (proof normalised to unity by 0 (the for the weighted mean) can be written as: in Appendix A.2) and in this case we append the index τ to the τ ω χ2 − χ2 ω time dependent sums. The parameter ( ) is determined by the 0 ( ) σ ω p(ω) = (4) times ti and the measurement errors i for each frequency .So χ2 τ 0 when using as defined in Eq. (19) the periodogram in Eq. (5) 1   becomes p(ω) = SS · YC2 + CC · YS 2 − 2CS · YC · YS (5)   YY · D YC2 YS 2 ω = 1 τ + τ · with: p( ) (20) YY CCτ SSτ D(ω) = CC · SS − CS 2 (6) Note that Eq. (20) has the same form as the Lomb-Scargle pe- and the following abbreviations for the sums: ff  riodogram in Eq. (1) with the di erence that the errors can be weighted (weights wi in all sums) and that there is an additional Y = wiyi (7)  second term in CCτ, SSτ, CS τ and tan 2ωτ (Eqs. (13)−(15) C = wi cos ωti (8) and (19), respectively) which accounts for the floating mean.  The computational effort is similar as for the Lomb-Scargle = w ω S i sin ti (9) periodogram. The incorporation of the offset c requires only two ω = w ω  additional sums for each frequency (namely S i sin ti = w ω ff YY = YYˆ − Y · Y YYˆ = w y2 (10) and C i cos ti or S τ and Cτ respectively). The e ort is  i i even weaker when using Eq. (5) with keeping CS instead of us- τ YC(ω) = YCˆ − Y · C YCˆ = wiyi cos ωti (11) ing Eq. (20) with the parameter introduced via Eq. (19)which  needs an extra preceding loop in the algorithm. If the errors are YS(ω) = YSˆ − Y · S YSˆ = wiyi sin ωti (12) taken into account as weights, also the multiplication with w  i must be done. CC(ω) = CCˆ − C · C CCˆ = w cos2 ωt (13)  i i For fast computation of the trigonometric sums the algorithm SS(ω) = SSˆ − S · S SSˆ = w sin2 ωt (14) of Press & Rybicki (1989) can be applied, which has advan-  i i tages in the case of large data sets and/or many frequency steps. 2 CS(ω) = CSˆ − C · S CSˆ = wi cos ωti sin ωti. (15) Another possibility are trigonometric recurrences as described in Press et al. (1992). Note also that the first sum in SS can be Note that sums with hats correspond to the classical sums. expressed by SSˆ = 1 − CCˆ . · ≡ χ2 W YY 0 is simply the weighted sum of squared devia- tions from the weighted mean. The mixed sums can also be writ- ten as a weighted covariance Covx,y = wi xiyi − X · Y/W = 3. Normalisation and False-Alarm probability (FAP) E(x · y) − WE(x)E(y)whereE is the expectation value, e.g. = There were several discussions in the literature on how to nor- YS Covy,sin ωt.  malise the periodogram. For the detailed discussion we refer to With the weighted mean given by y = wiyi = Y Eqs. (10)−(12) can also be written as: the key papers by Scargle (1982), Horne & Baliunas (1986),  Koen (1990)andCumming et al. (1999). The normalisation be- 2 YY = wi(yi − y) (16) comes important for estimations of the false-alarm probability  of a signal by means of an analytic expression. Lomb (1976) YC(ω) = w (y − y)cosωt (17) 2  i i i showed that if data are Gaussian noise, the terms YCˆ /CCˆ and 2 2 YS(ω) = wi(yi − y)sinωti. (18) YSˆ /SSˆ in Eq. (1)areχ -distributed and therefore the sum of

1 2 For clarity the bounds of the summation are suppressed in the fol- E.g. cos ωk+1t = cos(ωk +Δω)t = cos ωkt cos Δωkt − sin ωkt sin Δωt lowing notation. They are always the same (i = 1, 2, ..., N). where Δω is the frequency step. M. Zechmeister and M. Kürster: The generalised Lomb-Scargle periodogram 579

2 both (which is ∝p)isχ -distributed with two degrees of free- Table 1. Probabilities that a periodogram power (Pn, p, P or z) can dom. This holds for the generalisation in Eq. (20) and also be- exceed a given value (Pn,0, p0, P0 or z0) for different normalizations comes clear from the definition of the periodogram in Eq. (4) (from Cumming et al. 1999). χ2−χ2(ω) ω = 0 ff p( ) χ2 where for Gaussian noise the di erence in the 0 Reference level Range Probability χ2−χ2 ω χ2 ν = − − − = numerator 0 ( )is -distributed with (N 1) (N 3) population Pn ∈ [0, ∞) Prob(Pn > Pn,0) = exp(−Pn,0) 2 degrees of freedom. N−3 sample variance p ∈ [0, 1] Prob(p > p0) = (1 − p0) 2 − The p(ω) can be compared with a known noise level pn (ex- N 3 –”– P ∈ [0, N−1 ] Prob(P > P ) = 1 − 2P 2 pected from the a priori known noise variance or population vari- 2 0 N−1 − N−3 2z 2 ance) and the normalisation of p(ω)topn ∈ , ∞ > = + 0 residual variance z [0 ) Prob(z z0) 1 N−3 p(ω) Pn = (21) pn can be considered as a signal to noise ratio (Scargle 1982). Since in period search with the periodogram we study a However, this noise level is often not known. range of frequencies, we are also interested in the significance of Alternatively, the noise level may be estimated for Gaussian one peak compared to the peaks at other frequencies rather than = 2 the significance of a single frequency. The false alarm probabil- noise from Eq. (4)tobepn N−1 which leads to: ity (FAP) for the period search in a frequency range is given by N − 1 P = p(ω) (22) = − − > M 2 FAP 1 [1 Prob(z z0)] (24) and is the analogon to the normalisation in Horne & Baliunas where M is the number of independent frequencies and may be (1986)3. So if the data are noise, P = 1 is the expected value. estimated as the number of peaks in the periodogram. The width  δ ≈ 1 ≈ If the data contains a signal, P 1 is expected at the signal of one peak is f T ( frequency resolution). So in the fre- frequency. However, this power is restricted to 0 ≤ P ≤ N−1 . Δ = − = Δ f 2 quency range f f2 f1 there are approximately M δ f But if the data contains a signal or if errors are under- or peaks and in the case f f one can write M ≈ Tf (Cumming = 2 1 2 2 overestimated or if intrinsic variability is present, then pn N−1 2004). Finally, for low FAP values the following handy approx- may not be a good uncorrelated estimator for the noise level. imation for Eq. (24) is valid: Cumming et al. (1999) suggested to estimate the noise level a posteriori with the residuals of the best fit and normalised the FAP ≈ M · Prob(z > z0)forFAP 1. (25) periodogram as Another possibility to access M and the FAP are N − 3 χ2 − χ2(ω) N − 3 p(ω) Monte Carlo or bootstrap simulations in order to determine how z ω = 0 = ( ) 2 (23) 2 χ 2 1 − pbest often a certain power level (z0) is exceeded just by chance. best Such numerical calculation of the FAP are much more time- where the index “best” denotes the corresponding values of the consuming than the actual computation of the GLS. best fit (pbest = p(ωbest)). Statistical fluctuations or a present signal may lead to a larger periodogram power. To test for the significance of such a peak in 4. Equivalences between the GLS and SigSpec the periodogram the probability is assessed that this power can (Reegen 2007) arise purely from noise. Cumming et al. (1999) clarified that the different normalisations result in different probability functions Reegen (2007) developed a method, called SigSpec, to deter- which are summarised in Table 1. Note that the last two proba- mine the significance of a peak at a given frequency (spectral sig- bility values are the same for the best fit (z = z ): nificance) in a discrete Fourier transformation (DFT) which in- 0 best cludes a zero mean correction. We will recapitulate some points  − − /  − − / from his paper in an adapted and shortened way in order to show 2z (N 3) 2 p (N 3) 2 Prob(z > z ) = 1 + best = 1 + best several equivalences and to disentangle different notations used best N − 3 1 − p   best by us and used by Reegen (2007). For a detailed description −(N−3)/2 we refer to the original paper. Briefly, approaching from Fourier 1 − / = = (1 − p )(N 3) 2 theory Reegen (2007) defined the zero mean corrected Fourier − p best 1 best coefficients4 = > = > . Prob(p pbest) Prob(P Pbest)    1 1 a (ω) = y cos ωt − y · cos ωt Furthermore Baluev (2008) pointed out that the power definition ZM N i i N2 i i    1 1 2 b ω = y ωt − y · ωt N − 2 χ ZM( ) i sin i 2 i sin i Z(ω) = ln 0 N N 3 χ2(ω) 3 These authors called it the normalization with the sample variance 2 σ2 ω χ2 as a nonlinear (logarithmic) scale for χ has an exponential dis- 0. Note that p( ) is already normalized with 0. For the unweighted w = 1 σ2 = N ω = tribution (similiar to Pn) case ( i N , 0 N−1 YY) one can write Eqs. (22) with (20)asP( ) 2 2 1 N YC + YS . ⎛ ⎞ − / 2 σ2 CC SS ⎜χ2 ω ⎟(N 3) 2 0 −Z ⎜ ( )⎟ 4 Here only the unweighted case is discussed (w = 1 ). Reegen (2007) Prob(Z > Z ) = e = ⎝⎜ ⎠⎟ = Prob(p > p ). i N best χ2 best also gives a generalization to weighting. 0 580 M. Zechmeister and M. Kürster: The generalised Lomb-Scargle periodogram which obviously correspond to YC and YS in Eqs. (11)and(12). the equivalence of the GLS and the spectral significance estima- Their are given by tor in SigSpec (and with it to least squares) is evident:   y2   YY · N log e 2 2 1 2 sig(a , b |ω) = · p(ω). a = cos ωt − cos ωt ZM ZM y2 ZM N2 i N i 2 The two indicators differ only in a normalisation factor, which   y2   N−1 2 1 2 becomes log e,when y is estimated with the sample vari- b2 = sin2 ωt − sin ωt . 2 ZM 2 i i y2 = N N N ance i N−1 YY. Therefore, SigSpec gives accurate fre- quencies just like least squares. But note that the Fourier am- The precise value of these variances depends on the temporal y2 ffi 2 plitude is given by the sum of the squared Fourier coe cient sampling. These variances can be expressed as a = CC 2 = 2 + 2 = α2 + β2 ∼ 2 + 2 = 2 + 2 ZM N A aZM bZM 4 4 YC YS YCτ YSτ while y2 2 2 2 = 2 = 2 + 2 = YCτ + YSτ and b SS. the least-squares fitting amplitude is A a b 2 2 ZM N CCτ SSτ Consider now two independent Gaussian variables whose (see 6). cumulative distribution function (CDF) is given by: The comparison with SigSpec offers another point of view.   2 β2 It shows how the GLS is associated to Fourier theory and how it − 1 α + Φ(α, β|ω) = e 2 α2 β2 . can be derived from the DFT (discrete Fourier transform) when demanding certain statistical properties such as simple statistical y A Gaussian distribution of the physical variable i in the time behaviour, time-translation invariance (Scargle 1982)andvary- domain will lead to Gaussian variables aZM and bZM which in ing zero mean. The shown equivalences allow vice versa to apply general are correlated. A rotation of Fourier Space by phase θ    0 several of Reegen’s (2007) conclusions to the GLS, e.g. that it is ω − ω ω less susceptible to aliasing or that the time domain sampling is θ = N sin 2 ti 2 cos ti sin ti tan 2 0 2 2 taken into account in the probability distribution. N cos 2ωti − ( cos ωti) + ( sin ωti) transforms aZM and bZM into uncorrelated coefficients α and β with vanishing covariance. Indeed, ωτ from Eq. (19)andθ0 have 5. Application of the GLS to the Keplerian thesamevalue,butτ is applied in the time domain, while θ0 is periodogram (Keplerian fits to radial velocity applied to the phase θ in the Fourier domain. It is only mentioned data) here that the resulting coefficients 2α and 2β correspond to YCτ and YSτ. The search for the best-fitting sine function is a multidimen- 2 Finally, Reegen (2007) defines as the spectral significance sional χ -minimisation problem with four parameters: period P, sig(α, β|ω):= − log Φ(α, β|ω) and writes: amplitude A, phase ϕ and offset c (or frequency ω = 2π/P, a, ⎡  b and c). At a given frequency ω the best-fitting parameters A, ⎢ 2 N log e ⎢ aZM cos θ0 + bZM sin θ0 ϕ and c can be computed immediately by an analytic solution a , b |ω = ⎢ sig( ZM ZM ) 2 ⎣ y α0 revealing the global optimum for this three dimensional param-   ⎤ 2 eter subspace. But involving the frequency leads to a lot of local a cos θ − b sin θ ⎥ χ2 + ZM 0 ZM 0 ⎦⎥ (26) optima (minima in ) as visualised by the numbers of max- β0 ima in the generalised Lomb-Scargle periodogram. With step- ping through frequency ω one can pick up the global optimum in where the four dimensional parameter space. That is how period search α2 with the periodogram works. Because an analytic solution (im- α := 2N 0 y2 plemented in the GLS) can be employed partially, there is no need for stepping through all parameters to explore the whole    2 parameter space for the global optimum. 2 2 = N cos (ωti − θ0) − cos(ωti − θ0) This concept can be transferred to the Keplerian peri- N2 odogram which can be applied to search stellar radial veloc- ity data for periodic signals caused by orbiting companions and β2 β := 2N measured from spectroscopic Doppler shifts. The radial velocity 0 y2 curve becomes more non-sinusoidal for a more eccentric orbit 5    and its shape depends on six orbital elements : 2 2 = N sin2(ωt − θ ) − sin(ωt − θ ) γ N2 i 0 i 0 constant system radial velocity are called normalised semi-major and semi-minor axes. Note K radial velocity amplitude α2 ∼ β2 ∼  longitude of periastron that 0 2CCτ and 0 2SSτ. Reegen (2007) states that this gives as accurate frequencies e eccentricity as do least squares. However, from this derivation it is not clear T periastron passage that this is equivalent. But when comparing Eqs. (26)to(20) 0 . with using YCτ = YC cos ωτ+YS sin ωτ and YSτ = YS cos ωτ− P period YC sin ωτ:  In comparison to the full sine fit there are two more param- 1 (YC cos ωτ + YS sin ωτ)2 eters to deal with which complicates the period search. An ap- p(ω) = YY CCτ proach to simplify the Keplerian orbit search is to use the GLS  (YS cos ωτ − YC sin ωτ)2 5 K = 2π √a sin i with a the semi-major axis of the stellar orbit and i the + P 1−e2 SSτ inclination. M. Zechmeister and M. Kürster: The generalised Lomb-Scargle periodogram 581 to look for a periodic signal and use it for an initial guess. But algorithm is needed for the computation of the Keplerian pe- choosing the best-fitting sine period does not necessarily lead to riodogram which by definition yields the best fit at fixed fre- the best-fitting Keplerian orbit. So for finding the global opti- quency and no local χ2-minima. O’Toole et al. (2007, 2009)de- mum the whole parameter space must be explored. veloped an algorithm called 2DKLS (two dimensional Kepler The Keplerian periodogram (Cumming 2004), just like Lomb-Scargle) that works on grid of period and eccentricity and the GLS, shows how good a trial period (frequency ω) can seems to be similar to ours. But the possibility to use partly an model the observed radial velocity data and can be defined as analytic solution or the need for stepping T0 is not mentioned by (χ2-reduction): these authors. The effort to compute the Keplerian periodogram with the χ2 − χ2 ω 0 Kep( ) described algorithm is much stronger in comparison to the GLS. p (ω) = · Kep χ2 There are three additional loops: two loops for stepping through 0 e and T0 and one for the iteration to solve Kepler’s equation. Instead of the sine function, the function Contrary to the GLS it is not possible to use recurrences or the fast computation of the trigonometric sums mentioned in Sect. 2. = γ +  + ν +  RV(t) K[e cos cos( (t) )] (27) However we would like to outline some possibilities for tech- nical improvements for a faster search. The first concerns the which describes the radial reflex motion of a star due to the grid size. We choose a regular grid for e, T and ω. While this gravitational pull of a planet, serves as the model for the ra- 0 is adequate for the frequency ω, as we discuss later in this sec- dial velocity curve. The time dependence is given by the true tion, there might be a more appropriate e − T grid,e.g.aless anomaly ν which furthermore depends on three orbital parame- 0 dense grid size for T0 at lower eccentricities. A second possibil- ters (ν(t, P, e, T0)). The relation between ν and time t is: ity is to reduce the iterations for solving Kepler’s equation by ν + using the eccentric anomalies (or differentially corrected ones) = 1 e E ff tan − tan as initial values for the next slightly di erent grid point. This 2 1 e 2 can save several ten percent of computation time, in particular in t − T dense grids and at high eccentricities. A third idea which might E − e sin E = M = 2π 0 (28) P have a high potential for speed up is to combine our algorithm with other optimisation techniques (e.g. Levenberg-Marquardt where E and M are called eccentric anomaly and mean anomaly, instead of pure stepping) to optimise the remaining parameters respectively6. Equation (28), called Kepler’s equation, is tran- ω in the function pe,T0 ( ). A raw grid would provide the initial val- scendent meaning that for a given time t the eccentric anomaly E ues and the optimisation technique would do the fine adjustment. cannot be calculated explicitly. To give an example Fig. 1 shows RV data for the M dwarf 2 For the computation of a Keplerian periodogram χ is to be GJ 1046 (Kürster et al. 2008) along with the best-fitting minimised with respect to five parameters at a fixed trial fre- Keplerian orbit (P = 168.8d, e = 0.28). Figure 2 shows the quency ω. Similar to the GLS there is no need for stepping Lomb-Scargle, GLS and Keplerian periodograms. Because a through all parameters. With the substitutions c = γ + Kecos , Keplerian orbit has more degrees of freedom it always has the =  = −  2 a K cos and b K sin Eq. (27) can be written as: highest χ reduction (0 ≤ pLS ≤ pGLS ≤ pKep,e<0.6 ≤ pKep ≤ 1). As a comparison the Keplerian periodogram restricted to RV(t) = c + a cos ν(t) + b sin ν(t) e < 0.6 is also shown in Fig. 2. At intervals where pKep ex- and with respect to the parameters a, b and c the analytic solution ceeds pe<0.6 the contribution is due to very eccentric orbits. Note can be employed as described in Sect. 2 for known ν (instead that the Keplerian periodogram obtains more structure when the of ωt). So for fixed P, e and T0 the true anomalies νi can be search is extended to more eccentric orbits. Therefore the evalu- calculated and the GLS from Eq. (5) (now using νi instead of ωti) ation of the Keplerian periodogram needs a higher frequency res- can be applied to compute the maximum χ2-reduction (p(ω)), olution (this effect can also be observed in O’Toole et al. 2009). ω This is a consequence of the fact that more eccentric orbits are here called pe,T0 ( ). Stepping through e and T0 yields the best Keplerian fit at frequency ω: spikier and thus more sensitive to phase and frequency. Other than O’Toole et al. (2009) we plot the periodograms p (ω) = max p , (ω) 7 δ Kep , e T0 against frequency to illustrate that the typical peak width f is e T0 frequency independent. Thus equidistant frequency steps (d f < as visualised in the Keplerian periodogram. Finally, with step- δ f ) yield a uniform sampling of each peak and are the most eco- ping through the frequency, like for the GLS, one will find the nomic way to compute the periodogram rather than e.g. loga- best-fitting Keplerian orbit having the overall maximum power: rithmic period steps (leading to oversampling at long periods: = 1 = − 1 = − 1 d f d P P2 dP P dlnP)asusedbyO’Toole et al. (2009). p (ω ) = max p (ω). Kep best ω Kep Still the periodograms can be plotted against a logarithmic pe- riod scale as e.g. sometimes preferred to present a period search There exist a series of tools (or are under development) us- for exoplanets. ing genetic algorithms or Bayesian techniques for fast searches Figure 4 visualises local optima in the pe,T0 mapatanarbi- for the best Keplerian fit (Ford & Gregory 2007; Balan & Lahav trary fixed frequency. There are two obvious local optima which 2008). The algorithm, presented in this section, is not further means that searching from only one initial value for T0 may be optimised for speed. But it works well, is easy to implement not sufficient as one could fail to lead the best local optimum and is robust since in principle it cannot miss a peak if the 3 di- in the e − T0 plane. This justifies a stepping through e and T0. ω ffi mensional grid for e, T0 and is su ciently dense. A reliable The complexity in the e − T0 plane, in particular at high eccen- √ tricities, finally translates into the Keplerian periodogram. When 6 ν = 1−e2 sin E The following expressions are also used frequently: sin 1−e cos E ν = cos E−e 7 and cos 1−e cos E . For Fourier transforms this is common. 582 M. Zechmeister and M. Kürster: The generalised Lomb-Scargle periodogram

2000 1 0.6 0.9 0.8 0.5 1000 0.7 0.4 0.6 0 0.5 0.3

0.4 Power p RV [m/s] 0.2 -1000 Eccentricity e 0.3 0.2 0.1 0.1 -2000 0 0 3200 3400 3600 3800 4000 4200 0 90 180 270 360 Periastronpassage T [deg] Barycentric Julian Date BJD - 2 450 000 0

Fig. 4. Power map (pe,T0 )fore and T0 at the arbitrary fixed frequency Fig. 1. The radial velocity (RV) time series of the M dwarf GJ 1046. = . −1 = . = . = . f 0 00422 d . The maximum value p 0 592 is deposited in the The solid line is the best Keplerian orbit fit (P 168 8d,e 0 28). Keplerian periodogram. Note that there are the two local optima.

1 pKep p 0.8 Kep, e<0.6 analogous to Eq. (23) and derived the probability distribution pGLS p LS   − N−3 0.6 N − 3 4z 4z 2 Prob(z > z ) = 1 + 0 1 + 0 . 0 2 N − 5 N − 5

Power p 0.4 With this we can calculate that the higher peak which is much 0.2 closer to 1 has a 10−14 times lower probability to be due to noise, i.e. it has a 1014 times higher significance. In Fig. 3 the peri- 0 χ2 0.002 0.003 0.004 0.005 0.006 0.007 odogram power is plotted on a logarithmic scale for on the χ2 Frequency f [1/d] right-hand side. The much lower is another convincing argu- ment for the much higher significance. Fig. 2. Comparison of the normalized Lomb-Scargle (LS), GLS and Cumming (2004) suggested to estimate the FAP for the pe- = 1 = ω Keplerian periodograms for GJ 1046 ( f P 2π ). riod search analogous to Eq. (25) and the number of independent frequencies again as M ≈ TΔ f . This estimation does not take Period [d] into account the higher variability in the Keplerian periodogram, 500 400 300 200 150 which depends on the examined eccentricity range, and therefore .99999 10 this FAP is likely to be underestimated. pKep Another, more extreme example is the planet around p .9999 Kep, e<0.6 100 HD 20782 discovered by Jones et al. (2006). Figure 5 shows pGLS p the RV data for the star taken from O’Toole et al. (2009). Due .999 LS 1E3 2 to the high eccentricity this is a case where LS and GLS fail to 1E4 χ find the right period. However, our algorithm for the Keplerian

Power p .99 periodogram find the same solution as the 2DKLS (P = 591.9d, 1E5 .9 e = 0.97). The Keplerian periodogram in Fig. 6 indicates this pe- −−−−χ1E6 2 riod. This time it is normalised according to Eq. (29) and seems 0. 0 0.002 0.003 0.004 0.005 0.006 0.007 to suffer from an overall high noise level (caused by many other Frequency f [1/d] eccentric solutions that will fit the one “outlier”). However, that the period is significant can again be shown just as in the previ- Fig. 3. The same periodograms as in Fig. 2 with a quasi-logarithmic ous example. scale for p and a logarithmic scale for χ2 (axis to the right). For comparison we also show the periodogram with the nor- malisation by the best fit at each frequency (Cumming 2004)

− χ2 − χ2(ω) varying the frequency the landscape and maxima will evolve and N 5 0 zKep(ω) = (30) the maxima can also switch. 4 χ2(ω) In the given example LS and GLS would give a good initial guess for the best Keplerian period with only a slight frequency which is used in the 2DKLS and reveals the power maximum as shift. But this is not always the case. impressively as in O’Toole et al. (2009,Fig.1b). As Cumming One may argue, that the second peak has an equal height sug- et al. (1999) mentioned the choice of the normalisation is a mat- gesting the same significance. On a linear scale it seems so. But ter of taste; the distribution of maximum power is the same. Finally, keep in mind when comparing Fig. 6 with the 2DKLS the significance is not a linear function of the power. Cumming = . et al. (2008) normalised the Keplerian periodogram as in O’Toole et al. (2009,Fig.1b) which shows a slice at e 0 97, that the Keplerian periodogram in Fig. 6 includes all eccentric- ities (0 ≤ e ≤ 0.99). Also the algorithms are different because 2 2 N − 5 χ − χ (ω) N − 5 pKep(ω) z (ω) = 0 = (29) we also step for T0 and simultaneously fit the longitude of peri- Kep 4 χ2 4 1 − p (ω ) astron . best Kep best M. Zechmeister and M. Kürster: The generalised Lomb-Scargle periodogram 583

100 sinusoidal functions of the kind: y(t) = aZ(t)cosωt+bZ(t)sinωt with an arbitrary amplitude modulation Z(t) whose time de- 0 pendence and all parameters are fully specified (e.g. Z(t) can be an exponential decay). Without repeating the whole proce- -100 dure given in 6, 6 and Sect. 2 it is just mentioned here that the generalisation to y(t) = aZ(t)cosωt + bZ(t)sinωt + c will ff RV [m/s] result in the same equations with the di erence that Z(ti)has -200 to be attached to each sine and cosine term in each sum (e.g. CCˆ = wiZ(ti)cosωti · Z(ti)cosωti). -300 We presented an algorithm for the application of GLS to the Keplerian periodogram which is the least-squares spectrum for 1000 1500 2000 2500 3000 3500 4000 4500 Keplerian orbits. It is an hybrid algorithm that partially applies Barycentric Julian Date BJD - 2 450 000 an analytic solution for linearised parameters and partially steps through non-linear parameters. This has to be distinguished from Fig. 5. The radial velocity (RV) time series of HD 20782. The solid line methods that use the best sine fit as an initial guess. With two is the best Keplerian orbit fit. examples we have demonstrated that our algorithm for the com- putation of the Keplerian periodogram is capable to detect (very) Period [d] eccentric planets in a systematic and nonrandom way. 5000 1000 500 300 200 150 800 Apart from this, the least-squares spectrum analysis (the idea 700 goes back to Vanícekˇ 1971) with more complicated model func- 600 tions than full sine functions is beyond the scope of this paper 500 (e.g. including linear trends, Walker et al. 1995 or multiple sine 400 functions). For the calculation of such periodograms the employ- 300 ment of the analytical solutions is not essential, but can be faster. Power z 200 100 0 Appendix A 0 0.001 0.002 0.003 0.004 0.005 0.006 0.007 A.1. Derivation of the generalised Lomb-Scargle periodogram Frequency f [1/d] (GLS)

Fig. 6. Keplerian periodogram for HD 20782. Both are the same The derivation of the generalised Lomb-Scargle periodogram is Keplerian periodogram, but the upper one is normalized with the best briefly shown. With the sinusoid plus constant model fit (Eq. (29)) while the lower one is normalized with the best fit at each y = ω + ω + frequency (Eq. (30)). Both have by definition the same maximum value. (t) a cos t b sin t c

the squared difference between the data yi and the model func- 6. Conclusions tion y(t)  2 2 Generalised Lomb-Scargle periodogram (GLS), floating-mean χ = W wi[yi − y(ti)] periodogram (Cumming et al. 1999), date-compensated discrete Fourier transform (DCDFT, Ferraz-Mello 1981), and “spectral is to be minimised. For the minimum χ2 the partial derivatives significance” (SigSpec, Reegen 2007) at last all mean the same vanish and therefore: thing: least-squares spectrum for fitting a sinusoid plus a con-  0 = ∂ χ2 = 2W w [y − y(t )] cos ωt (A.1) stant. Cumming et al. (1999)andReegen (2007) have already a  i i i i shown the advantages of accounting for a varying zero point and 0 = ∂ χ2 = 2W w [y − y(t )] sin ωt (A.2) therefore we recommend the usage of the generalised Lomb- b  i i i i Scargle periodogram (GLS) for the period analysis of time se- 2 0 = ∂cχ = 2W wi[yi − y(ti)]. (A.3) ries. The implementation is easy as there are only a few modifi- cations in the sums of the Lomb-Scargle periodogram. These conditions for the minimum give three linear equations: The GLS can be calculated as conveniently as the Lomb- ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ Scargle periodogram and in a straight forward manner with an ⎢ YCˆ ⎥ ⎢ CCˆ CSˆ C ⎥ ⎢ a ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ analytical solution while programs applying standard routines ⎣⎢ YSˆ ⎦⎥ = ⎣⎢ CSˆ SSˆ S ⎦⎥ ⎣⎢ b ⎦⎥ for fitting sinusoids involve solving a set of linear equations by Y CS1 c inverting a 3 × 3 matrix repeated at each frequency. The GLS can be tailored by concentrating the sums in one loop over the where the abbreviations in Eqs. (7)−(15) were applied. data. As already mentioned by Lomb (1976)Eq.(5) (including Eliminating c in the first two equations with the last equation Eqs. (13)−(18)) should be applied for the numerical work. A fast (c = Y − aC − bS ) yields:      calculation of the GLS is especially desirable for large samples, ˆ − · ˆ − · ˆ − · largedatasetsand/or many frequency steps. It also may be help- YC Y C = CC C C CS C S a . ˆ − · ˆ − · ˆ − · ful to speed up prewhitening procedures (e.g. Reegen 2007)in YS Y S CS C S SS S S b case of multifrequency analysis or numerical calculations of the Using again the notations of Eqs. (11)−(15) this can be writ- significance of a signal with bootstrap methods or Monte Carlo ten as: simulations.      The term generalised Lomb-Scargle periodogram has al- YC = CC CS a . ready been used by Bretthorst (2001) for the generalisation to YS CS SS b 584 M. Zechmeister and M. Kürster: The generalised Lomb-Scargle periodogram

So the solution for the parameters a and b is Expanding the last term yields

YC · SS − YS · CS YS · CC − YC · CS 2CS τ = 2CSˆ cos 2ϕ − (CCˆ − SSˆ )sin2ϕ a = and b = · (A.4)   D D −2 C · S (cos2 ϕ − sin2 ϕ) − (C2 − S 2)sinϕ cos ϕ The amplitude of the best-fitting sine function at frequency ω is √ ϕ ϕ 2 + 2 χ2 and after factoring cos 2 and sin 2 : given by a b . With these solutions the minimum can be written only in terms of the sums Eqs. (10)−(15) when eliminat- 2 2 2CS τ = 2(CSˆ − C · S )cos2ϕ − CCˆ − SSˆ − (C − S ) sin 2ϕ ing the parameters a, b,andc as shown below. With the condi- tions for the minimum Eqs. (A.1)−(A.3) it can be seen that: = 2CS cos 2ϕ − (CC − SS)sin2ϕ·   = ϕ = ωτ w [y − y(t )]y(t ) = a w [y − y(t )] cos ωt So for CS τ 0, has to be chosen as: i i i i  i i i i 2CS +b wi[yi − y(ti)] sin ωti tan 2ωτ = ·  CC − SS +c w [y − y(t )] i i i By the way, replacing the generalised sums CC, SS and CS by = 0. the classical ones leads to the original definition forτ ˆ in Eq. (3):  χ2 Therefore, the minimum can be written as: 2CSˆ sin 2ωti   tan 2ωτˆ = =  · CCˆ − SSˆ cos 2ωti χ2(ω)/W = w [y − y(t )]y − w [y − y(t )]y(t ) i i i i i i i i =0 = YYˆ − aYCˆ − bYSˆ − cY = YYˆ − Y · Y − a(YCˆ − Y · C) − b(YSˆ − Y · S ) References = YY − aYC − bYS Balan, S. T., & Lahav, O. 2008, ArXiv e-prints, 805 Baluev, R. V. 2008, MNRAS, 385, 1279 where in the last step again the definitions of Eqs. (10)−(12) Barning, F. J. M. 1963, Bull. Astron. Inst. Netherlands, 17, 22 were applied. Finally a and b can be substituted by Eq. (A.4): Bretthorst, G. L. 2001, in Bayesian Inference and Maximum Entropy Methods in Science and Engineering, ed. A. Mohammad-Djafari, Am. Inst. Phys. Conf. SS · YC2 CC · YS 2 CS · YC · YS Ser., 568, 241 χ2(ω)/W = YY − − + 2 . Cumming, A. 2004, MNRAS, 354, 1165 D D D Cumming, A., Marcy, G. W., & Butler, R. P. 1999, ApJ, 526, 890 2 Cumming, A., Butler, R. P., Marcy, G. W., et al. 2008, PASP, 120, 531 When now using the χ -reduction normalised to unity: Ferraz-Mello, S. 1981, AJ, 86, 619 Ford, E. B., & Gregory, P. C. 2007, in Statistical Challenges in Modern χ2 − χ2(ω) Astronomy IV, ed. G. J. Babu, & E. D. Feigelson, ASP Conf. Ser., 371, 189 p(ω) = 0 χ2 Gilliland, R. L., & Baliunas, S. L. 1987, ApJ, 314, 766 0 Horne, J. H., & Baliunas, S. L. 1986, ApJ, 302, 757 Irwin, A. W., Campbell, B., Morbey, C. L., Walker, G. A. H., & Yang, S. 1989, χ2 = · and the fact that 0 W YY,Eq.(5) will result. PASP, 101, 147 Jones, H. R. A., Butler, R. P., Tinney, C. G., et al. 2006, MNRAS, 369, 249 Koen, C. 1990, ApJ, 348, 700 A.2. Verification of Eq. (19) Kürster, M., Endl, M., & Reffert, S. 2008, A&A, 483, 869 Lomb, N. R. 1976, Ap&SS, 39, 447 Equation (19) can be verified with the help of trigonometric O’Toole, S. J., Butler, R. P., Tinney, C. G., et al. 2007, ApJ, 660, 1636 addition theorems. For this purpose CS must be formulated. O’Toole, S. J., Tinney, C. G., Jones, H. R. A., et al. 2009, MNRAS, 392, 641 Furthermore, the index τ and the notation ϕ = ωτ will be used: Press, W. H., & Rybicki, G. B. 1989, ApJ, 338, 277  Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. 1992, Numerical recipes in FORTRAN. The art of scientific computing 2CS τ = wi sin 2(ωti − ϕ) (Cambridge: University Press), 2nd edn.   Reegen, P. 2007, A&A, 467, 1353 −2 wi cos(ωti − ϕ) wi sin(ωti − ϕ) Scargle, J. D. 1982, ApJ, 263, 835   Shrager, R. I. 2001, Ap&SS, 277, 519 = cos 2ϕ wi sin 2ωti − sin 2ϕ wi cos 2ωti Vanícek,ˇ P. 1971, Ap&SS, 12, 10 Walker, G. A. H., Walker, A. R., Irwin, A. W., et al. 1995, Icarus, 116, 359 −2(C cos ϕ + S sin ϕ)(S cos ϕ − C sin ϕ).