
Koninklijk Meteorologisch Instituut van België / Institut Royal Météorologique de Belgique

Deconvolution of the Fourier spectrum

F. De Meyer

2003

Scientific and technical publication No. 33

Published by the Koninklijk Meteorologisch Instituut van België / Institut Royal Météorologique de Belgique, Ringlaan 3 / Avenue Circulaire 3, B-1180 Brussels. Responsible publisher: Dr. H. Malcorps


Abstract

The problem of estimating the frequency spectrum of a real continuous function, which is measured only at a finite number of discrete times, is discussed in this tutorial review. Based on the deconvolution method, the complex version of the one-dimensional CLEAN algorithm provides a simple way to remove from the computed Fourier spectrum the artefacts introduced by the sampling and the effects of missing data. The technique is appropriate for equally spaced data as well as for data samples randomly distributed in time. An example of the application of CLEAN to a synthetic spectrum of a small number of harmonic components at discrete frequencies is shown. The case of a time sequence with a low signal-to-noise ratio is also presented by illustrating how CLEAN recovers the spectrum of the declination component of the geomagnetic field, measured in the magnetic observatory of Dourbes.

1. Introduction

Synthesis maps in radio astronomy are generated by Fourier transformation of interferometer visibility data to produce a map of the sky (Thompson et al., 1986). Because only a finite number of visibility points are gathered, the problems of sampling apply to interferometer maps. The CLEAN algorithm is widely used in two-dimensional image reconstruction and performs an approximate deconvolution of the ‘true map’ of the sky from the ‘dirty map’ produced from the data, in essence removing the false features introduced inherently by the finite sampling. By adapting the two-dimensional CLEAN procedure developed for use in aperture synthesis, Roberts et al. (1987) discussed a rather intuitive way of handling the difficulties associated with the spurious apparent responses that arise from the incompleteness of the discrete sampling of a continuous time function.

A multitude of spectral estimation techniques for discrete time series have been proposed (Kay and Marple, 1981). Comparisons among various competing techniques have been based on limited computer simulations, which can be misleading. Here we select the conventional Fourier spectral estimator because the effects of finite, discrete sampling on the performance of the classical periodogram are precisely known and can be almost completely rectified.

Estimation of the power spectral density, or simply the spectrum, of discretely sampled deterministic and stochastic processes is usually based on procedures employing the Fast Fourier Transform or FFT (Brigham, 1974), which implicitly assumes that the data are periodic outside the observing interval. Unfortunately the FFT technique is directly applicable only when the input data sequence is evenly spaced in time. Often the observations cannot be fully controlled, and observability constraints may lead to an unequally spaced sample domain. One is then forced to extract spectral information from these irregularly sampled data.

The FFT approach to spectrum analysis is computationally efficient and produces reasonable outcomes for a large class of signal processes (Briggs and Henson, 1995). In spite of these advantages, there are several inherent performance limitations of the FFT procedure. The most prominent shortcoming is that of frequency resolution, i.e., the ability to distinguish the spectral responses of distinct signals. The frequency resolution is roughly the reciprocal of the time interval over which sampled data are available. A second imperfection is due to the implicit windowing of the data that occurs when processing with the FFT. Windowing manifests itself as ‘leakage’ in the spectral domain, that is, energy in the main lobe of a spectral response ‘leaks’ into the sidelobes, obscuring and distorting other spectral responses that are present. In fact, weak signal spectral responses can be masked by higher sidelobes from stronger spectral components. These two performance limitations of the FFT approach are particularly troublesome when analyzing short data records.

In an attempt to alleviate the inherent shortcomings of the FFT approach, many alternative spectral estimation procedures have been proposed. For instance, Kay and Marple (1981) discuss the improvement that may result from non-traditional methods such as the autoregressive (AR) method or the maximum entropy method (MEM). Claims have been made concerning the degree of improvement obtained in the spectral resolution and the signal detectability when these numerical techniques are applied to numerical data. These performance advantages, though, strongly depend upon the signal-to-noise ratio (SNR) of the input data, as might be expected. In fact, for low enough SNRs the spectral estimates obtained with the ‘modern’ estimators are often no better than those obtained with conventional FFT processing. Even in those cases where improved spectral fidelity is achieved by use of a different spectral estimation procedure, the computational requirements of that alternate method may be substantially higher than those of the FFT computation. This may make some spectral estimators unattractive for real-time implementation.

Schwarz (1978) has shown that CLEAN is a statistically correct method of least squares fitting of sinusoidal functions to the observations, which is the conventional framework of the discrete Fourier transform. Fourier inversion of a finite representation of a continuous function leads to a frequency spectrum distorted by (1) the limited frequency resolution due to the finite time span of the data sample, and (2) spurious apparent responses that are caused by the incompleteness of the sampling. There is an intuitive way of handling the difficulties associated with the latter effect. With a view to practical applications, heuristic arguments are used in this discussion to make the discourse more easily accessible. The CLEAN algorithm performs, in essence, a non-linear deconvolution of the Fourier spectrum on the frequency axis, equivalent to a least squares interpolation in the time domain.

The numerical method is particularly suitable for functions whose spectra are dominated by a few harmonic components at discrete frequencies. The CLEAN technique for time series spectral analysis (Roberts et al., 1987) starts from the noisy or ‘dirty’ spectrum, which is the convolution of the ‘true’ spectrum with the ‘dirty beam’ generated by the finite number of data. From this dirty spectrum it subtracts the response expected from a sinusoidal component whose frequency corresponds to the maximum of the spectrum. In this way a residual spectrum is produced from which both the component and the spurious features due to sampling have been removed. This procedure of ‘CLEANing’ the spectrum is repeated on successive residual spectra until nothing is left but noise. The set of resulting ‘clean components’ is then used as a model for the time function. This model is convolved with a ‘clean beam’ which has the same resolution as the original ‘spectral window’ but no sidelobes. The convolution of the clean components model with the clean beam serves to weight down the interpolated information at time instants outside those sampled. Finally, to preserve the noise level of the spectrum and to incorporate any components which were not resolved by CLEAN, the final residual spectrum is added to the convolved clean components to produce the ‘clean spectrum’.

The one-dimensional version of the CLEAN deconvolution technique is especially useful for spectral analysis of unequally spaced observations and provides a simple way to dispose of the artefacts of missing data, a situation which is frequently encountered in practical time series. As a first example, the application of CLEAN will be illustrated by means of the frequency analysis of a synthetic spectrum that is generated by a small number of harmonic components at discrete frequencies. The method is also used in the search for periodic variability in the time series of the hourly values of the magnetic declination, measured in the observatory of Dourbes, Belgium, which is characterized by a spectrum having a low signal-to-noise ratio.
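The subtraction loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this publication: the function names (`dft`, `window`, `clean`), the loop gain of 0.5, the normalization by 1/N and the test signal are all hypothetical, and the amplitude estimate `alpha` (which corrects the peak value for the leakage of its mirror image at the negative frequency) is an assumption that is not stated in this section. The clean-beam convolution and the final addition of the residual are omitted.

```python
import cmath
import math


def dft(times, values, f):
    """Normalized DFT at frequency f for arbitrary sample times (weights 1/N)."""
    n = len(times)
    return sum(x * cmath.exp(-2j * math.pi * f * t)
               for t, x in zip(times, values)) / n


def window(times, f):
    """Spectral window ('dirty beam'): the DFT of a unit signal, so W(0) = 1."""
    return dft(times, [1.0] * len(times), f)


def clean(times, values, freqs, gain=0.5, n_iter=50):
    """Repeatedly subtract the window response of the strongest peak."""
    residual = [dft(times, values, f) for f in freqs]
    components = []
    for _ in range(n_iter):
        # Locate the maximum of the current residual spectrum.
        j = max(range(len(freqs)), key=lambda i: abs(residual[i]))
        f_peak = freqs[j]
        w2 = window(times, 2.0 * f_peak)
        # Assumed amplitude estimate: removes the part of the measured peak
        # that leaks in from the mirror component at -f_peak.
        alpha = (residual[j] - residual[j].conjugate() * w2) / (1.0 - abs(w2) ** 2)
        components.append((f_peak, gain * alpha))
        # Subtract a fraction `gain` of the response at +f_peak and -f_peak.
        residual = [
            r - gain * (alpha * window(times, f - f_peak)
                        + alpha.conjugate() * window(times, f + f_peak))
            for r, f in zip(residual, freqs)
        ]
    return components, residual


# Hypothetical test signal: one sinusoid on a unit grid with two of every
# seven samples missing.
times = [float(k) for k in range(64) if k % 7 not in (5, 6)]
values = [2.0 * math.cos(2.0 * math.pi * 0.1 * t + 0.3) for t in times]
freqs = [m / 200.0 for m in range(1, 100)]
dirty = [dft(times, values, f) for f in freqs]
components, residual = clean(times, values, freqs, gain=0.5, n_iter=1)
```

With this amplitude estimate, one pass lowers the residual at the selected peak by exactly the factor (1 - gain). Running more iterations accumulates the clean components that serve as the model for the time function.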

2. The Fourier spectrum of a continuous function

The basic recourse for the frequency analysis of an aperiodic or transient function is the Fourier integral (Bracewell, 1986). Here we follow the notation of Jenkins and Watts (1968). The spectrum of an analog function $x(t)$, known for all time $t$, is given by the continuous Fourier transform (denoted by the operator $\mathcal{F}$) of $x(t)$,

$$X(f) \equiv \mathcal{F}\{x(t)\} = \int_{-\infty}^{\infty} x(t)\, e^{-2\pi i f t}\, dt, \qquad -\infty \le f \le \infty, \tag{1}$$

which defines the contribution of each frequency $f$ to $x(t)$. The inverse Fourier transform (denoted by $\mathcal{F}^{-1}$) is defined by the Riemann integral

$$x(t) \equiv \mathcal{F}^{-1}\{X(f)\} = \int_{-\infty}^{\infty} X(f)\, e^{2\pi i f t}\, df, \qquad -\infty \le t \le \infty. \tag{2}$$

Note that the Fourier operator $\mathcal{F}$ symbolises a linear transformation of the time axis onto the frequency axis and vice versa. If we assume here, as elsewhere, $x(t)$ to be real, $X(f)$ is in general a complex function of frequency and the Fourier transform satisfies the symmetry relationship

$$X(-f) = X^{*}(f), \tag{3}$$

where the ‘∗’ represents complex conjugation. A rigorous demonstration shows that the validity of the integral (1) requires that $x(t)$ should be absolutely integrable:

$$\int_{-\infty}^{\infty} |x(t)|\, dt < \infty. \tag{4}$$

Suppose that $x(t)$ is a zero-mean continuous time function. Parseval’s theorem for Fourier transforms, expressed as

$$\int_{-\infty}^{\infty} [x(t)]^2\, dt = \int_{-\infty}^{\infty} |X(f)|^2\, df, \tag{5}$$

is a statement of the conservation of energy: the energy of the time domain signal is equal to the energy of the transform. Note that the assumption of finite signal energy is a sufficient, but not a necessary condition for the existence of the Fourier transform (1). This leads one to define the power spectrum $P(f)$ of $x(t)$ as the squared modulus of its Fourier transform:

$$P(f) = |X(f)|^2, \qquad -\infty \le f \le \infty. \tag{6}$$

Thus, $P(f)$ is an energy spectral density in that it represents the distribution of energy as a function of frequency. Since $x(t)$ is supposed to be real, half of the power at a given $|f|$ occurs at the negative frequency $-f$. Writing $P(f) = X(f)\,X^{*}(f)$ it is easily shown from equation (1) that

$$P(f) = \int_{-\infty}^{\infty} \gamma(\tau)\, e^{-2\pi i f \tau}\, d\tau, \tag{7}$$

defining the power spectrum as the Fourier transform of the autocorrelation function $\gamma(\tau)$ of $x(t)$:

$$\gamma(\tau) = \int_{-\infty}^{\infty} x(t)\, x(t+\tau)\, dt, \qquad -\infty \le \tau \le \infty. \tag{8}$$
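For readers who want the intermediate steps behind equation (7), the following is one way to write them out. It assumes only that $x(t)$ is real, so that the complex conjugate of the transform changes the sign of the exponent, and that the order of integration may be exchanged; the dummy variable $t'$ is introduced here for illustration.

```latex
\begin{aligned}
P(f) &= X(f)\,X^{*}(f)
      = \int_{-\infty}^{\infty} x(t')\,e^{-2\pi i f t'}\,dt'
        \int_{-\infty}^{\infty} x(t)\,e^{+2\pi i f t}\,dt \\
     &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}
        x(t)\,x(t')\,e^{-2\pi i f (t'-t)}\,dt\,dt' \\
     &= \int_{-\infty}^{\infty}
        \left[\int_{-\infty}^{\infty} x(t)\,x(t+\tau)\,dt\right]
        e^{-2\pi i f \tau}\,d\tau
        \qquad (\tau = t' - t) \\
     &= \int_{-\infty}^{\infty} \gamma(\tau)\,e^{-2\pi i f \tau}\,d\tau .
\end{aligned}
```

The bracketed inner integral is the autocorrelation function of equation (8), which gives equation (7).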

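The Cauchy-Schwarz inequality for integrals states that

$$\left[\int_{-\infty}^{\infty} x(t)\, x(t+\tau)\, dt\right]^2 \le \int_{-\infty}^{\infty} [x(t)]^2\, dt \cdot \int_{-\infty}^{\infty} [x(t+\tau)]^2\, dt = \left[\int_{-\infty}^{\infty} [x(t)]^2\, dt\right]^2. \tag{9}$$

Therefore the autocorrelation function is maximum at the origin:

$$|\gamma(\tau)| \le \gamma(0) \quad \text{for all } \tau. \tag{10}$$

Using the Parseval theorem (5) it is clear that

$$\gamma(0) = \int_{-\infty}^{\infty} [x(t)]^2\, dt = \int_{-\infty}^{\infty} P(f)\, df. \tag{11}$$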

It follows that the autocorrelation function of $x(t)$ exists if the total energy $\gamma(0)$ of $x(t)$ is finite. For this reason the Fourier transform will only be applied to square integrable (deterministic or transient) signals with finite energy. This implies that $x(t)$ must approach zero for large values of $|t|$ to prevent the total energy from becoming infinite.

A different viewpoint must be taken when $x(t)$ is a wide sense stationary, stochastic process rather than a deterministic, finite-energy waveform (Papoulis, 1965). The energy of such processes is usually infinite, so that the quantity of interest is the time average of the energy distribution with frequency. Also, integrals such as (1) normally do not exist for stochastic processes. For the case of stationary random processes, the autocorrelation function provides the basis for spectrum analysis, rather than the random process itself (Blackman and Tukey, 1959). An additional assumption often made is that the stochastic process is ergodic in the first and second moments. This property permits the substitution of time averages for ensemble averages. Here we will deal only with deterministic signals, since we want to keep the discussion as simple as possible without unnecessarily complicating the mathematical representations.

A concept of great importance both in theoretical work and physical application is the convolution integral

$$x(t) \otimes y(t) = \int_{-\infty}^{\infty} x(t')\, y(t - t')\, dt', \tag{12}$$

where $x(t)$ and $y(t)$ are transient functions with Fourier transforms $X(f)$ and $Y(f)$, respectively. By a change of variable it is easy to see that $x(t) \otimes y(t) = y(t) \otimes x(t)$ whenever the associated integrals converge. Then, the convolution theorem for Fourier transforms states that the Fourier transform of the convolution $x(t) \otimes y(t)$ is the product of the Fourier transforms of $x(t)$ and $y(t)$. In operator notation, the convolution theorem is expressed as follows:

$$\mathcal{F}\{x(t) \otimes y(t)\} = \mathcal{F}\{x(t)\} \cdot \mathcal{F}\{y(t)\} = X(f) \cdot Y(f). \tag{13}$$
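A discrete analogue of the convolution theorem can be checked numerically. The sketch below is an illustration only: it replaces the continuous integrals by finite sequences and the convolution by its circular (periodic) counterpart, which is the form for which the discrete identity holds exactly. The function names and the two short test sequences are assumptions made here.

```python
import cmath
import math


def dft(seq):
    """Plain O(n^2) discrete Fourier transform of a finite sequence."""
    n = len(seq)
    return [sum(seq[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def circular_convolve(x, y):
    """Circular convolution: sum over j of x[j] * y[(k - j) mod n]."""
    n = len(x)
    return [sum(x[j] * y[(k - j) % n] for j in range(n)) for k in range(n)]


# Two arbitrary (hypothetical) sequences of equal length.
x = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 0.0, 0.0]
y = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

lhs = dft(circular_convolve(x, y))             # transform of the convolution
rhs = [a * b for a, b in zip(dft(x), dft(y))]  # product of the transforms
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
```

The two sides agree to rounding error, so `max_error` is of the order of the floating-point precision.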

Also note that

$$x(t) \otimes y(t) = \mathcal{F}^{-1}\{X(f) \cdot Y(f)\}, \tag{14}$$

which means that the convolution of two functions and the product of their Fourier transforms are Fourier pairs. The principal utility of Fourier theory is that certain operations that are difficult on the time axis become very simple in terms of frequency. For instance, in the frequency domain the convolution of $x(t)$ and $y(t)$ reduces to a multiplication of the Fourier transforms of the two original functions. The convolved response is then obtained as the inverse Fourier transform of this product. The simplicity and convenience of this result is the main reason for the extensive use of Fourier techniques in geophysics.

Suppose that we have measured a finite portion of a continuous trace which is subjected to a sampling procedure with constant time interval $\Delta t$. The impression might arise that greater detail in the Fourier spectrum may be expected by processing more data in a fixed interval of finite length $T$, thus by letting $\Delta t \to 0$ with $T$ constant. Fortunately this is not true, so that we are not forced to refine our measurements without end in order to obtain stable spectral estimates. There is indeed a duality between time, whose dimension is $[t]$, and frequency, whose dimension is $1/[t]$. A precise determination of a signal in frequency requires a long time lapse for the time function, i.e., $T \to \infty$, and conversely, a precise determination in time demands a broad frequency band. This well-known fact is now demonstrated formally.

We assume that only a finite portion of a real, zero-mean, continuous record $x(t)$ is known in the interval $[0,T]$ and take $x(t)$ to be definitely zero outside this interval. It is further supposed that $x(t)$ is square integrable with variance

$$\sigma_x^2 = \int_0^T [x(t)]^2\, dt. \tag{15}$$

From the Parseval theorem it immediately follows that the Fourier transform $X(f)$ of $x(t)$ is also square integrable with variance

$$\sigma_X^2 = \int_{-\infty}^{\infty} |X(f)|^2\, df. \tag{16}$$

As a result, $\sigma_X = \sigma_x$. As a measure of the ‘spread’ of the function $x(t)$ on the time axis we introduce the quantity

$$\Delta x = \left[\int_0^T t^2\, [x(t)]^2\, dt\right]^{1/2}, \tag{17}$$

assumed to be finite. A corresponding measure of the dispersion of $X(f)$ on the frequency axis is given by

$$\Delta X = \left[\int_{-\infty}^{\infty} f^2\, |X(f)|^2\, df\right]^{1/2}, \tag{18}$$

also supposed to be finite. Consider the non-negative expression

$$I(\lambda) = \int_0^T [t\,x(t) + \lambda\,\dot{x}(t)]^2\, dt = \int_0^T t^2\,[x(t)]^2\, dt + 2\lambda \int_0^T t\,x(t)\,\dot{x}(t)\, dt + \lambda^2 \int_0^T [\dot{x}(t)]^2\, dt, \tag{19}$$

where $\lambda$ is a real variable not depending on time and $\dot{x}(t)$ is the shorthand symbol for the time derivative of $x(t)$. Since the Fourier transform of $\dot{x}(t)$ is $2\pi i f\,X(f)$ it follows from the Parseval theorem that

$$\int_0^T [\dot{x}(t)]^2\, dt = 4\pi^2 \int_{-\infty}^{\infty} f^2\, |X(f)|^2\, df = 4\pi^2 (\Delta X)^2. \tag{20}$$

The second integral on the right-hand side of equation (19) is integrated by parts, the boundary term $\tfrac{1}{2}T[x(T)]^2$ being taken to vanish, to give

$$\int_0^T t\,x(t)\,\dot{x}(t)\, dt = -\tfrac{1}{2}\,\sigma_x^2. \tag{21}$$

Combining these results it follows that

$$I(\lambda) = (\Delta x)^2 - \sigma_x^2\,\lambda + 4\pi^2 (\Delta X)^2\,\lambda^2. \tag{22}$$

Since the quadratic form $I(\lambda)$ in $\lambda$ is positive (or zero) for every real $\lambda$, its discriminant $\sigma_x^4 - 16\pi^2 (\Delta x)^2 (\Delta X)^2$ is necessarily negative or zero. Hence,

$$\frac{\Delta x}{\sigma_x} \cdot \frac{\Delta X}{\sigma_X} \ge \frac{1}{4\pi}, \tag{23}$$

since $\sigma_x = \sigma_X$. This result can be made independent of the record length as long as the integrals in equations (15) and (17) exist in the limit as $T \to \infty$. The physical implication of this result is that the spread of the function $x(t)$ on the time axis and the dispersion of its Fourier transform $X(f)$ in frequency are inversely related.

As for the practical determination of the Fourier spectrum, this ‘uncertainty principle’, which is essentially of the form of the famous Heisenberg uncertainty relations in quantum mechanics, indicates that a desire to determine the spectrum at a frequency $f$ by decreasing the band of frequencies $\Delta f$ about $f$ can only be accomplished by a corresponding increase of the record length $T$, and not by reducing the sampling interval $\Delta t$ in the observing interval $[0,T]$ alone. Alternatively, for fixed record length $T$, too small a value of the elementary frequency band $\Delta f$ will result in large fluctuations in the computed values of the Fourier spectrum.
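The inequality (23) and the integration by parts in equation (21) can be checked numerically for a simple record. The sketch below is an illustration under assumptions made here: the test function $x(t) = \sin^2(\pi t/T)$ (chosen because it vanishes at both ends of the interval), the record length $T = 1$ and the midpoint rule with 20000 panels are all hypothetical choices. The frequency spread is obtained through equation (20) rather than from an explicit Fourier transform.

```python
import math

T = 1.0      # record length (assumed)
M = 20000    # number of midpoint-rule panels (assumed)
dt = T / M


def x(t):
    """Assumed test record: smooth, and zero at both ends of [0, T]."""
    return math.sin(math.pi * t / T) ** 2


def x_dot(t):
    """Analytic time derivative of x(t)."""
    return (math.pi / T) * math.sin(2.0 * math.pi * t / T)


mid = [(k + 0.5) * dt for k in range(M)]

sigma2 = sum(x(t) ** 2 for t in mid) * dt                      # eq. (15)
spread_t = math.sqrt(sum((t * x(t)) ** 2 for t in mid) * dt)   # eq. (17)
# eq. (20): the integral of x_dot^2 equals 4 pi^2 (Delta X)^2
spread_f = math.sqrt(sum(x_dot(t) ** 2 for t in mid) * dt) / (2.0 * math.pi)
cross = sum(t * x(t) * x_dot(t) for t in mid) * dt             # eq. (21)

product = (spread_t / math.sqrt(sigma2)) * (spread_f / math.sqrt(sigma2))
bound = 1.0 / (4.0 * math.pi)
```

For this record the normalized product comes out well above the bound $1/(4\pi)$, and `cross` reproduces $-\sigma_x^2/2$ to within the quadrature error.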

3. The consequences of sampling

3.1 The dirty spectrum

In most observational settings only the values of $x(t)$ are available at a finite number of sample points $t_k$, $k = 0, 1, \ldots, N-1$,

$$x_k = x(t_k), \qquad k = 0, 1, \ldots, N-1. \tag{24}$$

These $N$ data may be regarded as having been sampled from $x(t)$ by multiplication with a sampling function $w(t)$, consisting of a weighted sum of Dirac delta functions $\delta(t)$,

$$w(t) = \sum_{k=0}^{N-1} w_k\, \delta(t - t_k) \tag{25}$$

(Brigham, 1974), where the $k$th data point is assigned weight $w_k$. Thus the sampled signal is

$$\hat{x}(t) = w(t)\, x(t) = \sum_{k=0}^{N-1} w_k\, x_k\, \delta(t - t_k). \tag{26}$$

The sequence $(w_0, w_1, \ldots, w_{N-1})$ is called a ‘data window’ and the weights $\{w_k\}$ may evidently be taken equal to 1. It follows from the Fourier convolution theorem that the Fourier transform of the sampled signal, $\hat{X}(f) = \mathcal{F}\{\hat{x}(t)\}$, is the convolution of the true spectrum $X(f) = \mathcal{F}\{x(t)\}$ with the Fourier transform $W(f) = \mathcal{F}\{w(t)\}$ of the sampling function $w(t)$:

$$\hat{X}(f) = W(f) \otimes X(f) = \int_{-\infty}^{\infty} W(f - f')\, X(f')\, df'. \tag{27}$$

The observable functions $\hat{X}(f)$ and $W(f)$ are called the ‘dirty spectrum’ and the ‘spectral window’ (or ‘dirty beam’), respectively. In the general case of unequally spaced sampling they are given by the discrete Fourier transforms

$$\hat{X}(f) = \sum_{k=0}^{N-1} w_k\, x_k\, e^{-2\pi i f t_k} \tag{28}$$

and

$$W(f) = \sum_{k=0}^{N-1} w_k\, e^{-2\pi i f t_k}. \tag{29}$$

This immediately follows from substituting equation (26) into the definition of the Fourier transform in (1). Using the discrete Fourier transform (DFT) formulation of equation (28) is equivalent to approximating the continuous Fourier integral by a numerical integration scheme. Note that we have explicitly included a set of quadrature weights, $w_k$, to emphasize that equation (28) is a discrete estimation of the continuous Fourier transform of equation (1). Use of uniform weights $w_k = 1/N$, $\forall k$, in the case of equispaced data corresponds to the rectangular rule of numerical quadrature. Duyndam and Schonewille (1999) show that the DFT, defined on an arbitrary set of points $t_k$, can be calculated rapidly on a regular lattice of frequency values. A comprehensive review of different DFT algorithms can be found in Ware (1998). A fast DFT can be used to develop a fast discrete inversion transformation (Dutt and Rokhlin, 1995), which is known under the name of ‘Fast Fourier Transform’ or FFT (Cooley and Tukey, 1965). The symmetries of $\hat{X}(f)$ and $W(f)$ are the same as that of $X(f)$:

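$$\hat{X}(-f) = \hat{X}^{*}(f), \tag{30}$$

$$W(-f) = W^{*}(f). \tag{31}$$

The zero-frequency component (DC) of $\hat{X}(f)$ is $N$ times the average value of the weighted data $w_k x_k$,

$$\hat{X}(0) = \sum_{k=0}^{N-1} w_k\, x_k = N\, \overline{w x}, \tag{32}$$

and the spectral window $W(f)$ is equal to the sum of the weights $w_k$ at the origin $f = 0$,

$$W(0) = \sum_{k=0}^{N-1} w_k = N\, \overline{w}, \tag{33}$$

where the overbar denotes the average over the $N$ sample points.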
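Equations (28) and (29) translate directly into code for arbitrary sample times. The sketch below is a minimal illustration rather than a fast algorithm: it evaluates both sums term by term at a single frequency, and the function names, sample times and test signal are assumptions introduced here. It also evaluates the quantities needed to check the symmetries (30) and (31) and the value of the window at the origin.

```python
import cmath
import math


def dirty_spectrum(times, values, weights, f):
    """Eq. (28): sum over k of w_k x_k exp(-2 pi i f t_k)."""
    return sum(w * x * cmath.exp(-2j * math.pi * f * t)
               for t, x, w in zip(times, values, weights))


def spectral_window(times, weights, f):
    """Eq. (29): sum over k of w_k exp(-2 pi i f t_k)."""
    return sum(w * cmath.exp(-2j * math.pi * f * t)
               for t, w in zip(times, weights))


# Hypothetical irregular sample times and data values.
times = [0.0, 0.9, 2.3, 3.1, 4.8, 5.2, 7.7, 9.4]
values = [math.cos(2.0 * math.pi * 0.2 * t) for t in times]
weights = [1.0] * len(times)

f = 0.2
X_pos = dirty_spectrum(times, values, weights, f)
X_neg = dirty_spectrum(times, values, weights, -f)
W_pos = spectral_window(times, weights, f)
W_neg = spectral_window(times, weights, -f)
W_zero = spectral_window(times, weights, 0.0)
```

With unit weights `W_zero` equals the number of samples, and the values at `-f` are the complex conjugates of those at `+f`, as equations (30) and (31) require for real data.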

Using the weights $w_k = 1$, $\forall k$, we have that $W(0) = N$, whereas use of the weights $w_k = 1/N$ gives the normalization constant $W(0) = 1$.

We now show that the Dirac distribution $\delta(t)$ appears in a very natural way when we consider the case that the continuous function $x(t)$ is only available in the finite interval $[0,T]$. Thus, we actually have access to the truncated version of $x(t)$,

$$x_o(t) = d_o(t)\, x(t), \tag{34}$$

where $d_o(t)$ is the so-called box-car or blanking function

$$d_o(t) = \begin{cases} 1, & 0 \le t \le T, \\ 0, & \text{otherwise}, \end{cases} \tag{35}$$

with Fourier transform

$$D_o(f) = \int_0^T e^{-2\pi i f t}\, dt = \frac{\sin \pi f T}{\pi f}\, e^{-\pi i f T} \tag{36}$$

for a continuous data set of finite length $T$. Note that $D_o(0) = T$. In accordance with equation (27) it follows that the Fourier transform $X_o(f)$ of the finite portion $x_o(t)$ of $x(t)$ is the convolution of the true spectrum $X(f)$ and the Fourier transform $D_o(f)$ of the box-car $d_o(t)$,

$$X_o(f) = D_o(f) \otimes X(f) = \int_{-\infty}^{\infty} D_o(f - f')\, X(f')\, df', \tag{37}$$

and the spectral window $W(f)$ in (27) is nothing but $D_o(f)$ in this case.

Figure 1: Diffraction function sinc (f).

Equation (36) may be written as

$$D_o(f) = T\, \mathrm{sinc}(fT)\, e^{-\pi i f T}, \tag{38}$$

by means of the diffraction function for continuous data,

$$\mathrm{sinc}(f) = \frac{\sin \pi f}{\pi f}, \tag{39}$$

which is illustrated in Figure 1. The function $\mathrm{sinc}(fT)$ has a maximum value of 1 at $f = 0$, zeros at the frequencies $f = k/T$, $k = \pm 1, \pm 2, \ldots$, and oscillations of decreasing amplitude as $fT \to \infty$, which are called the ‘sidelobes’ of the spectral window. As $T \to \infty$ the main lobe of $\mathrm{sinc}(fT)$ becomes narrower and the diffraction function tends to be small everywhere, except at $f = 0$ where it is equal to 1. The extrema of $\mathrm{sinc}(fT)$ occur at the frequencies which are solutions of the equation $\tan(\pi f T) = \pi f T$. Note that the first sidelobe of $\mathrm{sinc}(fT)$ is about 22% of the height of the main lobe and negative. The effect of these sidelobes can cause serious distortion in the Fourier spectrum $X_o(f)$ of the observed part $x_o(t)$ of $x(t)$.

We obtain the definition of the Dirac generalized $\delta$-function in terms of the diffraction function,

$$\delta(f) = \lim_{T \to \infty} [T\, \mathrm{sinc}(fT)] = \lim_{T \to \infty} \frac{\sin \pi f T}{\pi f}, \tag{40}$$

which has the properties that

$$\delta(f) = \begin{cases} 0, & f \ne 0, \\ \infty, & f = 0, \end{cases} \tag{41}$$

and

$$\int_{-\infty}^{\infty} \delta(f)\, df = 1. \tag{42}$$

Equation (40) is formally written as

$$\delta(f) = \lim_{T \to \infty} \int_{-T/2}^{T/2} e^{-2\pi i f t}\, dt \equiv \int_{-\infty}^{\infty} e^{-2\pi i f t}\, dt, \tag{43}$$

which shows that the inverse Fourier transform of the distribution function $\delta(f)$ is the function $d(t) = 1$, $-\infty < t < \infty$.

$$X(f) = \lim_{T \to \infty} X_o(f). \tag{45}$$

The Fourier transform $X_o(f)$ of the finite portion $x_o(t)$ of $x(t)$ is computed from

$$X_o(f) = \frac{\hat{A}}{2} \left[ e^{i\hat{\varphi}} \int_0^T e^{-2\pi i (f - \hat{f}) t}\, dt + e^{-i\hat{\varphi}} \int_0^T e^{-2\pi i (f + \hat{f}) t}\, dt \right] = \frac{\hat{A}}{2} \left[ D_o(f - \hat{f})\, e^{i\hat{\varphi}} + D_o(f + \hat{f})\, e^{-i\hat{\varphi}} \right], \tag{46}$$

with $D_o(f)$ given in equation (36). In the limit as $T \to \infty$ it is found that the exact spectrum

$$X(f) = \frac{\hat{A}}{2} \left[ \delta(f - \hat{f})\, e^{i\hat{\varphi}} + \delta(f + \hat{f})\, e^{-i\hat{\varphi}} \right], \tag{47}$$

of a real harmonic component is a pure line spectrum, consisting of a pair of infinitely sharp peaks at the frequencies $\pm\hat{f}$ with equal amplitudes $\hat{A}/2$. The phases of the spectral components at the positive and negative frequencies are $\pm\hat{\varphi}$, respectively. Therefore, contributions at both positive and negative frequencies must be taken into account when studying the periodic behaviour of a data set.

The validity of the Fourier transform formalism depends critically on the question whether the time function $x(t)$ is absolutely integrable. This limitation is a very severe one because many frequently used functions, such as $t^k$ with $k > 0$, $e^t$, $\log t$, do not have a Fourier transform. Even the trigonometric functions $\sin \omega t$ and $\cos \omega t$ do not have a Fourier spectrum in the strict sense. At first sight this seems to be a tremendous restriction on the Fourier transform theory, but the spectrum of harmonic functions may be defined in terms of the Dirac $\delta$-distribution function. Although this is not a problem from the mathematical point of view in the theory of distributions, one may be disturbed over the fact that we use a ‘generalized function’ $\delta(t)$ which vanishes everywhere, except at $t = 0$, with integral value of 1. Nevertheless, it must be admitted that a periodic time function in the interval −∞

According to equation (47), the true Fourier transform of a superposition of harmonic functions may be interpreted as a set of isolated peaks of infinite amplitude at discrete frequencies. A definite shortcoming of a finite data set is that the theoretical frequency spectrum is degenerated to a sum of $D_o(f)$ terms with peaks of finite amplitude centred at the frequencies of the individual constituents, which are surrounded by small peaks of decreasing amplitude due to the sidelobes of the $D_o(f)$ function. This smudging effect of the frequency window $D_o(f)$ introduces a certain vagueness in the computed Fourier spectrum obtained from a finite sample.
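The properties of the box-car window quoted in this section can be verified with a short numerical sketch. It is an illustration under assumptions made here: the record length $T = 10$, the scan grid and the check frequency are arbitrary, and the function names are hypothetical. The code evaluates equation (38), scans the first sidelobe of the diffraction function between its first two zeros, and compares the closed form with a midpoint-rule evaluation of the integral in equation (36).

```python
import cmath
import math


def sinc(u):
    """Diffraction function of eq. (39): sin(pi u) / (pi u), with sinc(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)


def boxcar_window(f, T):
    """Eq. (38): D_o(f) = T sinc(fT) exp(-pi i f T)."""
    return T * sinc(f * T) * cmath.exp(-1j * math.pi * f * T)


T = 10.0  # assumed record length

# Scan the first sidelobe, which lies between the zeros at f = 1/T and f = 2/T.
grid = [(1.0 + m / 10000.0) / T for m in range(1, 10000)]
first_sidelobe = min(sinc(f * T) for f in grid)

# Cross-check eq. (36) against a midpoint-rule integral at one frequency.
f0 = 0.13
M = 20000
numeric = sum(cmath.exp(-2j * math.pi * f0 * (k + 0.5) * T / M)
              for k in range(M)) * (T / M)
closed_form = boxcar_window(f0, T)
```

The scan gives a first sidelobe of about -0.217, in line with the figure of roughly 22% quoted above, and the peak value `boxcar_window(0.0, T)` equals the record length, which is the finite peak height that replaces the infinitely sharp line.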

3.2 The sampling theorem

We now discuss the consequences of the sampling of a continuous time function. Suppose that the Fourier transform $X(f)$ of $x(t)$ is a band-limited function of frequency, which means that $X(f)$ is definitely zero outside a band of frequencies $[-f_N, f_N]$. The inverse Fourier transform of $X(f)$ defines the time function by

$$x(t) = \int_{-f_N}^{f_N} X(f)\, e^{2\pi i f t}\, df, \qquad -\infty \le t \le \infty. \tag{48}$$

Let the time variable $t$ take only the discrete values $t_k = k/(2f_N)$, $k = 0, \pm 1, \pm 2, \ldots$. This corresponds to a sampling of $x(t)$ at discrete points spaced $\Delta t = 1/(2f_N)$ apart on the time axis:

$$x(k\Delta t) = \int_{-f_N}^{f_N} X(f)\, e^{2\pi i k f \Delta t}\, df, \qquad k = 0, \pm 1, \pm 2, \ldots. \tag{49}$$

It will be shown that the sampling interval $\Delta t = 1/(2f_N)$ is the largest time step allowed for the sampling of the band-limited function $x(t)$, provided that $X(f)$ is zero outside the band of frequencies $[-f_N, f_N]$. If $\tilde{X}(f)$ is defined as the periodic extension of $X(f)$ outside the frequency band $[-f_N, f_N]$, then it has the Fourier series expansion

$$\tilde{X}(f) = \sum_{k=-\infty}^{\infty} \tilde{x}_k\, e^{-2\pi i k f \Delta t} \tag{50}$$

with Fourier coefficients

$$\tilde{x}_k = \Delta t \int_{-f_N}^{f_N} X(f)\, e^{2\pi i k f \Delta t}\, df = \Delta t\, x(k\Delta t) \tag{51}$$

(Bloomfield, 1976). Note that $\tilde{X}(f) = X(f)$ in the frequency band $[-f_N, f_N]$. Such repetitions of $\tilde{X}(f)$ outside the interval $[-f_N, f_N]$ are called the ‘ghosts’ of $X(f)$. The coefficients $\tilde{x}_k$ of the Fourier series of $\tilde{X}(f)$ are consequently provided by the samples $x(k\Delta t) = x_k$. Hence, knowledge of the band-limited function $x(t)$ at the sample points $t_k = k\Delta t$, $k = 0, \pm 1, \pm 2, \ldots$, determines $\tilde{X}(f)$ for all frequencies and subsequently all of $x(t)$ itself. From equations (50) and (51) it follows that

$$\tilde{X}(f) = \Delta t \sum_{k=-\infty}^{\infty} x_k\, e^{-2\pi i k f \Delta t}. \tag{52}$$

Using equation (48), with $\tilde{X}(f)$ replacing $X(f)$ since both are identical in value in the interval $[-f_N, f_N]$ as long as all frequency components of $x(t)$ are in this interval, we finally obtain

$$x(t) = \Delta t \sum_{k=-\infty}^{\infty} x_k \int_{-f_N}^{f_N} e^{2\pi i f (t - k\Delta t)}\, df = \sum_{k=-\infty}^{\infty} x_k\, \frac{\sin 2\pi f_N (t - k\Delta t)}{2\pi f_N (t - k\Delta t)} = \sum_{k=-\infty}^{\infty} x_k\, \mathrm{sinc}[2 f_N (t - k\Delta t)]. \tag{53}$$

Equation (53) demonstrates explicitly how the continuous time function $x(t)$ is reconstructed from its samples $x_k$, taken a distance $\Delta t = 1/(2f_N)$ apart on the time axis, if its spectrum is limited to the so-called Nyquist frequency band $[-f_N, f_N]$. Note that the ordinate $x_k$ at the sample point $t_k = k\Delta t$ is multiplied with a $\sin\theta/\theta$ function, which is centred at the sample point, where it has the value 1, and which is zero at all other sample points.

In the case of equidistant data, this relationship of the samples $\{x_k\}$ to the continuous time function $x(t)$ is known as the sampling theorem. Any function whose Fourier transform is zero for $|f| \ge f_N$ is completely specified by values spaced at equal intervals not exceeding $\Delta t = 1/(2f_N)$, the sequence extending throughout the time axis. Thus the Nyquist frequency

$$f_N = \frac{1}{2\Delta t} \tag{54}$$

is the highest frequency which may be recovered from samples taken at intervals $\Delta t$. In this way it is possible to reconstruct, over the continuous domain of the time variable $t$, any function whose spectrum is zero outside the interval $[-f_N, f_N]$. Expression (52) will be identical in value to the transform $X(f)$ of $x(t)$ over the Nyquist band, as long as $x(t)$ is band-limited and all frequencies are in this interval. Thus the continuous energy spectral density $|\tilde{X}(f)|^2$ for data sampled from a band-limited function is identical to the power spectrum in equation (6). The sampling theorem assures that a band-limited signal can be completely reproduced over the continuous domain of the time variable if the sampling interval is properly chosen. This does not imply that the FFT, for example, will yield a discrete power spectrum which equals the spectrum of the continuous signal when restricted to the Nyquist band.
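The interpolation formula (53) can be tried out on a finite stretch of samples. The sketch below is illustrative only: the infinite sum is truncated to 401 terms, so the value between sample points is approximate, and the test signal, its two frequencies, the unit sampling interval and the function names are assumptions made here. At the sample points themselves the sinc kernel is 1 on its own sample and 0 on all the others, so the samples are reproduced exactly.

```python
import math


def sinc(u):
    """Diffraction function of eq. (39), with sinc(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)


def reconstruct(samples, dt, t, k_start=0):
    """Truncated eq. (53): sum over k of x_k sinc[(t - k dt) / dt].

    With dt = 1 / (2 f_N) the argument equals 2 f_N (t - k dt).
    """
    return sum(xk * sinc((t - (k_start + k) * dt) / dt)
               for k, xk in enumerate(samples))


def signal(t):
    """Assumed band-limited test signal; both frequencies lie below f_N = 0.5."""
    return (math.cos(2.0 * math.pi * 0.12 * t)
            + 0.5 * math.sin(2.0 * math.pi * 0.31 * t))


dt = 1.0          # sampling interval, so f_N = 1 / (2 dt) = 0.5
k_start = -200    # samples at k = -200, ..., 200
samples = [signal(k * dt) for k in range(k_start, 201)]

at_sample = reconstruct(samples, dt, 3.0 * dt, k_start)
between = reconstruct(samples, dt, 0.5 * dt, k_start)
```

The value `between` differs slightly from `signal(0.5)` because the series has been cut off, which illustrates why the theorem, strictly speaking, needs samples over the whole time axis.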
Frequency leakage or aliasing due to a finite data length or undersampling will always distort the Fast Fourier Transform output. Moreover, it is worth noting that, strictly speaking, the sampling theorem never applies to an experimental or observational setting since the theorem implicitly requires that the data samples be available over all times. No finite set of data $\{x_k\}$ is enough to specify uniquely the underlying continuous function $x(t)$.

From the result in equation (52) it is seen that we can compute the exact Fourier spectrum $X(f)$ of $x(t)$ from its samples $\{x_k\}$ by the infinite Fourier series

$$X(f) = \Delta t \sum_{k=-\infty}^{\infty} x_k\, e^{-2\pi i k f \Delta t}, \tag{55}$$

provided that the sampling interval $\Delta t$ is chosen such that $X(f)$ is definitely zero outside the Nyquist band $[-f_N, f_N]$, with $f_N$ defined in equation (54). Then, the spectral properties of the continuous signal are contained in the discrete Fourier transform. Nevertheless, the practical situation always implies that the number of samples is finite, so that the truncation of the infinite series in (55) to a series of finite length must be investigated. In fact, suppose that in the observing interval $[0,T]$ only the samples $\{x_k\}$ are available at the $N$ sample points $t_k = k\Delta t$, $k = 0, 1, \ldots, N-1$, so that we actually work with the finite Fourier series

$$\hat{X}(f) = \Delta t \sum_{k=0}^{N-1} x_k\, e^{-2\pi i k f \Delta t}. \tag{56}$$

Equation (27) shows that the dirty spectrum $\hat{X}(f)$ is the convolution of the true spectrum $X(f)$ with the spectral window $W(f)$, which now becomes

$$W(f) = \sum_{k=0}^{N-1} e^{-2\pi i k f \Delta t} = \frac{\sin N\pi f \Delta t}{\sin \pi f \Delta t}\, e^{-\pi i (N-1) f \Delta t}. \tag{57}$$

Here we have used the power series expansion for a complex number $z$ with $|z| < 1$,

$$\sum_{k=0}^{\infty} z^k = \frac{1}{1 - z}, \tag{58}$$

from which it is easily verified that

$$\sum_{k=0}^{N-1} z^k = \frac{1 - z^N}{1 - z}. \tag{59}$$

The result in equation (57) follows from this expression when substituting $z = \exp(-2\pi i f \Delta t)$. The dirty spectrum is therefore an average-over-frequency of the true spectrum over the frequency band $[-f_N, f_N]$, where $X(f)$ is not identically zero, weighted by the spectral window in equation (57). Defining the diffraction function for digitized data,

$$\mathrm{sinc}_N(f) = \frac{\sin N\pi f}{\sin \pi f}, \tag{60}$$

the smearing function $W(f)$ can also be expressed as

$$W(f) = \mathrm{sinc}_N(f\Delta t)\, e^{-\pi i (N-1) f \Delta t}. \tag{61}$$

Also note that

$$\mathrm{sinc}_N(f) = N\, \frac{\mathrm{sinc}(Nf)}{\mathrm{sinc}(f)}, \tag{62}$$

where $\mathrm{sinc}(f)$ is the diffraction function for continuous data in equation (39). Hence, $\mathrm{sinc}_N(f) \approx N\, \mathrm{sinc}(Nf)$ for large values of $N$. The function $\mathrm{sinc}_N(f)/N$ has a maximum value of 1 at $f = 0$, zeros at the frequencies $f = k/N$, $k = \pm 1, \pm 2, \ldots$, and oscillations of decreasing amplitude as $N \to \infty$. As $N$ becomes larger the sidelobes of $\mathrm{sinc}_N(f)$ become narrower and the diffraction function tends to become small everywhere, except at the origin where it is infinite (the same as a Dirac $\delta$-function). In the limit as $N \to \infty$, the function $\mathrm{sinc}_N(f)$ can be used as a definition for the Dirac distribution $\delta(f)$.
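The closed form of the spectral window for equispaced data can be compared with the direct sum. The following sketch is a numerical illustration of equations (57), (60) and (61); the function names and the values of $N$, $\Delta t$ and the test frequency are assumptions chosen here for the check.

```python
import cmath
import math


def window_direct(f, n, dt):
    """Left-hand side of eq. (57): direct sum of n complex exponentials."""
    return sum(cmath.exp(-2j * math.pi * k * f * dt) for k in range(n))


def sinc_n(u, n):
    """Eq. (60): sin(n pi u) / sin(pi u), with the limiting value n at u = 0."""
    if u == 0.0:
        return float(n)
    return math.sin(n * math.pi * u) / math.sin(math.pi * u)


def window_closed(f, n, dt):
    """Eq. (61): sinc_N(f dt) exp(-pi i (n - 1) f dt)."""
    return sinc_n(f * dt, n) * cmath.exp(-1j * math.pi * (n - 1) * f * dt)


N = 32       # number of samples (assumed)
dt = 0.5     # sampling interval (assumed)
f = 0.137    # arbitrary test frequency

difference = abs(window_direct(f, N, dt) - window_closed(f, N, dt))
first_zero = sinc_n(1.0 / N, N)   # zero of sinc_N at f = 1/N
peak = sinc_n(0.0, N)             # value N at the origin
```

The two expressions agree to rounding error, and the kernel vanishes at $f = 1/N$ while reaching $N$ at the origin.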

3.3 The aliasing problem

The main difficulty of judicious sampling is concerned with the choice of the correct sampling interval $\Delta t$. Suppose that a band-limited function $x(t)$ is sampled with an interval $\Delta t < 1/(2f_{\max}) = \Delta t_{\max}$, where $f_{\max}$ is the true cut-off frequency of the Fourier spectrum $X(f)$ of $x(t)$. This choice of $\Delta t$ corresponds to a Nyquist frequency $f_N = 1/(2\Delta t) > f_{\max}$. The only effect of sampling $x(t)$ at a smaller time interval than the optimal sampling interval $\Delta t_{\max}$ (this is called ‘oversampling’) is that the ghosts of $\tilde{X}(f)$ in equation (52) are displaced farther away from the central band $[-f_{\max}, f_{\max}]$. Although these ghosts are created automatically by the sampling, they are of no importance as long as the sampling is performed with an interval smaller than the ideal interval $\Delta t_{\max}$.

But what happens when $x(t)$ is sampled at time intervals $\Delta t > \Delta t_{\max}$ (‘undersampling’)? Suppose that harmonic waves $\cos 2\pi(f + nf_N)t$ with frequencies $f + nf_N$, $n = 0, \pm 1, \pm 2, \ldots$, are sampled at the time instants $k\Delta t = k/(2f_N)$, $k$ integer, with $f_N = 1/(2\Delta t)$. Because

$$\cos 2\pi(f + nf_N)\frac{k}{2f_N} = \pm\cos 2\pi f\,\frac{k}{2f_N} \tag{63}$$

it follows that waves with frequency $f + nf_N$ ($n$ integer) in the data, differing by exact multiples of the Nyquist frequency $f_N$, cannot be distinguished and contribute to the same frequency $f$ in the digitized data. Thus, if frequencies higher than the Nyquist frequency are actually present in the observations, i.e., $f_N$

In view of the convolution theorem of Fourier transforms, equation (70) can be expressed as a convolution on the frequency axis,

$$\hat{X}(f)=\nabla(f;\Delta t)\otimes X(f)=\int_{-\infty}^{\infty}\nabla(f-f';\Delta t)\,X(f')\,df', \tag{71}$$

where $\nabla(f;\Delta t)$ is the Fourier transform of the infinite Dirac comb in time,

$$\nabla(f;\Delta t)=\int_{-\infty}^{\infty}\delta(t;\Delta t)\,e^{-2\pi i f t}\,dt. \tag{72}$$

Substituting equation (66) we obtain

$$\nabla(f;\Delta t)=\Delta t\sum_{k=-\infty}^{\infty} e^{-2\pi i k f\Delta t}. \tag{73}$$

Comparing this result with equation (68) we conclude that the Fourier spectrum of an infinite Dirac comb in time is an infinite Dirac comb in frequency,

$$\nabla(f;\Delta t)=\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{\Delta t}\right). \tag{74}$$

Substituting equation (74) into (71) it follows that

$$\hat{X}(f)=\sum_{k=-\infty}^{\infty} X\!\left(f-\frac{k}{\Delta t}\right). \tag{75}$$

This simple result has an important consequence. Even if an infinite sequence of sampled data were known, we can only have information about the ‘aliased’ Fourier transform $\hat{X}(f)$ and not about the true Fourier spectrum $X(f)$ of $x(t)$. Note that $\hat{X}(f)$ is a cyclic function of frequency with period $2f_N = 1/\Delta t$. Thus, if it happens that $X(f)$ is zero outside the Nyquist band $[-f_N, f_N]$, defined by the sampling interval $\Delta t = 1/(2f_N)$, there will be no overlapping of the individual terms of $X(f)$ in equation (75) and it is possible to reproduce the exact Fourier spectrum of the observed time function from its samples $\{x_k\}$. On the other hand, if the sampling interval is so ineptly chosen that $X(f)$ is not identically zero outside the Nyquist band $[-f_N, f_N]$, then the dirty Fourier spectrum will have a principal aliased part

$$\hat{X}_A(f)=X(f)+\sum_{k=1}^{\infty}\left[X\!\left(f-\frac{k}{\Delta t}\right)+X\!\left(f+\frac{k}{\Delta t}\right)\right],\qquad -f_N\le f\le f_N, \tag{76}$$

given by a superposition of frequency bands of the true spectrum with width $1/(2\Delta t)$, and $\hat{X}(f)$ will most likely be different from $X(f)$ in the Nyquist band.

The frequencies $f + 2kf_N$, $k = 0, \pm 1, \pm 2, \ldots$, are called ‘aliases’ of one another, $f$ being the principal alias, since they all contribute their energy to the same frequency $f$ in the low-frequency domain. If power at frequencies higher than the Nyquist frequency $f_N = 1/(2\Delta t)$ is actually present in the original record it will contribute to lower aliases with subsequent errors in the computed spectrum over the Nyquist band. Geometrically this effect of spectrum distortion can be interpreted as if the frequency axis were folded at the points $kf_N$, $k = 0, \pm 1, \pm 2, \ldots$, so that the energy contributions at the frequencies $f + 2kf_N$, $k = \pm 1, \pm 2, \ldots$, all contribute to the frequency $f$ in the low-frequency range $[-f_N, f_N]$. This is commonly called ‘spectrum folding’.
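The statement that the frequencies $f + 2kf_N$ are indistinguishable after sampling can be illustrated with a few lines of code. The sketch below is an illustration only; the sampling interval, the test frequencies and the helper `principal_alias` are hypothetical choices and do not appear in the original text.

```python
import numpy as np

dt = 0.1                       # hypothetical sampling interval
f_nyq = 1.0 / (2.0 * dt)       # Nyquist frequency f_N = 1/(2 dt)
k = np.arange(50)              # sample indices, t_k = k * dt

f = 1.3                        # a frequency inside the Nyquist band
f_alias = f + 2.0 * f_nyq      # differs from f by exactly 1/dt

# The two cosines take identical values at every sampling instant
x_low = np.cos(2 * np.pi * f * k * dt)
x_high = np.cos(2 * np.pi * f_alias * k * dt)
max_diff = np.max(np.abs(x_low - x_high))

def principal_alias(freq, dt):
    """Map a frequency to its alias in [-f_N, f_N) by removing multiples of 1/dt."""
    fs = 1.0 / dt
    return (freq + fs / 2.0) % fs - fs / 2.0
```

For a real cosine the sign of the principal alias is immaterial, so a component at frequency 8 with $f_N = 5$ shows up at $|{-2}| = 2$, which is the mirror image of 8 about the folding point $f_N$.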
Important aliasing effects are a straightforward consequence of an incorrect data sampling and cannot be rectified afterwards because we only have access to the computed Fourier spectrum of the discrete observations of finite length. In any practical problem, two frequencies will be worth considering: (1) a guess as to the frequency $f_{\max}$ for which the total energy in $x(t)$ beyond $f_{\max}$ is small; (2) a clear indication as to the largest frequency $f_N$ of interest in the time series at hand, bearing in mind that the sampling interval $\Delta t$ to be chosen is inversely proportional to $f_N$. Ideally, one would like to choose $\Delta t = 1/(2f_N)$, so that no frequencies beyond those of interest are considered of importance. However, if $f_{\max} > f_N$ and if $X(f)$ does not vanish for frequencies above the Nyquist frequency $f_N$, then there will be aliasing problems. If $f_{\max}$ is known with some certainty, then aliasing can be greatly reduced by first oversampling the observed record with the sampling interval $1/(2f_{\max})$. Since we actually work with a record of finite length, it may be difficult to choose a clear cut-off frequency in the Fourier spectrum, certainly when the noise level is rather high.

3.4 The Fourier line spectrum

If the sampling of the band-limited function $x(t)$ occurs at only $N$ time points $t_k$, $k = 0, 1, \ldots, N-1$, separated by equal intervals $\Delta t$, so that one knows the sampled data $\{x_k\}$, the spectrum $X(f)$ may be estimated by the familiar discrete Fourier transform, which follows from equation (52),

$$\tilde{X}(f)=\Delta t\sum_{k=0}^{N-1} x_k\, e^{-2\pi i k f\Delta t}. \tag{77}$$

The data samples may be reproduced from the finite Fourier series of $N$ terms

$$\tilde{x}(t)=\sum_{j=0}^{N-1}\tilde{X}_j\, e^{2\pi i j t/(N\Delta t)},\qquad -\infty \le t \le \infty, \tag{78}$$

which is a periodic function with period equal to the time span $N\Delta t$ of the data samples. The Fourier coefficients required for $\tilde{x}(t)$ are

$$\tilde{X}_j=\Delta t\sum_{k=0}^{N-1} x_k\, e^{-2\pi i j k/N},\qquad j=0,1,\ldots,N-1, \tag{79}$$

so that $\tilde{X}_j = \tilde{X}(f_j)$ at the discrete set of $N$ equally spaced frequencies

$$f_j=\frac{j}{N\Delta t},\qquad j=0,1,\ldots,N-1. \tag{80}$$

Thus, the frequency resolution $\Delta f = 1/(N\Delta t)$ is determined by the time span of the data. For discrete, equally spaced data and frequencies, Parseval’s theorem states that

$$\Delta t\sum_{k=0}^{N-1}(x_k)^2=\Delta f\sum_{j=0}^{N-1}|\tilde{X}_j|^2. \tag{81}$$
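Equations (79) to (81) map directly onto a standard FFT call. The sketch below is only an illustration under stated assumptions: the data are random numbers, `n` and `dt` are hypothetical, and `np.fft.fft` is used because it follows the same sign convention as (79), without the factor $\Delta t$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dt = 32, 0.25                       # hypothetical record length and sampling interval
x = rng.standard_normal(n)             # arbitrary real test data

# Equation (79): X_j = dt * sum_k x_k exp(-2 pi i j k / N)
X = dt * np.fft.fft(x)
freqs = np.arange(n) / (n * dt)        # frequencies f_j of equation (80)
df = 1.0 / (n * dt)                    # frequency resolution

lhs = dt * np.sum(x ** 2)              # left-hand side of Parseval's relation (81)
rhs = df * np.sum(np.abs(X) ** 2)      # right-hand side of (81)

# One coefficient evaluated directly from the definition, for comparison
X1_direct = dt * np.sum(x * np.exp(-2j * np.pi * np.arange(n) / n))
```

The two sides of (81) should agree to rounding error, and for real data the coefficients satisfy $\tilde{X}_{N-j} = \tilde{X}_j^{*}$.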

The contribution $|\tilde{X}_j|^2$ is the power at the frequency $f_j$ and the plot of power versus frequency index $j$ is the Fourier line spectrum

$$P_j=|\tilde{X}_j|^2,\qquad j=0,1,\ldots,N-1, \tag{82}$$

also called the periodogram spectral estimate. Thus, the discrete spectrum $P_j$, based on a finite data set, is a distorted version of the continuous power spectrum $P(f)$ based on an infinite data set.

An intuitive way of looking at the meaning of the set of frequencies defined in equation (80) is that the fundamental frequency, $f_1 = 1/(N\Delta t)$, corresponds to a sine wave of period equal to the whole data interval $T = N\Delta t$. This is roughly the lowest frequency about which there is information in the data. On the other hand, the Nyquist frequency, $f_N$, is approximately the highest frequency about which there is information, because $\Delta t$ is the shortest time interval spanned. Both (79) and its associated inverse transform (78) are cyclic, respectively with periods $N$ and $N\Delta t$. Thus, by using (79), we have forced a periodic extension to both the discretized data and the discretized transform values, even though the original continuous function may not have been periodic.

Since the Fourier transform of equally spaced data can be evaluated very economically with the Fast Fourier Transform algorithm or FFT (Brigham, 1974) and since uniform sampling is frequently the case, estimation of the spectrum from the Fourier line spectrum (82) is very common practice when the measured process has deterministic components embedded in random noise. However, it is emphasized that the Fourier line spectrum is not the spectrum of $x(t)$, not even at the discrete frequencies $f_j$. Rather, the Fourier line spectrum represents the spectrum of the periodic function $\tilde{x}(t)$ in equation (78), which is

$$\mathcal{F}\{\tilde{x}(t)\}=\sum_{j=0}^{N-1}\tilde{X}_j\,\delta(f-f_j). \tag{83}$$

Unless $x(t)$, like $\tilde{x}(t)$, is a cyclic function with period equal to the span $N\Delta t$ of the data samples, and unless $x(t)$ is actually bandwidth limited, equation (83) does not give its true spectrum. An important consequence of uniform sampling is that the Fourier coefficient $\tilde{X}_j$ contains not only the contribution from the frequency $f_j$ but also from the frequencies $f_j \pm 2kf_N$, where $k$ is an integer. This aliasing of the spectrum occurs because the coefficients $\tilde{X}_j$ are periodic functions of frequency with period $1/\Delta t = 2f_N$ and are conjugate symmetric about the folding points $\pm kf_N$. Aliasing will take place unless the sampling interval can be made sufficiently small so that there is no energy in $\tilde{X}(f)$ beyond the Nyquist frequency $f_N$. In the absence of a priori information about the spectrum of $x(t)$ it will be difficult to know if this requirement has been adequately met.

We now consider how to represent the spectral properties of a discrete, unevenly sampled data set, ignoring the question of how to determine the continuous signal from which the data were obtained. To illustrate the modeling viewpoint of spectral estimation, the discrete periodogram estimate (82) of the continuous power spectral density $P(f)$ will be shown to be equivalent to a least squares fit of a harmonic model to the data, namely the discrete Fourier series (Bloomfield, 1976). The decomposition of a finite discrete signal into harmonics is most easily formulated in terms of the vector space of possibly complex-valued data functions. Suppose that $N$ samples $(x_0, x_1, \ldots, x_{N-1})$ of a continuous-time process $x(t)$, taken at the time instants $t_k$, $k = 0, 1, \ldots, N-1$, are modelled by a discrete sequence $\{\hat{x}_k\}$ composed of $M$ complex sinusoids of preassigned frequencies $f_j$, $j = 0, 1, \ldots, M-1$, i.e.,

$$\hat{x}_k=\sum_{j=0}^{M-1}\hat{X}_j\,e^{2\pi i f_j t_k},\qquad k=0,1,\ldots,N-1, \tag{84}$$

where the coefficients $\{\hat{X}_j\}$ have to be determined. Thus the signal $x(t)$ over the observing interval $[0, T]$ is represented with periodic functions, whether or not $x(t)$ is itself cyclic. The weights $\{\hat{X}_j\}$ are assumed to be complex-valued for generality.

Given the set of $M$ frequencies $\{f_j\}$, the spectral amplitudes $\{\hat{X}_j\}$ are determined by minimizing the total squared estimation error,

$$E=\sum_{k=0}^{N-1}|\hat{x}_k-x_k|^2=\sum_{k=0}^{N-1}\left|\sum_{j=0}^{M-1}\hat{X}_j\,e^{2\pi i f_j t_k}-x_k\right|^2. \tag{85}$$

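Setting the derivatives of $E$ with respect to $\hat{X}_\ell$, $\ell = 0, 1, \ldots, M-1$, equal to zero, we obtain the normal equations for the amplitudes $\hat{X}_j$:

$$\sum_{j=0}^{M-1}\left[\sum_{k=0}^{N-1} e^{2\pi i (f_j-f_\ell)t_k}\right]\hat{X}_j=\sum_{k=0}^{N-1} x_k\,e^{-2\pi i f_\ell t_k},\qquad \ell=0,1,\ldots,M-1. \tag{86}$$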

In principle, a spectral representation follows directly from computer inversion of the set of $M$ linear equations in (86). On an uneven domain, the normal matrix $(C_{j\ell})$ with elements

$$C_{j\ell}=\sum_{k=0}^{N-1} e^{2\pi i (f_j-f_\ell)t_k},\qquad j,\ell=0,1,\ldots,M-1, \tag{87}$$

is generally not diagonal. In fact, this is a poorly conditioned $M \times M$ matrix, for which the usual Gauss-Jordan matrix-inversion routine is often numerically unstable. With $M$ very large, not only is the matrix $(C_{j\ell})$ difficult to invert, but it requires a large computer memory just for storage. Swan (1982) describes two techniques for recovering the discrete spectral coefficients from realistic unevenly spaced data sets. For uniformly spaced time instants $t_k = k\Delta t$, $k = 0, 1, \ldots, N-1$, and harmonically related frequencies $f_j = j/(N\Delta t)$, $j = 0, 1, \ldots, N-1$ (i.e., $M = N$), it is well known that

$$C_{j\ell}=\sum_{k=0}^{N-1} e^{2\pi i (j-\ell)k/N}=N\delta_{j\ell} \tag{88}$$

(Bloomfield, 1976), where $\delta_{j\ell}$ is the Kronecker symbol: $\delta_{j\ell} = 1$ if $j = \ell$ and $\delta_{j\ell} = 0$ for $j \neq \ell$. This easily follows from equation (57) when substituting $f\Delta t$ by $(j - \ell)/N$. Hence, the spectral amplitudes $\hat{X}_j$ in (84) are given by the usual Fourier expansion for evenly sampled data,

$$\hat{X}_j=\frac{1}{N}\sum_{k=0}^{N-1}x_k\,e^{-2\pi i jk/N}, \tag{89}$$

for $j = 0, 1, \ldots, N-1$. Equation (88) is just the orthogonality relationship of the basis functions $\exp(2\pi i jk/N)$ for the DFT of regularly spaced data. Note that the matrix $(C_{j\ell})$ in equation (87), being a function of the non-uniform $t_k$, does not have an orthogonality property such as (88) but has non-zero off-diagonal elements. The power of the sinusoidal component at the preassigned frequency $f_j = j/(N\Delta t)$ is $P_j = |\hat{X}_j|^2$. Thus the discrete periodogram spectral estimate may be viewed as being obtained by a least squares fit of a harmonic set of complex sinusoids with frequencies $j/(N\Delta t)$, $j = 0, 1, \ldots, N-1$, to the data. The computational economy (Briggs and Henson, 1995) of the FFT algorithm has made this approach a popular one.

The harmonic model preassigns the frequencies and the number of sinusoids so that only estimation of the sinusoidal power is possible. Another noteworthy aspect of the harmonic model is the fact that noise is not accounted for in the model. Any noise present must also be modelled by the harmonic sinusoids. Thus, to decrease the fluctuations due to noise, one must average over a set of periodograms made from the data to obtain a representative estimate of the power spectral density. Note that an unrealistic tacit assumption of the conventional FFT spectral estimate concerning the measured process is that the latter is cyclic outside the measurement interval. Therefore the distorting impact of the implicit window function on the frequency axis must be eliminated. Fourier transformation of an observed signal, whose opposite edges do not match, results in high-frequency contamination of the spectrum in the form of high intensities concentrated at the ends of the data interval (Gibbs phenomenon). This occurs because the FFT assumes periodicity and unmatched edges cause step discontinuities.
Additionally, geophysical surveys often tend to have large gaps with no data, so that some type of infilling is required (Cordell and Grauch, 1982). The main statistical problem is that the periodogram is very noisy, even when the data are only slightly noisy (Scargle, 1982). Moreover, the noise does not diminish in amplitude with increasing sample size. The reason is that as more data are added, the number of available frequencies increases in proportion (see equation (80)), so the noise is not averaged out. What saves the periodogram in practice is that, as more data become available, even though the size of the noise remains large, the signal-to-noise ratio increases. The periodogram is very appropriate in problems involving the superposition of only a few simple periodic components.
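The equivalence between the periodogram and a least squares fit, equations (84) to (89), can be checked with a small numerical experiment. The sketch below is illustrative only and is not the author's implementation: the data are random, the sizes are hypothetical, and the normal equations (86) are solved with `np.linalg.solve` simply because the example is tiny.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt = 16, 1.0                                  # hypothetical number of samples and spacing
t_even = np.arange(n) * dt                       # uniform sampling instants
freqs = np.arange(n) / (n * dt)                  # harmonic frequencies, so M = N
x = rng.standard_normal(n)                       # arbitrary real test data

def design_matrix(t, f):
    """Matrix with elements exp(2 pi i f_j t_k): rows k, columns j, as in model (84)."""
    return np.exp(2j * np.pi * np.outer(t, f))

A = design_matrix(t_even, freqs)
C_even = A.conj().T @ A                          # element [l, j] equals C_jl of equation (87)
coef_ls = np.linalg.solve(C_even, A.conj().T @ x)    # solution of the normal equations (86)
coef_dft = np.fft.fft(x) / n                     # equation (89)

# Same frequencies, but randomly placed sampling instants
t_uneven = np.sort(rng.uniform(0.0, n * dt, size=n))
B = design_matrix(t_uneven, freqs)
C_uneven = B.conj().T @ B
off_diag = np.max(np.abs(C_uneven - np.diag(np.diag(C_uneven))))
```

For the uniform grid, `C_even` should equal $N$ times the identity matrix as in (88), and `coef_ls` should coincide with `coef_dft`. For the uneven grid the off-diagonal elements no longer vanish, which is the reason the simple formula (89) cannot be used there.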

3.5 Deconvolution of the dirty spectrum

Many of the problems of the Fourier spectrum estimation technique can be traced to the assumptions made about the data outside the measurement interval. The finite data sequence may be viewed as being obtained by windowing an infinite length sample sequence $\{x_k\}$, $-\infty \le k \le \infty$, with the box-car function $d_o(t)$, defined in equation (35), which is zero outside the observing interval $[0, T]$. The truncated version $x_o(t)$ of $x(t)$ is further windowed by the finite Dirac comb in equation (25). The use of only this finite data set implicitly assumes the unmeasured data to be zero outside the window, which is in general not the case. A smudged spectral estimate is a consequence of this windowing. Note that the use of the Fast Fourier Transform implicitly supposes a periodic extension of the data outside the interval $[0, T]$.

The multiplication $\hat{x}(t) = d_o(t)\,x(t)$ of the actual time series $x(t)$ by the data window $d_o(t)$ implies that the overall Fourier transform $\hat{X}(f)$ is the convolution of the desired transform $X(f)$ with the Fourier transform $D_o(f)$ of the window function, as in equation (37). If the true power of a signal is concentrated in a narrow bandwidth, this convolution operation will spread that power into adjacent frequency regions. This phenomenon, termed ‘spectral leakage’, is a consequence of the tacit windowing inherent in the computation of the Fourier line spectrum. Periodogram analysis leads to spectral estimates that are characterized by many ‘hills and valleys’. Hence, there are two forms of spectral leakage of the power in the periodogram. Leakage to nearby frequencies (by the sidelobes of the spectral window) is due to the finite total interval over which the data is sampled. Leakage to distant frequencies (resulting from aliasing) originates from the finite size of the interval between samples.
The introduction of the FFT algorithm is generally credited to Cooley and Tukey (1965) and renewed an interest in the periodogram approach to power spectral density estimation. Conventional FFT spectral estimation is based on a Fourier series model of the data, that is, the process is commonly assumed to be composed of a set of harmonically related sinusoids as in equation (84). The FFT approach to spectrum analysis is computationally efficient and produces reasonable results for a large class of signal processes. In spite of these advantages, there are several inherent performance limitations of the FFT technique. The most prominent restriction is that of limited frequency resolution, i.e., the ability to distinguish the spectral responses of several signals. The frequency resolution is roughly the reciprocal of the time interval $T$ over which sampled data are available. A second shortcoming is due to the implicit windowing of the data that occurs when processing with the FFT.

In general, the convolution of the real spectrum $X(f)$ with the spectral window $W(f)$ results in the smearing of a given feature in $X(f)$ over the ‘width’ of $W(f)$, that is, the range of frequencies, around $f = 0$, where $W(f)$ is appreciable. Windowing manifests itself as a ‘leakage’ in the spectral domain, that is, energy in the main lobe of a spectral response ‘leaks’ into the sidelobes, obscuring and distorting other spectral responses that are present. In fact, weak spectral signal contributions can be hidden by higher sidelobes from stronger spectral responses. These two performance limitations of the FFT approach are particularly troublesome when analyzing short data records. Also, many measured processes have slowly time-varying spectra that may be considered constant only over short record lengths.

In addition to the distorting effects of leakage on the spectral estimate, leakage has a detrimental impact on power estimation and detectability of sinusoidal components. Spectral sidelobes from adjacent frequency cells add in a constructive or destructive manner to the main lobe of a response in a bordering frequency cell of the spectrum, affecting the estimate of power in that cell. In extreme cases, the sidelobes from strong frequency components can mask the main lobe of weak frequency responses in neighbouring cells, as demonstrated in Figure 1. Sidelobes characteristic of the $(\sin \pi f)/(\pi f)$ diffraction function (the Fourier transform of a rectangular time-domain window) are evident in this illustration. Leakage problems are difficulties with the use of the unsmoothed periodogram, not with the use of the periodogram itself, and not with the extension to uneven data spacing. Data windowing is also the fundamental factor that determines the frequency resolution of the periodogram.
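The size of the sidelobes that cause this leakage can be estimated directly from the spectral window of a uniformly sampled record. The following sketch is only an illustration: `n`, `dt` and the density of the frequency grid are hypothetical choices, and the window is normalized to unity at the origin for convenience.

```python
import numpy as np

n, dt = 64, 1.0                                   # hypothetical uniform sampling
t = np.arange(n) * dt
resolution = 1.0 / (n * dt)                       # first zero of the main lobe, about 1/T

# Fine frequency grid from the first zero of the main lobe up to the Nyquist frequency
f = np.linspace(resolution, 0.5 / dt, 20001)
W = np.exp(-2j * np.pi * np.outer(f, t)).mean(axis=1)   # window normalized so that W(0) = 1

peak_sidelobe = np.max(np.abs(W))                 # largest sidelobe relative to the main peak
sidelobe_db = 20.0 * np.log10(peak_sidelobe)      # the same quantity in decibels
```

For the rectangular window the largest sidelobe comes out at a little under 22% of the main peak in amplitude, i.e. roughly 13 dB down, so a component weaker than this can be hidden next to a strong one.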
The convolution of the actual signal transform, $X(f)$, with the window transform, $W(f)$, implies that the most narrow spectral response of the resulting transform $\hat{X}(f) = W(f) \otimes X(f)$ is limited to that of the main-lobe width of the window function, independent of the data. For the rectangular data window $d_o(t)$, the main-lobe width between 3-dB levels (and therefore, the resolution) of the associated $(\sin \pi f)/(\pi f)$ transform is approximately the inverse of the observation time of $N\Delta t$ time units. Leakage effects due to data windowing can be reduced by skilful selection of tapered data windows or windows with non-uniform weighting (Harris, 1978), but always at the expense of reduced frequency resolution.

The multiplication of the data by a function which goes smoothly to zero at the ends of the observing interval is equivalent to convolving the spectrum with the corresponding spectral window function. Such convolution reduces the variance of the spectral estimate because it averages (smooths) the spectrum. At the same time, spectral leakage can be controlled, since the window function can be chosen (‘tailored’, or even more colourfully, ‘carpentered’) so that the amplitudes of the disturbing sidelobes are reduced. This is known as ‘window carpentry’. Harris (1978) presents graphs of the sidelobe structure of 45 data windows. Nevertheless, the price paid for a reduction in the sidelobes of the spectral window is always a broadening in the main lobe of the window transform, which in turn means a decrease in the resolution of the spectral estimate.

The frequency resolution $\Delta f$ which results from a discrete data set is the width of the main peak of $W(f)$ at $f = 0$. As long as the data sampling is not too non-uniform, the width of $W(f)$ will be similar to that of $D_o(f)$ in equation (36). So it is roughly the reciprocal of the time interval of length $T$ over which sampled data are available,

$$\Delta f \approx \frac{1}{T}. \tag{90}$$

A consequence of a finite data set is that there will be peculiarities in $W(f)$ which are due to the detailed distribution of the sample points $\{t_k\}$. Convolution of $X(f)$ with $W(f)$ to produce the dirty spectrum $\hat{X}(f)$ will lead to ‘false’ features in $\hat{X}(f)$, i.e., characteristics due to the specific structure of $W(f)$, which will confuse the identification of the ‘true’ frequencies in $\hat{X}(f)$, i.e., those due to the real structure of $X(f)$. In the limit as $T \to \infty$, $W(f)$ behaves like the Dirac kernel $\delta(f)$, showing that $\hat{X}(f) \approx X(f)$ for large $T$.

There is a common misconception that zero-padding the data sequence before transforming will improve the resolution of the computed spectrum. Transforming a data set, extended with zeros, only serves to interpolate additional Fourier transform values within the frequency interval $-1/(2\Delta t) \le f \le 1/(2\Delta t)$ between those that would be obtained with a non-zero-padded transform. The additional values of the periodogram, computed by an FFT applied to the zero-padded data set, will fill in the shape of the continuous-frequency spectrum as defined by expression (77). In no case of zero-padding, however, is there an improvement in the fundamental frequency resolution $\Delta f$, given in equation (90). Zero-padding is useful for smoothing the appearance of the periodogram estimate via interpolation but adds no information.

Mathematically, an $M$-point discrete Fourier transform of an $M$-point sequence $(x_0, x_1, \ldots, x_{M-1})$ is, in general, obtained from (79) as

$$\tilde{X}_j=\Delta t\sum_{k=0}^{M-1}x_k\,e^{-2\pi i jk/M} \tag{91}$$

for $j = 0, 1, \ldots, M-1$. If the original data set of length $N$ has been zero-padded with $M - N$ zeros, i.e., $x_k = 0$ for $k = N, N+1, \ldots, M-1$, then (91) becomes

$$\tilde{X}_j=\Delta t\sum_{k=0}^{N-1}x_k\,e^{-2\pi i (jN/M)k/N}, \tag{92}$$

which is the same as the $N$-point transform (79), but evaluated over the interval $-1/(2\Delta t) \le f \le 1/(2\Delta t)$ at $M$ ($> N$) frequencies $f_j = j/(M\Delta t)$, $j = 0, 1, \ldots, M-1$. By eliminating the operations on zeros introduced by zero-padding, or pruning as it is called, a more efficient FFT algorithm is possible (Markel, 1971).

According to equation (27), the convolution of the true Fourier spectrum $X(f)$ with the spectral window $W(f)$ produces the dirty spectrum $\hat{X}(f)$. A different approach to the leakage problem is to try to remove it from the spectrum. Our goal is to estimate $X(f)$ from our information on $\hat{X}(f)$ and $W(f)$, in other words, to ‘undo’ the damage inflicted by our incomplete knowledge of the function $x(t)$ as far as possible, given the finite frequency resolution imposed by a sampling of $x(t)$ at a finite number of time instants. Note that the formal solution of equation (27) using the Fourier convolution theorem would lead us to write

$$X(f)=\mathcal{F}\{x(t)\}=\mathcal{F}\{\hat{x}(t)/w(t)\}=\mathcal{F}\{\hat{x}(t)\}\otimes\mathcal{F}\{1/w(t)\}. \tag{93}$$

However, because we know $x(t)$ only at discrete times, the weighting function $w(t)$ is only different from zero at the sample points. So, equation (93) in itself makes no sense. In fact, the solution for $X(f)$ by the deconvolution of the dirty spectrum $\hat{X}(f)$ is an example of an inverse problem known not to have a unique solution (Parker, 1994). No finite set of data is enough to specify unambiguously the generating function $x(t)$.

Finally, even equally spaced sampling can lead to Fourier line spectra which are hard to interpret due to the confusion of characteristics in $X(f)$ with those arising from the features in $W(f)$. Often large, roughly equally spaced gaps are introduced in the data sequence. Even though the existence of missing data will result in a very complicated spectral window, the temptation is just to plug ahead with the computation of $\hat{X}(f)$ from equation (28), and to hope that this gives a reasonable approximation to the true spectrum $X(f)$. However, this may be a very misleading procedure because the convolution in equation (27) can introduce spurious features into $\hat{X}(f)$. It would be useful to ‘clean up’ the dirty spectrum by removing these unauthentic peculiarities introduced implicitly by the spectral window. In the next section we describe an approximate, non-linear iterative solution by which this can be accomplished. The CLEAN technique to dispose of the puzzling effects of the sidelobes of $W(f)$, introduced by the specific distribution of the time points $\{t_k\}$ at which $x(t)$ is sampled, is useful for equally spaced as well as unequally spaced data.

In summary, the customary periodogram approach to spectral estimation has the following advantages: (1) computationally efficient if the FFT is used, and (2) power spectral density estimate directly proportional to the power of sinusoidal processes.
The disadvantages of this technique are: (1) amplification or suppression of weak signal main-lobe responses by strong signal sidelobes, (2) frequency resolution limited by the available data record duration, independent of the characteristics of the data or their signal-to-noise ratio, (3) introduction of distortion in the computed spectrum due to sidelobe leakage, and (4) need for some sort of pseudo ensemble averaging to obtain statistically consistent periodogram spectra for stochastic processes.

As geophysical data are usually collected with irregular sampling, they must first be interpolated onto a regular lattice prior to application of the FFT. This raises a number of issues including the choice of the interpolation method, infilling to ensure that there is a value at every node, and matching of opposite edges of the data interval to avoid Gibbs effects (Cordell and Grauch, 1982). The interpolation process introduces unknown problems in interpreting the computed spectrum, but makes available all of the numerical techniques for the analysis of regularly spaced data. Nonetheless, the high speed of the FFT and the ease of discrete Fourier domain operations can mask a fundamental point, namely that the FFT is only a disturbed approximation to the true continuous Fourier transform.
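Returning to the remark on zero-padding around equations (91) and (92), the claim that padding only interpolates the continuous-frequency transform (77) can be verified numerically. This is a minimal sketch under stated assumptions: the data are random, the lengths `n` and `m` are hypothetical, and the helper `continuous_dft` is introduced here purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, dt = 16, 64, 1.0                  # hypothetical: N data points padded to M = 4N
x = rng.standard_normal(n)

X_n = dt * np.fft.fft(x)                # N-point transform, equation (79)
X_m = dt * np.fft.fft(x, n=m)           # zero-padded M-point transform, equation (91)

def continuous_dft(x, f, dt):
    """Equation (77) evaluated at an arbitrary frequency f."""
    k = np.arange(len(x))
    return dt * np.sum(x * np.exp(-2j * np.pi * k * f * dt))

f_m = np.arange(m) / (m * dt)           # frequencies j/(M dt) of the padded transform
X_direct = np.array([continuous_dft(x, f, dt) for f in f_m])
```

The padded transform `X_m` should coincide with `X_direct`, and every fourth value of `X_m` should reproduce the unpadded transform `X_n`. The extra points lie on the same underlying curve, so the width of the main lobe, and hence the resolution (90), is unchanged.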

4. The clean spectrum

4.1 The spectrum of a single harmonic component

In practice, the dirty spectrum $\hat{X}(f)$ and window function $W(f)$ are evaluated on finite arrays of discrete points. From a computational point of view, the $k$th data point is now assigned a weight $w_k = 1/N$ for all $k$, so that the dirty spectrum and the spectral window in equations (28) and (29) are rewritten as

$$\hat{X}(f)=\frac{1}{N}\sum_{k=0}^{N-1}x_k\,e^{-2\pi i f t_k} \tag{94}$$

and

$$W(f)=\frac{1}{N}\sum_{k=0}^{N-1}e^{-2\pi i f t_k}. \tag{95}$$

In general, we assume that the dirty spectrum $\hat{X}(f)$ is determined at $m$ positive frequency points, i.e., at $(2m + 1)$ equally spaced frequencies in the interval $[-f_{\max}, f_{\max}]$,

$$f_j=\frac{j}{m}\,f_{\max},\qquad j=-m,\ldots,m, \tag{96}$$

where $f_{\max}$ is the maximum frequency in the spectrum. Here, zero frequency or direct component (DC) is at $j = 0$, and negative frequencies are labelled with $j < 0$. The DC component of the dirty spectrum is usually removed by subtracting the arithmetic mean from the data before Fourier transformation. In order to CLEAN the dirty spectrum $\hat{X}(f)$, the spectral window

$W(f)$ must be determined on $(4m + 1)$ points in the frequency interval $[-2f_{\max}, 2f_{\max}]$. This is evident from equation (27) when the integration is restricted to the Nyquist band in the case of uniform sampling:

$$\hat{X}(f)=\int_{-f_N}^{f_N}W(f-f')\,X(f')\,df',\qquad -f_N\le f\le f_N. \tag{97}$$
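Equations (94) to (96) translate almost literally into code for arbitrarily spaced sampling times. The sketch below is illustrative only: the function names, the random sampling times, the test signal and the values of `f_max` and `m` are all hypothetical and are not taken from the original report.

```python
import numpy as np

def dirty_spectrum(t, x, f_max, m):
    """Equation (94) on the 2m+1 frequencies f_j = j f_max / m, j = -m..m (equation (96))."""
    f = np.arange(-m, m + 1) * f_max / m
    return f, np.exp(-2j * np.pi * np.outer(f, t)) @ x / len(t)

def spectral_window(t, f_max, m):
    """Equation (95) on the 4m+1 frequencies j f_max / m, j = -2m..2m."""
    f = np.arange(-2 * m, 2 * m + 1) * f_max / m
    return f, np.exp(-2j * np.pi * np.outer(f, t)).mean(axis=1)

rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0.0, 100.0, size=60))    # hypothetical uneven sampling times
x = np.cos(2 * np.pi * 0.2 * t)                  # hypothetical test signal
x = x - x.mean()                                 # remove the DC component first

f_max, m = 1.0, 400                              # hypothetical grid parameters
f, D = dirty_spectrum(t, x, f_max, m)
fw, W = spectral_window(t, f_max, m)
```

The window is evaluated on twice the frequency range of the dirty spectrum because differences and sums of two grid frequencies, each in $[-f_{\max}, f_{\max}]$, fall in $[-2f_{\max}, 2f_{\max}]$. For real data both arrays are conjugate symmetric about zero frequency, and the window equals 1 at its central element.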

The clean spectrum, $S(f)$, will be computed at the same frequency points as $\hat{X}(f)$. In fact, the symmetry of $\hat{X}(f)$, $W(f)$, and $S(f)$ enables us to calculate only the positive-frequency parts. For simplicity we take the maximum frequency to be

$$f_{\max}=\frac{1}{2\Delta t_{\min}}, \tag{98}$$

where $\Delta t_{\min}$ is the smallest interval between adjacent time samples. The number of positive-frequency points, $m$, is related to the desired number of points, $n_B$, in the frequency interval corresponding to the main lobe of the spectral window $W(f)$, and to the frequency resolution, $\Delta f \approx 1/T$, by $n_B = \Delta f/(f_{\max}/m)$, so

$$m=n_B\,\frac{f_{\max}}{\Delta f}\approx n_B\,\frac{T}{2\Delta t_{\min}}, \tag{99}$$

where $T$ is the total length of the data span. In the case of $N$ equally spaced data this reduces to $m \approx n_B(N/2)$. In most cases we use $n_B = 4$ and have $m \approx 2N$.

Detection of a periodic signal hidden in noise is frequently a goal in time series analysis. The deconvolution of equation (97) sets out to obtain an approximation of the true spectrum $X(f)$ of the observed time function. We consider first a single strictly periodic constituent $x(t) = \hat{A}\cos(2\pi\hat{f}t + \hat{\varphi})$ of frequency $\hat{f}$, constant harmonic amplitude $\hat{A}$, and phase $\hat{\varphi}$. The exact spectrum of $x(t)$, given in equation (47), is a complex function of frequency and consists of a pair of spectral maxima at the frequencies $\pm\hat{f}$ with equal spectral amplitudes of $\hat{A}/2$. The phases of the spectral components at the positive and negative frequencies $\pm\hat{f}$ are $\pm\hat{\varphi}$, respectively. The sampled data lead to a dirty spectrum consisting of two terms,

$$\hat{X}(f)=\hat{a}\,W(f-\hat{f})+\hat{a}^{*}\,W(f+\hat{f}),\qquad\text{with } \hat{a}=\frac{\hat{A}}{2}\,e^{i\hat{\varphi}}. \tag{100}$$

The normalized spectral window $W(f)$ in (95) consists of a central peak at $f = 0$, plus secondary peaks which are displaced from the central peak. Although the central peak is always the largest feature in the spectral window, the largest sidelobes can be a substantial fraction of the central peak (22% in Figure 1). Therefore, the dirty spectrum contains both the true positive and negative frequency peaks at $\pm\hat{f}$ and false peaks due to the sidelobes of these primary peaks.

The application of CLEAN to mitigate the inherent limitations of the FFT approach pursues the object of subtracting functions of the form of the spectral window from the dirty spectrum. In the dirty spectrum, a single harmonic component at frequency $\hat{f}$ produces two maxima, at $\pm\hat{f}$. Because each spectral peak is contaminated by the sidelobes of the other, the amplitude and frequency of a true component cannot be determined merely by using the amplitude and frequency of the maxima on the dirty spectrum. Nevertheless, we can take the contamination into account by noting that in the case of a spectrum containing a single harmonic element, the dirty spectrum takes the value

$$\hat{X}(\hat{f})=\hat{a}\,W(0)+\hat{a}^{*}\,W(2\hat{f}) \tag{101}$$

at the frequency $\hat{f}$ of that component. Using the symmetry relationships of $\hat{X}(f)$ and $W(f)$ and the fact that the normalized window function in equation (95) is unity at the origin, i.e.,

$W(0) = 1$, we can find the spectral amplitude and phase of that component in terms of its frequency $\hat{f}$:

$$\hat{a}=\frac{\hat{X}(\hat{f})-\hat{X}^{*}(\hat{f})\,W(2\hat{f})}{1-|W(2\hat{f})|^{2}}. \tag{102}$$

Because of the contamination of the positive and negative spectral peaks by the sidelobes of the other, it is in general not possible to determine the precise location of the frequency of even a single harmonic component from a dirty spectrum. Given the frequency $f$ of a clean component estimated from the dirty spectrum $\hat{X}(f)$, its complex amplitude is estimated from a function defined analogous to equation (102),

$$a(f,\hat{X})=\frac{\hat{X}(f)-\hat{X}^{*}(f)\,W(2f)}{1-|W(2f)|^{2}}, \tag{103}$$

as $\hat{a} \approx a(\hat{f}, \hat{X})$. If CLEAN is performed judiciously, the small errors in the frequency, amplitude, and phase of the clean components selected in this way will be corrected in successive iterations. The effects of the sidelobes of the spectral window will be removed sequentially by subtracting only a fraction of the contribution of the component at frequency $f$ in each iteration. Nevertheless, if multiple harmonic waves are present in the observed function, it is possible that the dirty spectrum may take its maximum value at a frequency away from any real constituent. In this case, CLEAN will try to subtract a non-existing component, and it may fail. Schwarz (1978) develops criteria to decide under which conditions CLEAN works.
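The amplitude estimate of equations (102) and (103) can be tried out on a noise-free cosine sampled at irregular times. The sketch below is an illustration only: the helper names, the random sampling times and the signal parameters are hypothetical, and the frequency is assumed to be known exactly, which is the idealized situation described by equation (102).

```python
import numpy as np

def window_at(t, f):
    """Normalized spectral window, equation (95), at a single frequency f."""
    return np.mean(np.exp(-2j * np.pi * f * t))

def dirty_at(t, x, f):
    """Dirty spectrum, equation (94), at a single frequency f."""
    return np.mean(x * np.exp(-2j * np.pi * f * t))

def component_amplitude(t, x, f):
    """Complex amplitude estimate a(f, X) of equation (103)."""
    d = dirty_at(t, x, f)
    w2 = window_at(t, 2.0 * f)
    return (d - np.conj(d) * w2) / (1.0 - np.abs(w2) ** 2)

rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0.0, 50.0, size=40))     # hypothetical uneven sampling times
amp, f_hat, phase = 2.0, 0.31, 0.7               # hypothetical true parameters
x = amp * np.cos(2 * np.pi * f_hat * t + phase)

a_est = component_amplitude(t, x, f_hat)         # corrected estimate, equation (102)
naive = dirty_at(t, x, f_hat)                    # raw peak value, contaminated by W(2 f_hat)
```

In this noise-free case `a_est` should reproduce $(\hat{A}/2)\,e^{i\hat{\varphi}}$ to rounding error, whereas `naive` differs from it by the term $\hat{a}^{*}W(2\hat{f})$ of equation (101).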

4.2 Computational details of the CLEAN algorithm

To detect all the spectral components in a dirty spectrum $\hat{X}(f)$, remove the sidelobes of the spectral peaks originating from the frequency structure of the spectral window $W(f)$, and construct a ‘clean spectrum’, the methodology for finding the spectral component of a single cosinusoid in equation (102) is now iterated (Roberts et al., 1987). With the notation $R^{0}(f) = \hat{X}(f)$ and using subscripts to denote array elements and superscripts to identify the order of iteration, we can implement the CLEAN technique in the following steps.

(1) Starting the iteration with $n = 1$, that is, beginning with the dirty spectrum $\hat{X}(f)$, we find the frequency $f_{\mathrm{peak}}$ of the $n$th clean component from the maximum of the modulus $|R^{n-1}(f)|$ of the previously obtained ‘residual spectrum’ $R^{n-1}(f)$. Here we make the plausible assumption that this spectral peak is mainly due to a real signal and that only a minor part comes from the window function $W(f)$. Denote the positive-frequency bin which corresponds to $f_{\mathrm{peak}}$ by $i_n$, so that $f_{\mathrm{peak}} = f_{i_n}$. The spectral amplitude of the component at the frequency $f_{i_n}$ is estimated from equation (103):

$$a(f_{i_n},R^{n-1})=\frac{R^{n-1}(f_{i_n})-\left[R^{n-1}(f_{i_n})\right]^{*}W(2f_{i_n})}{1-|W(2f_{i_n})|^{2}}. \tag{104}$$

Call $a(f_{i_n}, R^{n-1}) = a_{i_n}$.

(2) On the $n$th iteration, the contribution of this component, including the effect of the unwanted secondary responses introduced by the spectral window $W(f)$, is removed by subtracting a response of the form of equation (100) from the residual spectrum $R^{n-1}(f)$ for all frequencies $f$. Only a fraction $g$ of the contribution $a_{i_n}$ of the component at the frequency $f_{i_n}$ to the dirty spectrum is subtracted from $R^{n-1}(f)$, forming the residual spectrum $R^{n}(f)$,

$$R^{n}_{j} = R^{n-1}_{j} - g\left(a_{i_n} W_{j-i_n} + a^{*}_{i_n} W_{j+i_n}\right), \qquad j = -m, \ldots, m, \tag{105}$$

where $R^{n}_{j}$, $R^{n-1}_{j}$ and $W_{j}$ simply denote the values of $R^{n}(f)$, $R^{n-1}(f)$ and $W(f)$ at the frequency $f_j$, respectively. This clean component $g\,a_{i_n}$ is added to the $(2m + 1)$-element clean components array $\{C^{n}_{j}\}$, $j = -m, \ldots, m$, at both the positive and negative frequencies. This affects the elements at the frequency bins $\pm i_n$, so with $C^{0}_{j} = 0$,

$$C^{n}_{i_n} = C^{n-1}_{i_n} + g\,a_{i_n} \tag{106}$$

and

$$C^{n}_{-i_n} = C^{n-1}_{-i_n} + g\,a^{*}_{i_n}. \tag{107}$$

Note that a zero-frequency (DC) clean component ($i_n = 0$) is handled correctly by this prescription. In order to prevent small errors from destabilizing the CLEAN procedure, it is better not to remove the full amplitude of the highest peak of the previous residual spectrum: only a fraction $g$ of the response due to this component is subtracted from $R^{n-1}(f)$ at one time. Using the ‘clean gain’ $g$, the amplitude of the component to be subtracted is $g\,a_{i_n}$, thereby removing a great deal of the unwanted secondary responses of the data window. Removal of nearly all of the response will occur as the process is iterated. This restriction of the size of each step in the CLEAN procedure is necessary because, whenever there is more than one peak in the dirty spectrum or there is noise, each subtraction will be slightly in error. These errors are due to contributions from the sidelobes of components which have not yet been CLEANed away, and to the contributions from noise. Overcorrection at an early stage is later adjusted automatically, because the method allows corrections of negative amplitude. Typical values of $g$ lie between 0.1 and 1. Values at the bottom of this range require more iterations, but should provide more stability. The use of a clean gain less than 1 permits CLEAN to ‘correct’ these small errors, often by adding a ‘negative’ clean component to compensate for a previous oversubtraction. Gains too close to 1 can produce errors which cannot be recovered by subsequent iterations, while very small gains simply take too long. Low values of $g$ may be advisable in the analysis of spectra containing significant power over a band of frequencies considerably larger than the frequency resolution. Experience has shown that a gain of a few tenths is usually optimal.
(3) After completion of the $n$th step in this iterative process we have a set of $n$ clean components, $\{C^{n}_{j}\}$, and a residual spectrum, $R^{n}(f)$, which is the result of removing these clean components and the effects of the sidelobes of $W(f)$ from the dirty spectrum. Taken together, $\{C^{n}_{j}\}$ and $R^{n}(f)$ still contain all of the information present in the signal. The procedure is repeated on the remaining residual spectrum $R^{n}(f)$ and by successive iterations one builds up a set of clean components. At the end, the resulting clean components will contain the amplitudes of the real characteristics of the Fourier spectrum $X(f)$, but none of the spurious features introduced into the dirty spectrum $\hat{X}(f)$ by the sidelobes of the window function $W(f)$.

(4) The iteration must be halted by some ‘stopping condition’. If the stopping condition is not met, the iteration is continued at step (1) until nothing is left but ‘noise’. Otherwise, exit at step (5). To conclude that the procedure has converged, we examine the residual spectrum and the accumulated clean components. One or more of the following tests might be included: the maximum of $|R^{n}(f)|$ is less than some predetermined ‘noise’ value, the sum of the moduli of the clean components is not increasing significantly, or the number of iterations has reached some prescribed limit. As a rule of thumb, it is recommended to continue the search for clean components until (the gain) × (the number of clean components) is comparable to the number of frequencies for which the dirty spectrum has been computed.

(5) When convergence is reached (after $n$ iterations, say), we have obtained a set of clean components $\{C^{n}_{j}\}$, $j = -m, \ldots, m$, and the residual spectrum $R^{n}_{j}$ at the frequencies $f_j$. If no spectral component was found at the frequency $f_j$, say, we have set $C^{n}_{j} = 0$. After $n$ iterations at a gain $g$ ($< 1$), the peaks due to the sidelobes of a spectral component are expected to be reduced to a fraction $\sim (1-g)^{n}$ of their original height, since each subtraction leaves a fraction $1 - g$ of the response it removes. At the end, all of the false peaks visible in the initial periodogram are expected to be removed by the CLEAN procedure, leaving only the true positive frequency peaks.

(6) After all of the spectral clean components have been extracted, the hypothetical ‘clean beam’ $B(f)$ is constructed by a least-squares fitting of the weights $(B_0, B_1, \ldots, B_{p-1})$ to the first $p$ values $(W_0, W_1, \ldots, W_{p-1})$ of the window function $W(f)$. With $N = 2m$ and $n_B = 4$ in equation (99) we can take $p = 6$. The clean beam $B(f)$ is a window function constructed to have the same frequency resolution as the dirty beam $W(f)$, but free from the sidelobes of $W(f)$. For instance, $B(f)$ may have the form of a Gaussian function $\exp(-\beta f^{2})$, with the coefficient $\beta$ chosen by a least-squares fit of $-\beta f_j^{2}$ to $\ln W_j$ over the main lobe of $W(f)$:

$$\beta = -\left(\sum_{j=0}^{p-1} f_j^{2} \ln W_j\right) \Big/ \left(\sum_{j=0}^{p-1} f_j^{4}\right). \tag{108}$$

Alternatively, the non-negative ordinates of the main lobe of $W(f)$ could be used for the weights of the clean beam $B(f)$. The clean beam $(B_0, B_1, \ldots, B_{p-1})$ may be taken to be real and is normalized so that $B_0 = 1$. Note that the clean components $\{C^{n}_{j}\}$ inherently involve a clean beam $(1, 0, \ldots, 0)$. Finally, the ‘clean spectrum’ $S(f)$ is formed by convolving the clean-components array with the discrete filter consisting of the clean beam weights and adding the final residual spectrum:

$$S_j = C^{n}_{j} + \sum_{k=1}^{p-1} B_k \left(C^{n}_{j-k} + C^{n}_{j+k}\right) + R^{n}_{j}, \qquad j = 0, 1, \ldots, m. \tag{109}$$

The effect of convolving the clean components with a clean beam is to down-weight the high-frequency terms of the estimated spectrum, which are the most uncertain ones. According to equation (47) the clean amplitude spectrum is obtained by multiplying all of the $S_j$ by a factor 2. Even though it may seem superfluous, step (6) is important because the convolution of the clean components with the clean beam makes the frequency resolution of the clean spectrum $\sim 1/T$, rather than the artificial resolution of the clean components spectrum, which is the frequency spacing of the discrete representation used in the calculation. To preserve the noise level of the spectrum, which is partly of numerical origin, and to include any spectral feature not well represented by the spectral clean components, the residual spectrum remaining at the end of the CLEAN iteration is added to form the clean spectrum. It is possible to ‘super-resolve’ $S(f)$ by choosing $B(f)$ to be narrower than $W(f)$, or, in the extreme, by using the clean components themselves as a direct model. However, this option is not recommended, for it depends on the algorithm’s extrapolation of the data to the region outside that measured, which cannot be expected to be very reliable.
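Steps (1) to (6) can be sketched in a few lines of Python. This is only an illustrative reading of equations (104) to (109), not the implementation used for the figures. Several details are assumptions of the sketch: the transform convention ($1/N$ and $e^{-2\pi i f t}$), the relative ‘noise’ threshold used as the stopping rule, the fallback $a = R/2$ when the denominator of equation (104) vanishes (as it does at zero frequency), the use of $\ln|W_j|$ with the sign that makes $\beta$ positive in the Gaussian fit, and the value $p = 4$ chosen for the demonstration grid.

```python
import cmath
import math

def dft(times, values, freqs):
    """(1/N) * sum_k x_k * exp(-2*pi*i*f*t_k) for each f (assumed convention)."""
    n = len(times)
    return [sum(x * cmath.exp(-2j * math.pi * f * t) for t, x in zip(times, values)) / n
            for f in freqs]

def clean(times, values, df, m, gain=0.5, max_iter=500, rel_tol=1e-6):
    """Steps (1)-(5). Returns clean components C and residual R on bins -m..m
    (bin j stored at index j + m) and the window W on bins -2m..2m (index j + 2m)."""
    R = dft(times, values, [j * df for j in range(-m, m + 1)])    # R^0 = dirty spectrum
    W = dft(times, [1.0] * len(times), [j * df for j in range(-2 * m, 2 * m + 1)])
    C = [0j] * (2 * m + 1)
    floor = rel_tol * max(abs(r) for r in R)       # hypothetical 'noise' level
    for _ in range(max_iter):
        # Step (1): highest peak among the non-negative frequency bins
        i = max(range(m + 1), key=lambda j: abs(R[j + m]))
        peak = R[i + m]
        if abs(peak) <= floor:                      # step (4): stopping condition
            break
        w2 = W[2 * i + 2 * m]
        denom = 1.0 - abs(w2) ** 2
        # Equation (104); a = peak/2 when the denominator vanishes is an assumption
        a = (peak - peak.conjugate() * w2) / denom if denom > 1e-12 else 0.5 * peak
        # Step (2): equation (105), subtract a fraction `gain` of the response
        for j in range(-m, m + 1):
            R[j + m] -= gain * (a * W[j - i + 2 * m] + a.conjugate() * W[j + i + 2 * m])
        C[m + i] += gain * a                        # equation (106)
        C[m - i] += gain * a.conjugate()            # equation (107)
    return C, R, W

def clean_spectrum(C, R, W, df, m, p):
    """Step (6): Gaussian clean beam fitted to |W_0..W_{p-1}|, then equation (109)."""
    fs = [k * df for k in range(p)]
    beta = -sum(f * f * math.log(abs(W[k + 2 * m])) for k, f in enumerate(fs)) \
        / sum(f ** 4 for f in fs)
    B = [math.exp(-beta * f * f) for f in fs]
    comp = lambda j: C[j + m] if -m <= j <= m else 0j
    return [comp(j) + sum(B[k] * (comp(j - k) + comp(j + k)) for k in range(1, p)) + R[j + m]
            for j in range(m + 1)]

# Demonstration: one cosinusoid sampled in 'on' blocks of 10 points and 'off' blocks of 10
times = [float(t) for t in range(100) if t % 20 < 10]
A, f0, phi = 3.0, 0.1, 0.6
values = [A * math.cos(2.0 * math.pi * f0 * t + phi) for t in times]
df, m = 0.0025, 200                                  # grid up to the Nyquist frequency 0.5
dirty = dft(times, values, [j * df for j in range(m + 1)])
C, R, W = clean(times, values, df, m)
S = clean_spectrum(C, R, W, df, m, p=4)
i0 = round(f0 / df)
print("clean amplitude at f0    :", 2.0 * abs(S[i0]))
print("dirty / clean at sidelobe:", abs(dirty[20]), abs(S[20]))
```

In this noise-free example the regular gaps produce a strong false peak in the dirty spectrum at $f_0 - 0.05$ (bin 20), while the clean spectrum at that bin should be close to zero and the amplitude at $f_0$ close to `A`. The pure-Python loops are chosen for self-containment; for long series an array library would be the natural choice.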

4.3 The frequency range and resolution

Accurate determination of the clean components and the removal of their sidelobes from the dirty spectrum of a set of data requires that the window function $W(f)$ satisfy two simple criteria: (1) in order to make it possible to remove a component anywhere in the dirty spectrum, which is computed in the Nyquist band $-f_N \le f \le f_N$, $W(f)$ must be known over $-2f_N \le f \le 2f_N$, and (2) in order to locate each clean component precisely, the central peak of the beam must be well defined. This requires that in the discrete realization of $W(f)$ and $\hat{X}(f)$ there be several frequency points spanning the fundamental frequency resolution $\Delta f$. The ability to discriminate between neighbouring peaks in a spectrum is a function of the intrinsic resolution $\Delta f \approx 1/T$ and the signal-to-noise ratio of the spectrum. Although CLEAN does not actually enhance the resolution of a spectrum, it can make it a great deal simpler to separate close components, since it adequately removes the confusion created by the sidelobes of important spectral peaks.

A word has to be said about the case of non-uniform sampling. The maximum frequency for which spectral information may be extracted from a given data set is exactly determined in the case of equally spaced data, but for randomly distributed time points $\{t_k\}$ the sampling theorem can only be used as a guiding principle. If the data samples are otherwise equally spaced in time but with missing points, the theorem tells us that the data completely determine a time function whose Fourier transform is zero for $|f| > 1/(2\Delta t_{\max})$, where $\Delta t_{\max}$ is the largest completely sampled data spacing. However, there are smaller spacings between irregularly sampled observations, and these certainly carry information about the energy at frequencies greater than $1/(2\Delta t_{\max})$. Under some circumstances, samples made at random times over a given interval can be more advantageous than the same number of samples made at equally spaced times over the same interval (Swan, 1982).
In addition to superior rejection of aliasing, a wider range of frequencies can be sampled, and the window function resulting from unequal sampling will usually have smaller secondary maxima than will the window function for regular sampling. Thus it may be beneficial to work with unequally spaced data. Some information is available in the spectrum about frequencies as high as $1/(2\Delta t_{\min})$, where $\Delta t_{\min}$ is the smallest spacing between data points. Furthermore, if the time points $\{t_k\}$ are more or less randomly distributed, so that a wide range of spacings is present and there is little redundancy in the spacing between various points, it is expected that significant details will be present at frequencies greater than $1/(2\Delta t_{\min})$. Nevertheless, it seems safe to restrict the analysis to frequencies obeying $|f| \le f_{\max} = 1/(2\Delta t_{\min})$. The minimum frequency is roughly that for which one cycle takes place during the time span, $f_{\min} \approx 1/T$. In the case of no gaps, this is related to the maximum frequency by $f_{\min} = 1/(N\Delta t) = (2/N) f_{\max}$.

It is worthwhile to reconsider the interpretation of the observed spectrum $\hat{X}(f)$ and the clean spectrum $S(f)$ and their association with the true spectrum $X(f)$ of the input function $x(t)$. Both $\hat{X}(f)$ and $S(f)$ result from convolutions over $X(f)$ in the frequency domain, and can be thought of as averages of $X(f)$ over their respective windows, $W(f)$ and $B(f)$. It is readily concluded that the dimensions of $\hat{X}(f)$ and $S(f)$ are the same as those of the input function, whereas the dimensions of $X(f)$ are those of the input divided by frequency. Thus, even in the limit of infinite sampling neither $\hat{X}(f)$ nor $S(f)$ approaches $X(f)$. We have chosen a normalization for $W(f)$ and $B(f)$, both of which are dimensionless, such that they are unity at $f = 0$. As a consequence, the areas of these ‘beams’ are not normalized, and in the limit of infinite sampling they do not approach Dirac delta functions. This normalization is, however, the most suitable when the main purpose is to find harmonic components.
If the signal contains an isolated harmonic constituent of amplitude $\hat{A}$, then features with modulus $\hat{A}/2$ will be found at the appropriate frequencies in both $\hat{X}(f)$ and $S(f)$. The spectrum $X(f)$ may then be reconstructed according to equation (47). The situation is more complicated if the input signal has a pseudo-periodic content which extends over a band of frequencies. To find the contribution to the power spectrum between two frequencies, the functions $|\hat{X}(f)|^{2}$ and $|S(f)|^{2}$ must be integrated over the appropriate positive and negative frequency intervals and then divided by the effective area of the beam. For the clean spectrum, $S(f)$, this area is the integral of $B(f)$, which is $\sim \Delta f$. For the dirty spectrum, $\hat{X}(f)$, however, this procedure is ambiguous, since the integral of $W(f)$ will typically be either zero or infinity. Integrating the response only over the completely positive ‘main lobe’ will be roughly correct only if, as is commonly the case, the amplitude of the main lobe of the dirty beam $W(f)$ is much larger than over the rest of the beam. This procedure would be rendered even less reliable because the window function $W(f)$ usually has relatively large secondary responses. Integration over the CLEAN function would be the preferable technique. An alternative, if the residual spectrum is negligible, is to directly sum the clean components, in which case no normalization correction will be necessary.
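The frequency limits discussed in this section are easy to evaluate for a given set of sample times. The short sketch below returns $f_{\min} \approx 1/T$ and $f_{\max} = 1/(2\Delta t_{\min})$, together with a grid step that places several frequency points within one resolution element. Taking $T$ as the difference between the last and the first sample time, and using four points per resolution element, are assumptions of the example, as are the function name and the sample times.

```python
def frequency_limits(times, points_per_resolution=4):
    """Return (f_min, f_max, grid_step) for a list of sample times.

    f_min ~ 1/T with T taken as last minus first sample time (assumption),
    f_max = 1/(2 * smallest spacing), grid_step = f_min / points_per_resolution.
    """
    ts = sorted(times)
    span = ts[-1] - ts[0]
    spacings = [b - a for a, b in zip(ts, ts[1:]) if b > a]   # ignore repeated times
    f_min = 1.0 / span
    f_max = 1.0 / (2.0 * min(spacings))
    grid_step = f_min / points_per_resolution
    return f_min, f_max, grid_step

# Hypothetical irregular sample times (arbitrary time units)
times = [0.0, 0.4, 1.5, 1.7, 3.0, 4.0]
f_min, f_max, grid_step = frequency_limits(times)
print(f_min, f_max, grid_step)
```

For evenly spaced data without gaps this choice of $T$ gives $1/((N-1)\Delta t)$ rather than $1/(N\Delta t)$; the difference is immaterial for long series.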

5. An experiment with CLEAN

The case of measurements of a daily variation with gaps is quite typical in geophysics. To illustrate the use of CLEAN, Figure 2 shows a synthetic function consisting of the sum of four cosinusoids of various amplitudes $\hat{A}_n$, periods $\hat{T}_n$, and phases $\hat{\varphi}_n$, which is sampled with a constant interval $\Delta t = 1$ hour, where $\hat{A}_1 = 24$, $\hat{T}_1 = 24$ h, $\hat{\varphi}_1 = -132^\circ$; $\hat{A}_2 = 12$, $\hat{T}_2 = 12$ h, $\hat{\varphi}_2 = 46^\circ$; $\hat{A}_3 = 8$, $\hat{T}_3 = 8$ h, $\hat{\varphi}_3 = -11^\circ$; and $\hat{A}_4 = 6$, $\hat{T}_4 = 6$ h, $\hat{\varphi}_4 = 34^\circ$. A long-term trend is simulated by adding the function $-10 + 0.07\,t - 5 \times 10^{-5}\,t^{2}$ to the data, which are also corrupted with noise generated from uniformly distributed random numbers with a variance of 2, giving a signal-to-noise ratio of 197. Starting initially with 660 regularly spaced data, large gaps are introduced to produce a set of spectral sidelobes of the main peaks for the purpose of demonstration. The ‘on’ periods of 60 time points alternate with ‘off’ periods of 40 ‘missing’ data, giving 420 data in all for a CLEAN analysis. The red curve in Figure 2 shows the artificial time series before data rejection and the dots represent the actual input ‘observations’.
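A test series of this kind can be generated as follows. The sketch is not the code used for Figure 2: the time origin at $t = 0$, the phase convention $\cos(2\pi t/\hat{T}_n + \hat{\varphi}_n)$, the zero-mean noise, the position of the first gap and the random seed are all assumptions.

```python
import math
import random

# (amplitude, period in hours, phase in degrees) of the four cosinusoids
COMPONENTS = [(24.0, 24.0, -132.0), (12.0, 12.0, 46.0), (8.0, 8.0, -11.0), (6.0, 6.0, 34.0)]

def synthetic_series(n_total=660, on=60, off=40, seed=0):
    """Hourly series with 'on' blocks of `on` points followed by `off` missing points."""
    rng = random.Random(seed)                 # hypothetical seed, for reproducibility only
    half_width = math.sqrt(6.0)               # uniform on [-h, h] has variance h^2/3 = 2
    times, values = [], []
    for t in range(n_total):
        if t % (on + off) >= on:              # inside an 'off' period: datum is missing
            continue
        signal = sum(A * math.cos(2.0 * math.pi * t / T + math.radians(ph))
                     for A, T, ph in COMPONENTS)
        trend = -10.0 + 0.07 * t - 5e-5 * t * t
        noise = rng.uniform(-half_width, half_width)
        times.append(float(t))
        values.append(signal + trend + noise)
    return times, values

times, values = synthetic_series()
print(len(times))   # number of retained 'observations'
```

Seven ‘on’ blocks of 60 points fit into the 660 hourly time points, which accounts for the 420 data quoted above.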

Figure 2: Time series (red curve) with gaps (dots) containing four harmonic components of periods 24, 12, 8, and 6 hours.

Figure 3: Modulus of the spectral window function.

Figure 4: The dirty spectrum of the time series in Figure 2.

Figure 5: The clean amplitude spectrum after 395 iterations with a gain $g = 0.5$.

The modulus of the resulting spectral window function in Figure 3 has large and periodic sidelobes due to the sharp, regular modulation of the sampling. Note that the first sidelobe of $|W(f)|$ is about 50% of the height of the main lobe. The dirty spectrum in Figure 4 contains the positive frequency peaks at 1, 2, 3, and 4 cycles/day (cpd) and false features due to the sidelobes of these peaks, which might be taken to be real, making its interpretation somewhat hazardous. The strongest false peaks (displaced from the true peaks by ±0.24 cpd and ±0.48 cpd) are of substantial amplitude and confuse the interpretation of the spectrum. Application of CLEAN to the dirty spectrum is illustrated in Figure 5, where we show the clean spectrum which results from 395 iterations at a gain of $g = 0.5$. Here the stopping condition on the residual sum of squares has ended the CLEAN iteration. As hoped, all of the false peaks visible in Figure 4 are removed by the CLEAN procedure, leaving only the true positive frequency peaks. The low-level noise in Figure 5 is numerical in origin.

Figure 6: Data reconstruction by Fourier inversion of the dirty spectrum (red curve).

Figure 7: Data reconstruction by Fourier inversion of the clean spectrum (red curve).

The difference between dirty and clean spectra is that the dirty spectrum implies that where data are missing they are zero, whereas the clean spectrum interpolates across missing data points. This is clearly illustrated by inverting the spectra, i.e., Fourier transforming the spectra back to the time domain. Inversion of the dirty spectrum is shown in Figure 6. This is accomplished with the generalization of equation (78) for unequal numbers of time and frequency points,

$$\hat{x}(t) = \frac{N}{M} \sum_{j=0}^{N-1} \hat{X}_j\, e^{2\pi i f_j t} = \frac{N}{M} \sum_{j=0}^{N-1} \hat{X}(f_j)\, e^{2\pi i f_j t}, \tag{110}$$

where $N$ and $M$ are the number of data points and frequency samples, respectively. Figure 6 shows that equation (110) reproduces the input data exactly: the inverted dirty spectrum passes through each data point. In addition, inversion of the dirty spectrum reflects the sampling and the missing data intervals, in the sense that the inverted spectrum also matches the sampling window. Note that the arithmetic mean is eliminated from the data.

The inverse Fourier transform of a clean component is a single cosinusoidal term, so inversion of a set of clean components is just summing up cosines of known amplitude, frequency, and phase. Figure 7 shows the Fourier inversion of the CLEANed spectrum. It is obvious that the clean spectrum mainly fits the data but not the sampling window. Figure 7 also illustrates the error correction built into the CLEAN algorithm. While there are only four components of the time series in the diagram, detailed inspection of Figure 7 shows that inversion of the sum of the clean components (made with a gain 0.5) does not quite reproduce the input data. This is because each of the CLEAN iterations was somewhat in error, owing to the contribution at each spectral peak of the sidelobes of the other cosinusoids not yet removed. However, successive iterations of CLEAN correct these small errors, and the inversion of the final result after iteration reconstructs the input time series almost exactly.
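The inversion of clean components described above amounts to summing cosines. A minimal sketch, assuming each positive-frequency component $C_j$ is paired with its complex conjugate at $-f_j$ so that the pair contributes $2|C_j| \cos(2\pi f_j t + \arg C_j)$, is given below; the function name and the example component are illustrative only.

```python
import cmath
import math

def invert_clean_components(components, t):
    """Sum of cosines from (frequency, complex amplitude) pairs at positive frequencies.

    Each pair (f, C) stands for C*exp(2*pi*i*f*t) plus its complex conjugate,
    i.e. 2*|C|*cos(2*pi*f*t + arg C).
    """
    return sum(2.0 * abs(c) * math.cos(2.0 * math.pi * f * t + cmath.phase(c))
               for f, c in components)

# Example: the 24-hour term of the synthetic series (amplitude 24, phase -132 degrees),
# with time in hours and frequency in cycles per hour
components = [(1.0 / 24.0, cmath.rect(12.0, math.radians(-132.0)))]
x_at_5h = invert_clean_components(components, 5.0)
print(x_at_5h)
```

Because the sum can be evaluated at any time `t`, the reconstruction fills the gaps in the sampling, which is the interpolation property illustrated in Figure 7.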

6. The CLEAN geomagnetic spectrum

The source data used in the spectral analysis consist of the hourly mean values of magnetic declination ($D$ component), measured at the station of Dourbes, Belgium (50° 06′ N, 4° 36′ E) for the 40-year interval January 1, 1960 to December 31, 1999. The values of $D$ are given in minutes of arc and the first value of each day corresponds to 00:30 UT. At the dip latitude of Dourbes (51.7° N) periodic variations in the frequency interval from 0 to 4 cycles per day are found to be most conspicuous in the declination (De Meyer and De Vuyst, 1982; De Meyer, 2001, 2003). So this magnetic element was chosen for the investigation of periodic variability in a low signal-to-noise ratio time series with the help of the CLEAN procedure.

The geomagnetic field measured at any point on Earth as a function of time shows periodic variations due to atmospheric processes caused by the Sun and the Moon. Both $S$ and $L$, respectively the solar and lunar daily geomagnetic variations, are commonly assumed to result principally from electric currents generated by dynamo action in the ionosphere, which is generally a much better conductor during the day than during the night, except in the auroral regions. This dynamo is powered by tidal movements of the ionosphere across lines of force of the Earth’s main magnetic field, the tides being of mainly thermal origin for $S$ and of purely gravitational origin for $L$. The currents flow partly in the ionosphere, partly along lines of force, and also within the Earth itself as a result of induction.

Lunar semi-diurnal variations have been found in such meteorological data as the barometric pressure, the air temperature and the wind, and also in some cosmic ray data. The geomagnetic field and the ionosphere undergo more complicated lunar daily variations. These phenomena are worthy of much effort towards the attainment of a complete understanding of them.
The meteorological lunar daily variations reveal aspects of the large-scale dynamics of the atmosphere under gravitational accelerations completely known in their space and time distribution. The reason for the irregular distribution of the meteorological lunar daily variations lies in the world-wide structure of the atmosphere, namely its changing geographical and height distribution of temperature, density and winds.

Both the solar daily variation, $S$, and the lunar daily geomagnetic variation, $L$, have been known to vary with season (Black, 1970). The seasonal changes of $S$, particularly in its quiet day form, $S_q$, have been studied in much detail, but the seasonal changes of $L$ are much less well known. An annual variation in the lunar geomagnetic tide is indeed expected because the strength of the lunar current system depends on the conductivity of the ionosphere. There is a small annual variation in the ionospheric conductivity, arising from the Earth’s elliptic orbit, since solar radiation incident on the Earth’s atmosphere is more intense when the Sun is nearer the Earth, with a resulting increase in ionization density. In addition, the lunar tidal winds depend on the gravitational driving force, which contains an annual term arising from the Earth’s elliptic orbit. The lunar variations, $L$, are given in terms of the amplitudes $\ell_n$ and phase angles $\lambda_n$ of the first four harmonics $L_n$, $n = 1, 2, 3$ and 4, and are represented by Chapman’s phase law

$$L = \sum_{n=1}^{4} \ell_n \sin(nt - 2\nu + \lambda_n) \tag{111}$$

(Malin and Chapman, 1970), where $\nu = t - \tau$ is a measure of the phase (or age) of the Moon. Here $t$ denotes mean solar time, reckoned in hours from local midnight, and $\tau$ is mean lunar time, measured in hours from local lower transit of the mean Moon. The $L$ variations are masked by the much greater solar daily variations, $S = \sum_n S_n$, which are similarly given in terms of the amplitudes $s_n$ and phases $\sigma_n$ of the main harmonics $S_n$ as follows:

$$S = \sum_{n=1}^{4} s_n \sin(nt + \sigma_n). \tag{112}$$

The frequency of Sn is n cycles per mean solar day and that of Ln is n − 2/M , where M = 29.530588 is the number of solar days in a lunar month. Thus the periods of the first four lunar component variations are as follows: L1, 25.74352 solar hours; L2, 12.42060 solar hours; L3, 8.18478 solar hours; L4, 6.10334 solar hours.
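The quoted periods follow directly from the frequency $n - 2/M$ cycles per mean solar day. A short check, with a function name chosen only for this illustration:

```python
M = 29.530588   # number of solar days in a lunar month, as given in the text

def lunar_period_hours(n, days_per_lunation=M):
    """Period in solar hours of the lunar harmonic L_n, whose frequency is n - 2/M cpd."""
    frequency_cpd = n - 2.0 / days_per_lunation
    return 24.0 / frequency_cpd

periods = [lunar_period_hours(n) for n in (1, 2, 3, 4)]
for n, period in zip((1, 2, 3, 4), periods):
    print(f"L{n}: {period:.5f} solar hours")
```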

Figure 8: Clean spectrum of the declination in the diurnal frequency interval.

Figure 9: Clean spectrum of the declination in the semi-diurnal frequency interval.

Figure 10: Clean spectrum of the declination in the ter-diurnal frequency interval.

Figure 11: Clean spectrum of the declination in the quarter-diurnal frequency interval.

The results in this section show the main features of the magnetic declination spectrum. The sampling interval in the observations, $\Delta t$, is 1 h, which gives a Nyquist frequency $f_N = 1/(2\Delta t) = 12$ cpd. Using $N = 350640$ hourly means, the resolution in frequency is $\Delta f = 1/(N\Delta t) = 6.8446 \times 10^{-5}$ cpd. In general, a pure sinusoidal signal in the data will appear in the spectrum as a band of width $\sim 2\Delta f$. Detailed views of the spectral content of the diurnal (D), semi-diurnal (SD), ter-diurnal (TD) and quarter-diurnal (QD) bands are displayed in Figures 8, 9, 10, and 11. The spectral peaks are conspicuously resolved because of the very extended horizontal scale. The $S_1$, $S_2$, $S_3$ and $S_4$ peaks, respectively at 1, 2, 3 and 4 cpd, stand out sharply above the background continuum and they are obviously associated with the daily solar variation and its main harmonics.

Analysis at this resolution splits the solar peaks into lines $S_n$ and sidebands at the frequencies $(n + k/365.25)$ cpd, $n = 1, 2, \ldots$; $k = \pm 1, \pm 2, \ldots$, originating from the modulation of the solar daily variation by the annual fluctuation and its harmonics (De Meyer, 1980, 2001, 2003). These sidebands, denoted by ${}_{k}S_{n}$, $n = 1, 2, \ldots$; $k = \pm 1, \pm 2, \ldots$, have a real physical origin and are not to be confused with the sidelobes of the spectral window used in the spectrum computation. Owing to the changing declination of the Sun throughout the year, the thermal excitation harmonics include, in addition to the pure diurnal and sub-diurnal terms with time arguments $t$, $2t$, $3t$, ..., sideband harmonics with time arguments $nt + kh$, where $n = 1, 2, \ldots$, $k = \pm 1, \pm 2, \ldots$, and $h$ is the longitude of the mean Sun. Thus, there is a large annual variation in the solar radiation incident on the atmosphere at the station, and since the amplitudes of the solar lines are affected by solar radiation, both through the solar atmospheric tide and the changing conductivity of the ionosphere, the annual radiation change produces these sidebands of $S_n$. This implies that there is a large alteration in the solar daily variation with season.
Sidebands of the solar diurnal and semi-diurnal lines, created by the 11-yr solar cycle modulation, are hardly resolved in the high-resolution spectra of Figures 8, 9, 10 and 11 because data ranging over only four solar cycles are used. The investigation of small periodic variations with frequencies in the neighbourhood of the lunar lines is particularly difficult as it involves the identification of weak lines, clustered in the vicinity of $L_n$ and yet in the presence of relatively strong noise. The associated spectral peaks in Figures 8, 9, 10 and 11 are distinctly above the noise level, although the amplitudes of these constituents in the geomagnetic spectrum are small. Since samples from all parts of the solar cycle are present in the 40-yr observing period, it is evidently incorrect to hypothesize that the signal is stationary from year to year. The spectral curves must therefore be interpreted as average spectra over almost four sunspot cycles.

The two dominating lines in Figure 9 for the SD-band are at 2 cpd and 1.93227 cpd and are the solar and lunar semi-diurnal waves $S_2$ and $L_2$, respectively. The dynamo action of the $L_2$ component in the magnetic $D$-record is consequently indubitable; its amplitude at Dourbes is about 0.15′. Since there is an annual change in the mean ionospheric conductivity, splitting of the lunar semi-diurnal line is expected. The triplet of peaks at the frequencies (1.92954, 1.93227, 1.93501) cpd in Figure 9 is exactly at the frequency of $L_2$ and its annual sidebands and they are accordingly denoted by $({}_{-1}L_{2}, L_{2}, {}_{1}L_{2})$. This is clear evidence for an annual variation in the $L_2$ lunar tide. Apparently the splitting of $L_2$ by the semi-annual variation is too weak to come clearly out of the background noise. A minor term is also expected in the diurnal band at the frequency of the lunar wave $L_1$. There is an indication in Figure 8 of a triplet of lines at the frequencies (0.92954, 0.93227, 0.93501) cpd in the vicinity of the wave $L_1$.
These lines can be due merely to noise fluctuations in the spectrum, but the peaks are located exactly where the dynamo term $L_1$ and its annual sidebands would be expected. For this reason the lines are denoted by $({}_{-1}L_{1}, L_{1}, {}_{1}L_{1})$. Together these spectral peaks in Figures 8 and 9 almost certainly represent splitting of the lunar diurnal and semi-diurnal lines, $L_1$ and $L_2$, by an annual modulation mechanism, since they are not removed by the CLEANing of the spectrum. In general, the sidebands of $L_n$ are denoted by ${}_{k}L_{n}$, $n = 1, 2, \ldots$; $k = \pm 1, \pm 2, \ldots$.
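The Nyquist frequency, the frequency resolution and the positions of the lunar lines with their annual sidebands quoted in this section follow from simple arithmetic, which the sketch below reproduces. The variable and function names are illustrative only.

```python
N = 350640            # number of hourly means (40 years)
DT_HOURS = 1.0        # sampling interval in hours
M = 29.530588         # solar days in a lunar month
YEAR_DAYS = 365.25    # days per year used for the annual sidebands

nyquist_cpd = 24.0 / (2.0 * DT_HOURS)        # f_N = 1/(2*dt), in cycles per day
resolution_cpd = 24.0 / (N * DT_HOURS)       # df = 1/(N*dt), in cycles per day

def annual_triplet(n):
    """Frequencies (cpd) of the lunar line L_n and its first annual sidebands."""
    centre = n - 2.0 / M
    return (centre - 1.0 / YEAR_DAYS, centre, centre + 1.0 / YEAR_DAYS)

print(nyquist_cpd, resolution_cpd)
print(annual_triplet(1))
print(annual_triplet(2))
print((1.0 / YEAR_DAYS) / resolution_cpd)    # sideband spacing in units of df
```

The annual sidebands lie 40 resolution elements away from the central line, which is why they appear clearly separated in the spectra.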

The solar and lunar daily variation models satisfactorily explain all lines in the Dourbes declination spectrum. The frequency 1/27 = 0.037 cpd associated with the sidebands of the 1 cpd peak in Figure 8 at the frequencies $(1 + k/27)$ cpd, $k = -1, 1, 2, 3$, suggests a mechanism related to solar rotation (Black, 1970; De Meyer, 1980). Owing to the Sun’s rotation period of approximately 27 days there is a tendency for magnetic storms and disturbances to recur after a solar rotation, and the magnetic variations with a period near 27 days are caused by the fluctuations of the ring current situated in the Earth’s proton belt. A strong ∼27-day periodicity in magnetic activity at the station is well established (De Meyer and De Vuyst, 1982). Since there is an increase in the solar diurnal magnetic variation with increasing magnetic activity, there will be a 27-day period amplitude modulation of the solar diurnal magnetic variation.

The sidebands of $S_n$ at the frequencies $(n + k/27)$ cpd, $n = 1, 2, \ldots$; $k = \pm 1, \pm 2, \ldots$, due to the modulation by the 27-day recurrences and their harmonics, are denoted by ${}_{k}R_{n}$. Harmonics of the 27-day solar rotation effect would produce modulation lines of $S_1$ at the frequencies $(1 \pm k/27)$ cpd, $k = 1, 2, \ldots$. Such lines are seen in Figure 8 at frequencies above 1 cpd, but the asymmetry with respect to the frequency of $S_1$ is not predicted by a model requiring low-frequency modulation. The solar rotation presumably influences the solar diurnal magnetic variation by the same mechanism as the sunspot cycle, i.e., through a change in ionospheric conductivity. The 1 cpd line in the spectrum is mainly generated by the interaction of the solar diurnal tide with the mean ionospheric conductivity; the $(1 \pm k/27)$ cpd sidebands therefore imply a 27-day period modulation of the mean ionospheric conductivity which is attributed to the recurrence tendency of magnetic storms.
The width of the 27-day modulation bands of $S_1$, approximately 0.007 cpd, is due to the finite number (up to about six) of recurrences of a given magnetic storm before it disappears, and the irregular occurrence of solar active regions that are the cause of these storms. This large bandwidth contrasts with the width of the solar diurnal peak of $2\Delta f = 1.37 \times 10^{-4}$ cpd, which is the natural width of a sinusoidal signal in the spectrum for the present length of data, and substantiates the view that these sidebands arise from the broad-band solar rotation period. This broad-band feature of the spectrum is linked by Currie (1974) and Delouis and Mayaud (1975) to both latitude variation and proper motion of sunspots over the Sun’s disk. One implication is that the 27-day amplitude modulation contributes to the non-stationary time variation in amplitude of the major luni-solar terms.

If this is the true mechanism, then similar sidebands of the 2 cpd line $S_2$ are also expected. However, no prediction of the relative magnitudes of the two sets of sidebands can be made, so that the apparent absence of the sidebands at $(2 \pm k/27)$ cpd in Figure 9 does not disprove the solar rotation mechanism. The lunar semi-diurnal magnetic variation $L_2$ arises from the interaction of the lunar semi-diurnal tide and the mean ionospheric conductivity, and so this lunar line would be expected to show sidebands at $(1.93227 \pm 1/27)$ cpd. However, the 27-day sidebands of the 1 cpd line $S_1$ have an amplitude less than 1/10 that of the 1 cpd line itself, and lines 1/10 of the amplitude of the lunar peak $L_2$ of circa 0.15′ would be completely hidden by the background noise. It is also evident that the modulation of the solar lines $S_n$ by the annual variation is far more important than their modulation by the 27-day period signal.
The resolving power of a 40-year observing interval is so great that a stable line can stand out in the spectrum only if the underlying phenomenon is tuned to a very pure frequency. No such phenomenon exists on the Sun, and the emitting sources can only produce an increase of the noise in a wide band of frequencies because of the large range of latitudes where they are located. As a consequence we adopt an equivocal view as to the geophysical meaning of the excitation lines in the geomagnetic storm recurrence period band.

Figure 12: Clean spectrum of the declination in the frequency interval (0,0.006) cpd.

Figure 13: Clean spectrum of the declination in the frequency interval (0.006,0.02) cpd.

Figure 12 shows the clean spectrum of the declination in the low-frequency range (0,0.006) cpd. The periods identified by the spectral peaks are indicated in days. A number of small peaks are present in Figure 12. The peak at 9.9 years is evidently associated with the ∼11-year period of solar activity and the lines at 5.0 and 3.6 years are presumably sub-harmonics of the solar cycle variation. Currie (1973) reports a line at 6.07 years and suggests that it may represent an internal signal from the Earth’s core. It is possible to link the spectral peak at 7.3 years with the oscillation period of 6.7 years of one of the free modes of the electromagnetically coupled core-mantle Earth system (Yukutake, 1972). The annual and semi-annual peaks are characteristic features of the spectrum in Figure 12. Harmonics of the annual variation appear in the clean spectrum in Figure 13, but the spectrum has little structure in the range from 50 to 125 days. It is tempting to relate the quasi-biennial line near 2.6 years in Figure 12 to the roughly 26-month zonal wind oscillation in the equatorial stratosphere (Reed et al., 1961). A significant line at 1.9 years in the aa index has been identified by Delouis and Mayaud (1975). There is also a distinct 1.3-yr line in the spectrum of Figure 12 that may not be related to a source on the Sun. An indication of a similar line in the Ap spectrum was reported by Fraser-Smith (1972) and in the aa index by Delouis and Mayaud (1975). An unexpected line occurs at a period of about 426 days, which may be associated with the Chandler wobble period of about 430 days (Melchior, 1982). The pole tide signal has been identified by Rao and Rangarajan (1978) at a low-latitude station (Alibag, geographic latitude 18° 38′ N).

It is interesting to examine whether the pairs (426,328) days, (479,296) days and (631,263) days could be produced by splitting from a single fundamental line. If amplitude modulation is involved in the form A cos(2π f_m t + φ_m) cos(2π f_c t + φ_c), where f_c = 1/T_c is the carrier frequency and f_m = 1/T_m is the modulation frequency, the original line at the centre frequency f_c would be accompanied by a doublet consisting of the side frequencies (f_c − f_m, f_c + f_m), each with amplitude A/2. Using the pair (426,328) days and solving for f_c and f_m, we find that the centre frequency f_c corresponds to a period of T_c = 371 days and the modulation frequency to a period of T_m = 7.8 years. The other two pairs give T_c = 366 days and T_m = 4.2 years, and T_c = 371 days and T_m = 2.5 years, respectively. The amplitude of these modulation lines is about 0.05′, which is about half of the amplitude 0.12′ of the annual wave. An amplitude modulation of the annual component by a cyclic variation with a period of about 8 years and its harmonics could be inferred from this result.

The two major theories that have been advanced to explain the semi-annual variation are commonly known as the ‘axial’ and ‘equinoctial’ (or geocentric) hypotheses (Currie, 1966). The axial theory attributes its source to the variation of the heliographic latitude of the Earth, that is, to changes in the Earth’s heliocentric position, with main activity occurring just after the times of maximum (+7.2°) and minimum (–7.2°) latitude, i.e., September 7 and March 6, respectively. The equinoctial hypothesis relates the 6-month variation to changes of the angle between the dipole axis and the Earth-Sun line, with maximum activity occurring at the equinoxes (passages of the Sun across the Earth’s equator occurring on March 21 and September 23).
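As a check on the doublet arithmetic quoted earlier for the pairs (426,328), (479,296) and (631,263) days, the following minimal Python sketch recovers the carrier and modulation periods from two sideband periods, using f_c = (f_1 + f_2)/2 and f_m = |f_1 − f_2|/2. The function name and the conversion factor of 365.25 days per year are assumptions made for illustration only.

```python
DAYS_PER_YEAR = 365.25  # assumed conversion factor, not stated in the text


def carrier_and_modulation(period_a, period_b):
    """Return (T_c in days, T_m in years) for two sideband periods given in days."""
    f_a, f_b = 1.0 / period_a, 1.0 / period_b
    f_c = 0.5 * (f_a + f_b)        # centre (carrier) frequency in cpd
    f_m = 0.5 * abs(f_a - f_b)     # modulation frequency in cpd
    return 1.0 / f_c, 1.0 / f_m / DAYS_PER_YEAR


for pair in [(426, 328), (479, 296), (631, 263)]:
    t_c, t_m = carrier_and_modulation(*pair)
    print(f"{pair}: T_c = {t_c:.0f} days, T_m = {t_m:.1f} years")
```

With these inputs the sketch reproduces the values given in the text (371 days and 7.8 years, 366 days and 4.2 years, 371 days and 2.5 years).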
Currie (1966) argues that it is very unlikely for both the annual and semi-annual lines to be entirely generated by the same mechanism. Currie supports the theory that meridional ionospheric winds blowing continuously from the summer to the winter hemisphere are the origin of the annual line. From this point of view the annual variation is attributed to an ionospheric dynamo mechanism in the sense that it is a seasonal effect caused by amplitude modulation in solar radiation received by the summer and the winter hemisphere. On the other hand, Currie proposes a fundamentally different mechanism for the semi-annual line and suggests that it is due to fluctuations in the intensity of the ring current having its seat in the quiet time proton belt at approximately 3.5 Earth radii.

Alternatively, the annual variation (both the 12-month wave and its second harmonic) may be primarily due to the mechanism identified by Malin and Isikara (1976): an annual variation of the mean latitude of the ring current situated in the proton belt of the Earth, while the intensity of the ring current itself remains constant in the course of a year. The ring current would move north at the December solstice (December 22), when the solar wind comes from the south, and would move south at the June solstice (June 21). In the northern hemisphere the ring current effect would be greater at the December solstice than at the June solstice.

The fine structure of the 27-day solar rotation band in the spectrum of D is shown in Figure 14. The curve reveals a multiplet of small peaks in the vicinity of the average solar rotation period of 27.27 days, but no prominent line with a high signal-to-noise ratio can be indicated. Peaks near 27, 13.5 and 9 days are clearly visible in Figures 14 and 15. The high-frequency part of the spectrum in Figure 16 relates to the noise contributions in the declination data.

Figure 14: Clean spectrum of the declination in the frequency interval (0.02,0.05) cpd.

Figure 15: Clean spectrum of the declination in the frequency interval (0.05,0.15) cpd.

Figure 16: Clean spectrum of the declination in the frequency interval (0.15,0.5) cpd.

The magnetic variations with periods near 27 days are mainly ascribed to the tendency for magnetic storms and disturbances to repeat after a solar rotation, and they are caused by fluctuations in the intensity of the ring current situated in the proton belt. The presence of the harmonics of the ∼27-day variation results from the fact that the repeating unit cannot be described by a pure sine wave. Instead of a sharp line near the average solar rotation period of 27.27 days, a broad signal with periods ranging from 23 to 42 days is observed in Figure 14, which points to the pseudo-periodic nature of the phenomenon. This feature of the spectrum may be linked with the latitude variation and the proper motion of sunspots over the Sun’s disk (Delouis and Mayaud, 1975). Since the signal-to-noise ratio in the spectral peaks of Figure 14 is typically of the order of 2, there is no compelling reason to conclude that they have a clear physical meaning. These lines may be thought to be created by noise fluctuations. One possible cause is the variation of the ∼27-day geomagnetic storm recurrence period over a solar cycle. Therefore, searches for geophysically significant peaks in the broad-band signal near the mean solar rotation period produce controversial results. For instance, there is no trace of a distinct line at the average rotation period. There is a very weak line near 29.5 days, which may correspond to the 29.53-day lunar synodic period.
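The statement that a non-sinusoidal repeating unit produces harmonics can be illustrated with a small Python sketch. It assumes a purely hypothetical disturbance, a 3-day rectangular pulse repeating exactly every 27 days and sampled daily; this shape is chosen for illustration and is not a model of the observed storm recurrence.

```python
import numpy as np

period = 27                          # days; nominal recurrence period
n_cycles = 40
t = np.arange(period * n_cycles)     # daily samples over 40 cycles

# Hypothetical repeating unit: a 3-day rectangular pulse every 27 days
signal = ((t % period) < 3).astype(float)
signal -= signal.mean()              # remove the mean before transforming

amp = 2.0 * np.abs(np.fft.rfft(signal)) / t.size   # amplitude spectrum
freq = np.fft.rfftfreq(t.size, d=1.0)              # frequencies in cpd

# Periods of the three largest spectral lines
top = np.argsort(amp)[-3:]
periods = np.sort(1.0 / freq[top])[::-1]
print(np.round(periods, 2))
```

Because the pulse train is strictly periodic, the only non-zero lines fall at multiples of 1/27 cpd, and the three strongest correspond to periods of 27, 13.5 and 9 days. A recurrence period that drifts in time would instead smear these lines into broad bands.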

Figure 17: Annual means of the declination (Dourbes).

Figure 18: Clean spectrum of annual means of the declination.

Although earlier data must be considered to be somewhat unreliable (De Vuyst, 1976), the annual means from 1828 to 1999 in Figure 17 for the station of Dourbes are analyzed as a whole in an attempt to find significant long-period behaviour. The majority of the long-term contributions is reduced by removing a least squares line (with a slope of 7.44 minutes of arc per year) from the annual means. This procedure acts as a high-pass filter and it is implicitly assumed that this linear trend has a physical meaning, being associated with the secular variation of internal origin. Nevertheless, it is emphasized that removal of any trend from the observations can have a drastic effect on the very low frequencies. Inspection of the clean spectrum of the annual means in Figure 18 demonstrates a dominant variation at ∼98 years. The estimated amplitude of about 30 minutes of arc of this broad line is too large to associate this variation with a source on the Sun. So it is concluded that it may be due to an internal signal generated in the Earth’s core. Currie (1973) shows the tendency for spectral lines in the geomagnetic records of 49 observatories to cluster near 60 years and suggests a causal relationship between the ∼60-yr magnetic signal and variations of the length of the day. Furthermore, Braginsky (1970) has shown theoretically that torsional oscillations of the hydromagnetic dynamo may be responsible for fluctuations in the length of the day with a nearly 60-year periodicity. However, all identifications of a ∼60-yr period in geomagnetic observations are based on a data span of about 100 years or less, which makes its identification as a periodic phenomenon tentative at best.
For a data interval as short as 172 years it is difficult to say whether the ∼98-yr spectral peak in Figure 18 represents a truly periodic or cyclic fluctuation, rather than a transient pulse, although it is highly improbable that a solar excitation mechanism could generate this relatively strong magnetic variation. The peak at about 34 years could be a harmonic of the fundamental 98-yr variation. Currie (1973) also finds a cluster of spectral peaks around a period of 21.4 years and calls it the double solar cycle (DSC) since it corresponds to the nominal 22-yr basic polarity cycle (Hale cycle) of the Sun. However, only 17 out of 49 observatories examined by Currie showed this DSC peak. Alldredge (1977) presents evidence for the existence of variations with periods in the 22-yr to 28-yr range and he associates them with a hydromagnetic source in the Earth’s core. Figure 18 shows a small variation with a ∼22-yr period, but no clear solar cycle contribution of ∼11 years (frequency ∼0.1 cycles/year) is apparent.
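The detrending step described above, removing a least squares line from the annual means before the spectral analysis, can be sketched in Python as follows. The Dourbes annual means are not reproduced here, so the series below is a synthetic stand-in built from the slope, period and amplitude quoted in the text; the function name and the zero starting phase are assumptions.

```python
import numpy as np


def remove_linear_trend(years, values):
    """Subtract the least squares line; return (residual, slope per year)."""
    x = years - years.mean()                  # centre the abscissa for conditioning
    slope, intercept = np.polyfit(x, values, deg=1)
    return values - (slope * x + intercept), slope


years = np.arange(1828, 2000, dtype=float)    # 172 annual means, 1828 to 1999
elapsed = years - years[0]
# Synthetic stand-in (assumption): 7.44'/yr trend plus a 30' oscillation of 98 yr
values = 7.44 * elapsed + 30.0 * np.sin(2.0 * np.pi * elapsed / 98.0)

residual, slope = remove_linear_trend(years, values)
print(f"fitted slope: {slope:.2f} minutes of arc per year")
```

The fitted slope generally differs from the 7.44 used to build the synthetic series, because a sine with a period comparable to the record length is not orthogonal to a straight line over the 172-year span. This is one concrete way in which trend removal can alter the very low frequencies.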

7. Conclusions

Many spectral estimation techniques are based upon modeling of the data by a small set of parameters (Childers, 1978; Kay and Marple, 1981). When the model is an accurate representation of the data, spectral estimates can be obtained whose performance exceeds that of the classical periodogram or autocovariance spectral estimator. The improvement in performance is manifested by higher resolution and a lack of sidelobes of the spectral window resulting from the finite data span. It is emphasized that in addition to an accurate model of the observations, one must base the spectral procedure on a good estimator of the model parameters. Usually, this entails a maximum likelihood parameter estimation. However, if the model is inappropriate, biased spectral estimates may result. If the model is accurate but a poor statistical estimator of the parameters is employed, spectral estimates with inflated variance will also result. Computationally efficient procedures for maximum likelihood spectral estimation generally do not require substantially more calculation than conventional Fourier spectral estimators. Nevertheless, these techniques involve a selection of the number of model parameters and the solution of non-linear equations.

The most obvious, and most frequently applied, technique for finding periodicities in the data is simply to use the frequency peaks in the periodogram. The ability to discriminate neighbouring peaks in a spectrum depends on the intrinsic resolution of the spectrum obtained from a discrete data set, that is, the reciprocal of the time interval of the available observations, and on the signal-to-noise ratio in the spectrum. A second consequence of a finite data set arises because the data are not continuous over the observing interval, but are taken at discrete time points which may be non-uniformly spaced and which usually contain missing observations.

A logical scheme for the deconvolution of a frequency spectrum would consist of the following steps (De Meyer, 1981): (1) estimate the properties of a harmonic component from the highest peak in the dirty spectrum, (2) determine the frequency, amplitude, and phase from a least squares fit to the data for frequencies near the estimate, (3) subtract the contribution of that harmonic from the data, (4) transform the resulting time series into a new spectrum, and (5) iterate to convergence. Although, in principle, equivalent to the CLEAN methodology, this technique is apparently more time consuming since it requires the computation of an FFT spectrum at each iteration. Moreover, there is a danger of identifying spurious spectral peaks.

Suited to our purposes, the CLEAN algorithm consists of identifying and removing the contributions of each periodic component from the peaks of the dirty Fourier spectrum. This is accomplished by successively removing the effect of the sidelobes of the periodic constituents from the dirty spectrum at all frequencies until nothing is left but noise. Although CLEAN does not actually enhance the resolution of a spectrum, it can make it a great deal simpler to separate close components since it removes the confusion of sidelobes created by the intrinsic window function.

We have shown that the CLEAN procedure can be successfully applied to the problem of deconvolving the Fourier spectrum obtained from sampled data, within the resolution limits resulting from the finite time span of the observations. The method should be reliable for spectra consisting of a number of frequency components that is much less than the number of data samples at hand. CLEAN appears to be an adequate technique for the spectral analysis of irregularly sampled, deterministic time series for which the true spectrum may be multiply periodic or quasi-periodic.
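The iterative removal of a peak together with its sidelobes can be illustrated by the following simplified Python sketch, written in the spirit of the complex one-dimensional CLEAN of Roberts et al. (1987). It is not the FORTRAN subroutine used for this work. The function names, the gain of 0.5, the fixed iteration count, the frequency grid and the synthetic two-component test signal are all assumptions, and the final convolution of the clean components with a clean beam is omitted. The test frequencies are placed on the grid for simplicity; real data would call for a finer grid.

```python
import numpy as np


def fourier_sum(t, x, freqs):
    """Direct Fourier sum (1/N) * sum_r x_r * exp(-2*pi*i*f*t_r) at each f."""
    return np.exp(-2j * np.pi * np.outer(freqs, t)) @ x / t.size


def clean_1d(t, x, df, m, gain=0.5, n_iter=200):
    """Return (freqs, clean components) on the grid f_j = j*df, j = 0..m.

    The mean of x is assumed to have been removed; zero frequency is skipped.
    """
    j = np.arange(m + 1)
    resid = fourier_sum(t, x, df * j)                                # dirty spectrum
    w = fourier_sum(t, np.ones_like(x), df * np.arange(2 * m + 1))   # spectral window

    def window(k):
        """Spectral window at signed grid index k, using W(-f) = conj(W(f))."""
        wk = w[np.abs(k)]
        return np.where(k >= 0, wk, np.conj(wk))

    comps = np.zeros(m + 1, dtype=complex)
    for _ in range(n_iter):
        p = 1 + int(np.argmax(np.abs(resid[1:])))                    # highest peak
        # Complex amplitude at f_p, corrected for the image line at -f_p
        alpha = (resid[p] - np.conj(resid[p]) * w[2 * p]) / (1.0 - abs(w[2 * p]) ** 2)
        # Remove a fraction of the component together with its sidelobes
        resid -= gain * (alpha * window(j - p) + np.conj(alpha) * w[j + p])
        comps[p] += gain * alpha
    return df * j, comps


# Synthetic test: two harmonics sampled at 300 random times in (0, 1000)
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 1000.0, 300))
x = 1.0 * np.cos(2 * np.pi * 0.10 * t + 0.3) + 0.5 * np.cos(2 * np.pi * 0.27 * t)

freqs, comps = clean_1d(t, x, df=0.001, m=400)
amp = 2.0 * np.abs(comps)                 # amplitude of each real harmonic
for k in np.argsort(amp)[-2:][::-1]:
    print(f"f = {freqs[k]:.3f}, amplitude = {amp[k]:.3f}")
```

Unlike the five-step scheme, no new transform of the data is needed at each iteration: the dirty spectrum and the spectral window are computed once, and all subsequent subtractions take place in the frequency domain.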
A subroutine was written in reasonably efficient (although not ‘optimal’) FORTRAN, and the FFT algorithm was implemented using a library subroutine (Briggs and Henson, 1995).

Acknowledgments

Thanks are due to Dr. P. Termonia and Dr. H. De Backer of the Royal Meteorological Institute for valuable comments.

References

Alldredge, L. R., 1977. Geomagnetic variations with periods from 13 to 30 years, J. Geomag. Geoelectr., 29, 123–135.
Black, D. I., 1970. Lunar and solar magnetic observations at Abinger: Their detection and estimation by spectral analysis via Fourier transforms, Phil. Trans. R. Soc., 268 A, 223–263.
Blackman, R. B. and Tukey, J. W., 1959. The Measurement of Power Spectra, Dover, New York.
Bloomfield, P., 1976. Fourier Analysis of Time Series: An Introduction, John Wiley, New York.
Bracewell, R. N., 1986. The Fourier Transform and its Applications, 2nd ed., McGraw-Hill, New York.
Braginsky, S. I., 1970. Oscillation spectrum of the hydromagnetic dynamo of the Earth, Geomagnetism i Aeronomy, 10, 172–181.
Briggs, W. L. and Henson, V. E., 1995. The DFT: An Owner’s Manual for the Discrete Fourier Transform, Society for Industrial and Applied Mathematics.
Brigham, E. O., 1974. The Fast Fourier Transform, Prentice-Hall, Englewood Cliffs, New Jersey.
Childers, D. G., 1978. Modern Spectrum Analysis, IEEE Press Selected Reprint Series, New York.
Cooley, J. W. and Tukey, J. W., 1965. An algorithm for machine calculation of a complex Fourier series, Math. Comput., 19, 297–301.
Cordell, L. and Grauch, V. J. S., 1982. Reconciliation of the discrete and integral Fourier transforms, Geophysics, 47, 237–243.
Currie, R. G., 1966. The geomagnetic line spectrum – 40 days to 5.5 years, J. Geophys. Res., 71, 4579–4598.
Currie, R. G., 1973. Geomagnetic line spectra – 2 to 70 years, Astrophys. Space Sci., 21, 425–438.
Currie, R. G., 1974. Harmonics of the geomagnetic annual variation, J. Geomag. Geoelectr., 26, 319–328.
Delouis, H. and Mayaud, P. N., 1975. Spectral analysis of the geomagnetic index aa over a 103-year interval, J. Geophys. Res., 80, 4681–4688.
De Meyer, F., 1980. Solar and lunar daily geomagnetic variations at Dourbes, J. Atm. Terr. Phys., 42, 753–763.
De Meyer, F., 1981. Mathematical modelling of the sunspot cycle, Solar Phys., 70, 259–272.
De Meyer, F., 2001. Modulation of the solar and lunar daily geomagnetic variations, Koninklijk Meteorologisch Instituut van België, Wetenschappelijke en technische publicatie, 15, 38p.
De Meyer, F., 2003. A modulation model for the solar and lunar geomagnetic variations, Earth Planets Space, 55, 407–418.
De Meyer, F. and De Vuyst, A., 1982. The geomagnetic line spectrum at one station (Dourbes), Ann. Géophys., 38, 61–73.
De Vuyst, A., 1976. Valeurs et distributions des éléments géomagnétiques en Belgique (Cartes pour l’époque 1975,0), Institut Royal Météorologique de Belgique, A98, 34p.
Dutt, A. and Rokhlin, V., 1995. Fast Fourier Transforms for non-equispaced data, II, App. Comp. Harm. Anal., 2, 85–100.
Duyndam, A. J. W. and Schonewille, M. A., 1999. Nonuniform fast Fourier transform, Geophysics, 64, 539–551.
Fraser-Smith, A. C., 1972. Spectrum of the geomagnetic activity index Ap, J. Geophys. Res., 77, 4209–4220.
Harris, F. J., 1978. On the use of windows for harmonic analysis with the discrete Fourier transform, Proc. IEEE, 66, 51–83.
Jenkins, G. M. and Watts, D. G., 1968. Spectral Analysis and its Applications, Holden-Day, San Francisco.
Kay, S. M. and Marple, S. L., 1981. Spectrum analysis – A modern perspective, Proc. IEEE, 69, 1380–1419.
Malin, S. R. C. and Chapman, S., 1970. The determination of lunar daily geophysical variations by the Chapman–Miller method, Geophys. J. R. astr. Soc., 19, 15–35.
Malin, S. R. C. and Isikara, A. M., 1976. Annual variation of the geomagnetic field, Geophys. J. R. astr. Soc., 47, 445–457.
Markel, J. D., 1971. FFT pruning, IEEE Trans. Audio Electroacoust., AU–19, 305–311.
Melchior, P., 1982. The Tides of the Planet Earth, Pergamon Press, Oxford.
Papoulis, A., 1965. Probability, Random Variables and Stochastic Processes, McGraw-Hill, New York.
Parker, R. L., 1994. Geophysical Inverse Theory, Princeton University Press.
Rao, D. R. K. and Rangarajan, G. K., 1978. The pole-tide signal in the geomagnetic field at a low-latitude station, Geophys. J. R. astr. Soc., 53, 617–621.
Reed, R. J., Campbell, W. J., Rasmussen, L. A. and Rogers, D. G., 1961. Evidence of the downward propagating annual wind reversal in the equatorial stratosphere, J. Geophys. Res., 66, 813–818.
Roberts, D. H., Lehar, J. and Dreher, J. W., 1987. Time series analysis with CLEAN. I. Derivation of a spectrum, Astron. Journ., 93, 968–989.
Scargle, J. D., 1982. Studies in astronomical time series analysis. II. Statistical aspects of spectral analysis of unevenly spaced data, Astrophys. Journ., 263, 835–853.
Schwarz, U. J., 1978. Mathematical-statistical description of the iterative beam removing technique (method CLEAN), Astron. Astrophys., 65, 345–356.
Swan, P. R., 1982. Discrete Fourier transforms of nonuniformly spaced data, Astron. Journ., 87, 1608–1615.
Thompson, A. R., Moran, J. M. and Swenson, G. W., 1986. Interferometry and Synthesis in Radio Astronomy, John Wiley, New York.
Ware, A. F., 1998. Fast approximate Fourier transforms for irregularly spaced data, SIAM Review, 40, 838–856.
Yukutake, T., 1972. The effect of change in the geomagnetic dipole moment on the rate of change of the Earth’s rotation, J. Geomag. Geoelectr., 24, 19–47.

E-mail: [email protected]

ISSN nr. /D2003/0224/33