Koninklijk Meteorologisch Instituut van België / Institut Royal Météorologique de Belgique

Deconvolution of the Fourier spectrum

F. De Meyer

2003

Scientific and technical publication No 33

Published by the Koninklijk Meteorologisch Instituut van België, Ringlaan 3, B-1180 Brussel / Institut Royal Météorologique de Belgique, Avenue Circulaire 3, B-1180 Bruxelles
Responsible editor: Dr. H. Malcorps
Abstract
The problem of estimating the frequency spectrum of a real continuous function, which is measured only at a finite number of discrete times, is discussed in this tutorial review. Based on the deconvolution method, the complex version of the one-dimensional CLEAN algorithm provides a simple way to remove the artefacts introduced by the sampling and the effects of missing data from the computed Fourier spectrum. The technique is appropriate for equally spaced data as well as for data samples randomly distributed in time. An example of the application of CLEAN to a synthetic spectrum of a small number of harmonic components at discrete frequencies is shown. The case of a low signal-to-noise time sequence is also presented by illustrating how CLEAN recovers the spectrum of the declination component of the geomagnetic field, measured at the magnetic observatory of Dourbes.
1. Introduction
Synthesis maps in radio astronomy are generated by Fourier transformation of interferometer visibility data to produce a map of the sky (Thompson et al., 1986). Because only a finite number of visibility points are gathered, the problems of sampling apply to interferometer maps. The CLEAN algorithm is widely used in two-dimensional image reconstruction and performs an approximate deconvolution of the 'true map' of the sky from the 'dirty map' produced from the data, in essence removing the false features inherently introduced by the finite sampling. By adapting the two-dimensional CLEAN procedure developed for use in aperture synthesis, Roberts et al. (1987) discussed a rather intuitive way of handling the difficulties associated with the spurious apparent responses that arise from the incompleteness of the discrete sampling of a continuous time function.

A multitude of spectral estimation algorithms for discrete time series have been proposed (Kay and Marple, 1981). Comparisons among the various competing techniques have been based on limited computer simulations, which can be misleading. Here we select the conventional Fourier spectral estimator because the effects of finite, discrete sampling on the performance of the classical periodogram are precisely known and can be almost completely rectified. Estimation of the power spectral density, or simply the spectrum, of discretely sampled deterministic and stochastic processes is usually based on procedures employing the Fast Fourier Transform or FFT (Brigham, 1974), which implicitly assumes that the data are periodic outside the observing interval. Unfortunately, the FFT technique is directly applicable only when the input data sequence is evenly spaced in time. Often, however, observations cannot be fully controlled, and observability constraints may result in an unequally spaced sample domain. One is then forced to extract spectral information from these irregularly sampled data.
The FFT approach to spectrum analysis is computationally efficient and produces reasonable outcomes for a large class of signal processes (Briggs and Henson, 1995). In spite of these advantages, there are several inherent performance limitations of the FFT procedure. The most prominent shortcoming is that of frequency resolution, i.e., the ability to distinguish the spectral responses of distinct signals. The frequency resolution is roughly the reciprocal of the time interval over which sampled data are available. A second imperfection is due to the implicit windowing of the data that occurs when processing with the FFT. Windowing manifests itself as 'leakage' in the spectral domain, that is, energy in the main lobe of a spectral response 'leaks' into the sidelobes, obscuring and distorting other spectral responses that are present. In fact, weak signal spectral responses can be masked by the higher sidelobes of stronger spectral components. These two performance limitations of the FFT approach are particularly troublesome when analyzing short data records.
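The leakage effect just described is easy to reproduce numerically. The sketch below (assuming NumPy; the record length and test frequency are arbitrary choices, with the frequency placed half-way between two DFT bins, the worst case for the implicit rectangular window) measures how much energy escapes the main lobe:

```python
import numpy as np

# Sketch of spectral leakage: a sine whose frequency falls half-way
# between two DFT bins spreads energy into the sidelobes of the
# implicit rectangular window.
N = 128
dt = 1.0
t = np.arange(N) * dt
f0 = 10.5 / (N * dt)               # mid-way between bins 10 and 11
x = np.sin(2 * np.pi * f0 * t)

power = np.abs(np.fft.rfft(x)) ** 2
peak = int(np.argmax(power))

# Fraction of the total energy that has 'leaked' outside the main lobe.
main_lobe = power[max(peak - 1, 0):peak + 2].sum()
leakage_fraction = 1.0 - main_lobe / power.sum()
```

A sizeable fraction of the energy ends up in the sidelobes; repeating the experiment with the frequency placed exactly on a bin makes the leakage essentially vanish.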
In an attempt to alleviate the inherent shortcomings of the FFT approach, many alternative spectral estimation procedures have been proposed. For instance, Kay and Marple (1981) discuss the improvement that may result from non-traditional methods such as the autoregressive (AR) method or the maximum entropy method (MEM). Claims have been made concerning the degree of improvement obtained in the spectral resolution and the signal detectability when these numerical techniques are applied to numerical data. These performance advantages, though, strongly depend upon the signal-to-noise ratio (SNR) of the input data, as might be expected. In fact, for low enough SNRs the 'modern' spectral estimates are often no better than those obtained with conventional FFT processing. Even in those cases where improved spectral fidelity is achieved by use of a different spectral estimation procedure, the computational requirements of that alternate method may be substantially higher than those of the FFT. This may make some spectral estimators unattractive for real-time implementation.

Schwarz (1978) has shown that CLEAN is a statistically correct method of least-squares fitting of sinusoidal functions to the observations, which is the conventional framework of the discrete Fourier transform. Fourier inversion of a finite representation of a continuous function leads to a frequency spectrum distorted by (1) the limited frequency resolution due to the finite time span of the data sample, and (2) spurious apparent responses that are caused by the incompleteness of the sampling. There is an intuitive way of handling the difficulties associated with the latter effect. With a view to practical applications, heuristic arguments are used in this discussion to make the discourse more easily accessible. The CLEAN algorithm performs, in essence, a non-linear deconvolution of the Fourier spectrum on the frequency axis, equivalent to a least-squares interpolation in the time domain.
The numerical method is particularly suitable for functions whose spectra are dominated by a few harmonic components at discrete frequencies. The CLEAN technique for time series spectral analysis (Roberts et al., 1987) works by subtracting, from the noisy or 'dirty' spectrum (the convolution of the 'true' spectrum with the 'dirty beam' generated by the finite number of data), the response expected from a sinusoidal component whose frequency corresponds to the maximum of the spectrum. In this way a residual spectrum is produced from which both the component and the spurious features due to sampling have been removed. This procedure of 'CLEANing' the spectrum is repeated on successive residual spectra until nothing is left but noise. The set of resulting 'clean components' is then used as a model for the time function. This model is convolved with a 'clean beam', which has the same resolution as the original 'spectral window' but no sidelobes. The convolution of the clean-components model with the clean beam serves to weight down the interpolated information at time instants outside those sampled. Finally, to preserve the noise level of the spectrum and to incorporate any components which were not resolved by CLEAN, the final residual spectrum is added to the convolved clean components to produce the 'clean spectrum'.

The one-dimensional version of the CLEAN deconvolution technique is especially useful for the spectral analysis of unequally spaced observations and provides a simple way to dispose of the artefacts of missing data, a situation frequently encountered in practical time series. As a first example, the application of CLEAN will be illustrated by means of the frequency analysis of a synthetic spectrum generated by a small number of harmonic components at discrete frequencies.
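The CLEANing loop described above can be sketched as follows. This is a deliberately simplified, real-valued illustration rather than the full complex algorithm of Roberts et al. (1987); the loop gain, stopping threshold, and synthetic triangular beam are arbitrary choices made for the example:

```python
import numpy as np

def clean_spectrum(dirty, beam, gain=0.1, n_iter=500, threshold=1e-6):
    """Simplified 1-D CLEAN sketch: repeatedly locate the strongest
    response in the residual spectrum and subtract a scaled, shifted
    copy of the dirty beam, accumulating the 'clean components'."""
    residual = np.asarray(dirty, dtype=float).copy()
    centre = int(np.argmax(np.abs(beam)))     # peak of the dirty beam
    components = []                           # (frequency bin, amplitude)
    for _ in range(n_iter):
        k = int(np.argmax(np.abs(residual)))  # strongest remaining peak
        if abs(residual[k]) < threshold:      # only noise left: stop
            break
        amp = gain * residual[k]              # loop gain damps the subtraction
        residual -= amp * np.roll(beam, k - centre)
        components.append((k, amp))
    return components, residual

# Toy check: a 'dirty' spectrum that is one component convolved with the beam.
n = 64
beam = np.zeros(n)
beam[30:33] = [0.5, 1.0, 0.5]               # synthetic beam, peak at bin 31
dirty = 2.0 * np.roll(beam, 5 - 31)         # true component of amplitude 2 at bin 5
components, residual = clean_spectrum(dirty, beam, gain=0.2, n_iter=1000)
recovered = sum(a for _, a in components)   # summed clean components at bin 5
```

In this noise-free toy case every clean component lands on the true frequency bin and their amplitudes sum to the true amplitude; in the full algorithm the final residual would then be added back to the restored components to form the clean spectrum.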
The method is also used in the search for periodic variability in the time series of the hourly values of the magnetic declination, measured in the observatory of Dourbes, Belgium, which is characterized by a spectrum having a low signal-to-noise ratio.
2. The Fourier spectrum of a continuous function
The basic recourse for the frequency analysis of an aperiodic or transient function is the Fourier integral (Bracewell, 1986). Here we follow the notation of Jenkins and Watts (1968). The spectrum of an analog function x(t), known for all time t, is given by the continuous Fourier transform (denoted by the operator $\mathcal{F}$) of x(t),

$$X(f) \equiv \mathcal{F}\{x(t)\} = \int_{-\infty}^{\infty} x(t)\, e^{-2\pi i f t}\, dt, \qquad -\infty \le f \le \infty, \qquad (1)$$

which defines the contribution of each frequency f to x(t). The inverse Fourier transform (denoted by $\mathcal{F}^{-1}$) is defined by the Riemann integral

$$x(t) \equiv \mathcal{F}^{-1}\{X(f)\} = \int_{-\infty}^{\infty} X(f)\, e^{2\pi i f t}\, df, \qquad -\infty \le t \le \infty. \qquad (2)$$
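As a numerical illustration of the transform pair (1)-(2), the forward integral can be approximated by a Riemann sum on a fine grid. The test function, grid spacing, and frequency range below are arbitrary choices; the Gaussian $e^{-\pi t^2}$ is used because it is its own Fourier transform, so the result can be checked against a known answer:

```python
import numpy as np

# Approximate the continuous transform (1) by a Riemann sum.
# Known pair: x(t) = exp(-pi t^2)  <->  X(f) = exp(-pi f^2).
dt = 0.01
t = np.arange(-8, 8, dt)
x = np.exp(-np.pi * t**2)

f = np.arange(-4, 4, 0.05)
# Forward transform evaluated at each frequency by direct summation.
X = np.array([np.sum(x * np.exp(-2j * np.pi * fk * t)) * dt for fk in f])

err = np.max(np.abs(X - np.exp(-np.pi * f**2)))
```

Because the Gaussian decays rapidly, truncating the integral at |t| = 8 and sampling at dt = 0.01 leaves the numerical transform in near-perfect agreement with the analytic one.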
Note that the Fourier operator F symbolises a linear transformation of the time axis onto the frequency axis and vice versa. If we assume here, as elsewhere, x(t) to be real, X(f) is in general a complex function of frequency and the Fourier transform satisfies the symmetry relationship
$$X(-f) = X^{*}(f), \qquad (3)$$

where the '$*$' represents complex conjugation. A rigorous demonstration shows that the validity of the integral (1) requires that x(t) be absolutely integrable:

$$\int_{-\infty}^{\infty} |x(t)|\, dt < \infty. \qquad (4)$$
Suppose that x(t) is a zero-mean continuous time function. Parseval's theorem for Fourier transforms, expressed as

$$\int_{-\infty}^{\infty} [x(t)]^2\, dt = \int_{-\infty}^{\infty} |X(f)|^2\, df, \qquad (5)$$

is a statement of the conservation of energy: the energy of the time domain signal is equal to the energy of the frequency domain transform. Note that the assumption of finite signal energy is a sufficient, but not a necessary, condition for the existence of the Fourier transform (1). This leads one to define the power spectrum P(f) of x(t) as the squared modulus of its Fourier transform:

$$P(f) = |X(f)|^2, \qquad -\infty \le f \le \infty. \qquad (6)$$

Thus, P(f) is an energy spectral density in that it represents the distribution of energy as a function of frequency. Since x(t) is supposed to be real, half of the power at a given |f| occurs at the negative frequency −f. Writing $P(f) = X(f)\, X^{*}(f)$, it is easily shown from equation (1) that

$$P(f) = \int_{-\infty}^{\infty} \gamma(\tau)\, e^{-2\pi i f \tau}\, d\tau, \qquad (7)$$

defining the power spectrum as the Fourier transform of the autocorrelation function γ(τ) of x(t):

$$\gamma(\tau) = \int_{-\infty}^{\infty} x(t)\, x(t+\tau)\, dt, \qquad -\infty \le \tau \le \infty. \qquad (8)$$
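Relations (3) and (5) can be verified numerically on a sampled signal. The sketch below (assuming NumPy; the Gaussian pulse is an arbitrary square-integrable example) compares the energy computed on the time axis with the energy of the transform:

```python
import numpy as np

# Discrete sanity check of Parseval's theorem (5) and the conjugate
# symmetry (3), using a Gaussian pulse as the example signal.
dt = 0.01
t = np.arange(-5, 5, dt)
x = np.exp(-t**2)

X = np.fft.fft(x) * dt                   # approximate continuous transform
df = 1.0 / (len(t) * dt)                 # spacing of the DFT frequency grid

energy_time = np.sum(x**2) * dt          # left-hand side of (5)
energy_freq = np.sum(np.abs(X)**2) * df  # right-hand side of (5)

# For real x the transform obeys X(-f) = X*(f), relation (3): in the DFT
# ordering, bin N-k is the conjugate of bin k.
symmetric = np.allclose(X[1:], np.conj(X[1:][::-1]))
```

The two energies agree to machine precision, which is just the discrete form of Parseval's identity.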
The Cauchy-Schwarz inequality for integrals states that

$$\left[ \int_{-\infty}^{\infty} x(t)\, x(t+\tau)\, dt \right]^2 \le \int_{-\infty}^{\infty} [x(t)]^2\, dt \cdot \int_{-\infty}^{\infty} [x(t+\tau)]^2\, dt = \left[ \int_{-\infty}^{\infty} [x(t)]^2\, dt \right]^2. \qquad (9)$$
Therefore the autocorrelation function is maximum at the origin:
$$|\gamma(\tau)| \le \gamma(0) \qquad \text{for all } \tau. \qquad (10)$$
Using the Parseval theorem (5) it is clear that

$$\gamma(0) = \int_{-\infty}^{\infty} [x(t)]^2\, dt = \int_{-\infty}^{\infty} P(f)\, df. \qquad (11)$$
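Relations (7), (10), and (11) can likewise be checked on a sampled example; the signal and grid below are arbitrary choices (any square-integrable transient would do):

```python
import numpy as np

# Numeric check of relations (7), (10), (11): the inverse transform of the
# power spectrum reproduces the autocorrelation, which peaks at the origin,
# and gamma(0) equals the total energy of the signal.
dt = 0.05
t = np.arange(-10, 10, dt)
x = np.exp(-np.abs(t))                 # square-integrable transient example

X = np.fft.fft(x) * dt
P = np.abs(X)**2                       # power spectrum, relation (6)
gamma = np.real(np.fft.ifft(P)) / dt   # autocorrelation via relation (7)

lag0_is_max = int(np.argmax(np.abs(gamma))) == 0   # relation (10)
energy = np.sum(x**2) * dt                         # gamma(0), relation (11)
```

Here the discrete (circular) autocorrelation stands in for the integral (8); because the example signal decays well inside the grid, the wrap-around contribution is negligible.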
It follows that the autocorrelation function of x(t) exists if the total energy γ(0) of x(t) is finite. For this reason the Fourier transform will only be applied to square-integrable (deterministic or transient) signals with finite energy. This implies that x(t) must approach zero for large values of t to prevent the total energy from becoming infinite.

A different viewpoint must be taken when x(t) is a wide-sense stationary, stochastic process rather than a deterministic, finite-energy waveform (Papoulis, 1965). The energy of such processes is usually infinite, so that the quantity of interest is the time average of the energy distribution with frequency. Also, integrals such as (1) normally do not exist for stochastic processes. For the case of stationary random processes, the autocorrelation function provides the basis for spectrum analysis, rather than the random process itself (Blackman and Tukey, 1959). An additional assumption often made is that the stochastic process is ergodic in the first and second moments. This property permits the substitution of time averages for ensemble averages. Here we will deal only with deterministic signals, since we want to keep the discussion as simple as possible without unnecessarily complicating the mathematical representations.

A concept of great importance both in theoretical work and in physical applications is the convolution integral

$$x(t) \otimes y(t) = \int_{-\infty}^{\infty} x(t')\, y(t-t')\, dt', \qquad (12)$$

where x(t) and y(t) are transient functions with Fourier transforms X(f) and Y(f), respectively. By a change of variable it is easy to see that x(t) ⊗ y(t) = y(t) ⊗ x(t) whenever the associated integrals converge. Then, the convolution theorem for Fourier transforms states that the Fourier transform of the convolution x(t) ⊗ y(t) is the product of the Fourier transforms of x(t) and y(t). In operator notation, the convolution theorem is expressed as follows:
$$\mathcal{F}\{x(t) \otimes y(t)\} = \mathcal{F}\{x(t)\} \cdot \mathcal{F}\{y(t)\} = X(f)\, Y(f). \qquad (13)$$
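A quick numerical confirmation of the convolution theorem (13), using arbitrary random example sequences; the inputs are zero-padded so that the circular convolution computed through the FFT coincides with the linear one:

```python
import numpy as np

# Discrete check of the convolution theorem (13): convolving on the time
# axis equals multiplying the transforms on the frequency axis.
rng = np.random.default_rng(0)
n = 256
x = rng.standard_normal(n)
y = rng.standard_normal(n)

direct = np.convolve(x, y)             # linear convolution, length 2n - 1
# Zero-pad to length 2n so the FFT product gives the linear convolution.
via_fft = np.real(np.fft.ifft(np.fft.fft(x, 2 * n) * np.fft.fft(y, 2 * n)))
via_fft = via_fft[: 2 * n - 1]

max_err = np.max(np.abs(direct - via_fft))
```

The two results agree to rounding error, which is precisely why convolutions are routinely evaluated through the frequency domain.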
Also note that

$$x(t) \otimes y(t) = \mathcal{F}^{-1}\{X(f)\, Y(f)\}, \qquad (14)$$

which means that the convolution of two functions and the product of their Fourier transforms are Fourier pairs. The principal utility of Fourier theory is that certain operations that are difficult on the time axis become very simple in terms of frequency. For instance, in the frequency domain the convolution of x(t) and y(t) reduces to a multiplication of the Fourier transforms of the two original functions. The convolved response is then obtained as the inverse Fourier transform of this product. The simplicity and convenience of this result is the main reason for the extensive use of Fourier techniques in geophysics.

Suppose that we have measured a finite portion of a continuous trace which is subjected to a sampling procedure with constant time interval ∆t. The impression might arise that greater detail in the Fourier spectrum may be expected by processing more data in a fixed interval of finite length T, that is, by letting ∆t → 0 with T constant. Fortunately this is not true, so that we are not forced to refine our measurements without end in order to obtain stable spectral estimates. There is indeed a duality between time, whose dimension is [t], and frequency, whose dimension is 1/[t]. A precise determination of a signal in frequency requires a long time lapse for the time function, i.e., T → ∞, and conversely, a precise determination in time demands a broad frequency band. This well-known fact is now demonstrated formally.

We assume that only a finite portion of a real, zero-mean, continuous record x(t) is known in the interval [0, T] and take x(t) to be identically zero outside this interval. It is further supposed that x(t) is square integrable with variance