
JANUARY 1999 DABAS 19

Semiempirical Model for the Reliability of a Matched Filter Frequency Estimator for Doppler Lidar

ALAIN DABAS
Météo-France, Centre National de Recherches Météorologiques, Toulouse, France
(Manuscript received 10 February 1997, in final form 22 December 1997)

ABSTRACT

The author proposes a heuristic semiempirical model for predicting the reliability of a matched-filter frequency estimator applied to Doppler lidar. The model is tuned by a single coefficient β empirically related to the ratio of the number of samples per estimate over the number of speckles. It can deal with any signal characteristics (spectrum width, number of samples, etc.) as well as any factor of accumulation.

1. Introduction

Due to the limits imposed by the available technology and the weak backscatter coefficient of the atmosphere, coherent atmospheric Doppler lidars often operate at a low signal-to-noise ratio (SNR). Under such conditions, Doppler lidars may occasionally produce "bad" measurements, which are characterized by a uniform distribution over the search band. Such bad measurements originate from the noise generated by random peaks in the signal that are mistakenly interpreted by the frequency estimator as atmospheric echoes. Not only are they deprived of any useful information on the dynamics of sounded atmospheric volumes but they also can be very dangerous for the meteorological analysis in charge of exploiting the measurements for scientific purposes (for instance, weather prediction). Even in small numbers, the bad measurements can result in large errors in retrieved meteorological fields. Dedicated procedures of quality control (QC) are indeed implemented in analysis systems to avoid this. Based on a contextual approach (measurements that are too different from all others in the neighborhood are removed) or related to the frequency estimator itself (Rye and Hardesty 1997), the QC procedures can detect a significant fraction of the bad measurements. But their performances are nevertheless limited: on the one hand, the QC procedures leave some of the bad measurements undetected and, on the other hand, remove some "good" measurements. If the original dataset contains too much bad data, the QC procedure either removes almost all of the data or leaves a significant number of bad measurements. Either way, the controlled dataset is useless. It follows that a high reliability is required for measured data. For a spaceborne application, for instance, the data specification typically ranges from 0.5 to 0.95, or even better (Kavaya 1995; Marini and Culoma 1995; Baker et al. 1995; ESA 1996). This is a major constraint for the design of the Doppler lidar. To treat it properly, a precise knowledge of how frequency estimators behave at low SNR regimes is required.

To answer this need, several studies have been conducted. In Frehlich and Yadlowsky (1994), generic formulas are provided for predicting the probability of bad measurements as a function of SNR. The formulas contain three parameters that depend on the frequency estimator and vary as a function of the signal characteristics. The parameters are determined empirically and tabulated. Though the model is accurate and useful, it has several limitations. First, it cannot work when the signal characteristics are outside the range resolved by Frehlich and Yadlowsky since the tuning parameters cannot be extrapolated from the table. Also, though accumulation is considered by Frehlich and Yadlowsky (1994), no model parameters are given for it. Yet, accumulation is often considered in lidar projects because it is one of the most efficient ways to improve accuracy (Rye and Hardesty 1993a,b); moreover, high repetition rates are now available with solid-state lasers, thus permitting one to accumulate a large number of signals in short periods of time (Frehlich et al. 1997). The impact of accumulation on reliability is studied specifically in Frehlich (1996). Scaling (power) laws are provided, giving the dependence on the accumulation factor of the SNR threshold that is required to achieve a given reliability (0.9 and 0.5 in the paper). However, the parameters necessary for the computation of the reliability as a function of SNR are not given, preventing extrapolation to other reliability thresholds.

Corresponding author address: Dr. Alain Dabas, Météo-France, CNRM/GMEI/LISA, 42, av. G. G. Coriolis, F-31057 Toulouse, Cedex, France.
E-mail: [email protected]

© 1999 American Meteorological Society

JOURNAL OF ATMOSPHERIC AND OCEANIC TECHNOLOGY VOLUME 16

This paper proposes a heuristic model predicting the probability that Levin's (1965) matched filter makes bad measurements. The input parameters are the SNR, the main signal characteristics (spectrum width and length), and the accumulation factor. The model is derived from a heuristic analysis of the mechanism responsible for the bad measurements. It is tuned by a single coefficient related to the input parameters. It thus can adapt to any set of signal characteristics. In principle, it is restricted to Levin's frequency estimator, but since all "best-performing" estimators are nearly equivalent with regard to reliability (see Frehlich 1996), it should be possible to use it for other estimators. In any case, it gives useful information on the reliability one must expect for a given lidar system under specified conditions.

The paper begins in section 2a with a presentation of the assumptions underlying the model and a simplification of the signal periodogram. A formula giving the probability of a bad estimate follows in section 2b. It contains an unknown parameter denoted β, whose value is determined empirically in section 3c for various signal characteristics. An empirical formula is derived afterward to relate β to the only signal parameter identified as relevant: the ratio of the number of samples per estimate M to the number of speckles μ (section 3d). The capacity of the model to predict reliability outside the range of processing conditions examined by Frehlich and Yadlowsky is investigated in sections 3e and 3f.

2. Semiempirical model

a. Simplified model for the periodogram

The basic assumptions made throughout the article are 1) lidar signals form a Gaussian process, 2) they are stationary, and 3) they are contaminated by white noise. These are usual assumptions in radar or lidar signal processing studies. The Gaussian nature of the process is well verified in reality because the signal results from the addition of a great number of backscattered "wavelets." Stationarity, however, is never perfectly met because the signal power decreases with range and the optical and dynamic properties of the atmosphere are never purely homogeneous. Nevertheless, stationarity can be approximated when the measurements are made at long ranges (then the range dependence of the signal power can be neglected) in homogeneous atmospheric volumes. And the white color of the noise can be obtained in reality, provided the analog chain of the lidar is designed carefully.

In the frequency domain, the basic assumptions mentioned above result in a signal periodogram that is composed of a peak that rests on top of a floor of noise. The peak represents the atmospheric signature, that is, the useful part of the signal. The uniform noise level corresponds to the white noise.

Under the previous assumptions, the signal power p averaged over a range gate follows approximately a chi-square distribution (Goodman 1985):

\[ \mathrm{prob}(p) = \frac{m^{m} p^{m-1}}{\Gamma(m)\,(S+N)^{m}} \exp\left(-m\,\frac{p}{S+N}\right), \tag{1} \]

where S and N are the mean powers of the atmospheric return and noise, respectively, and m is given by

\[ \frac{1}{m} = \frac{1}{M}\left(\frac{1}{1+S/N}\right)^{2} + \frac{1}{\mu}\left(\frac{S/N}{1+S/N}\right)^{2}. \tag{2} \]

Here, M is the number of signal samples contained in the processing gate. An interesting mathematical property of the chi-square distribution (1) is

\[ m = \frac{(S+N)^{2}}{\sigma_{p}^{2}}, \tag{3} \]

which provides a useful experimental way of measuring m, consisting of the estimation of the first- and second-order moments of p (Ancellet et al. 1989).

With no noise (N = 0), m reduces to μ, so μ is the inverse of the fractional variance of speckle-induced power fluctuations. It is therefore often called the number of speckles and can be interpreted as the number of independent realizations of the speckle effect inside the range gate. It can be related (Rye 1995) to the normalized autocorrelation (AC) function of the atmospheric echo γ(t) = E[s*(x)s(x + t)]/S by

\[ \mu = \frac{M}{1 + 2\sum_{k=1}^{M-1}\left(1-\frac{k}{M}\right)|\gamma(k)|^{2}}. \tag{4} \]

An important consequence for lidar applications is that μ can be considered a system parameter because the AC function is primarily related to the transmitted laser pulse (Churnside and Yura 1983), with wind and atmospheric inhomogeneities generally playing a minor and negligible role.

In the frequency domain, the frequency components of the periodogram are exponentially distributed. Due to the finite length of the range gate, they are correlated. According to Zrnic (1980), the correlation can be rather high (50% or more), especially when working with short range gates, but it decreases with noise. Since bad measurements arise at low SNRs, in the following, we make the basic assumption that the spectral correlation can be neglected. As we see and discuss in section 3c, this is the probable cause of the major limitation of the model. We nevertheless use it since it offers the useful possibility of deriving closed-form mathematical expressions and, most importantly, since it leads to an efficient model.

To facilitate mathematical derivations, we now simplify the shape of the periodogram. We transform the atmospheric peak into a rectangle (Fig. 1) of height S/μ, extending over μ frequency bins. Apart from the


mathematical simplifications it brings, the main justification for this transformation is to keep the statistics of p unchanged. Under the assumption that the frequency components are decorrelated, the statistical distribution of the averaged range gate power p of the simplified periodogram is still as given in (1). Normalizing all spectral powers to N/M, the simplified periodogram is now composed of (M - μ) noise components, each with a mean of 1, and μ components mixing noise and signal, each with a mean of δ(Φ) = (M/N)(S/μ + N/M) = 1 + Φ/μ. Here, Φ = M(S/N) is the "photocount" equal to the number of photons coherently detected by the lidar (Frehlich and Yadlowsky 1994).

FIG. 1. Schematic representation of the simplified periodogram. It is made of M frequency components of independent statistics. Among them, M - μ are only noise. The height is N (the total noise energy) over M. The remaining μ components are a mixture of noise and the atmospheric echo. They build a rectangular shape of height N/M + S/μ = (N/M)(Φ/μ + 1), where S is the energy of the atmospheric echo, Φ = M(S/N), and μ is the number of speckles within the range gate.

b. Probability of good detection

The adaptive filter matched to the transformed signal has a rectangular shape of width μ/(MTs) (that is, μ frequency bins) and depth δ⁻¹(Φ). The principle of the adaption of the matched filter is shown in Fig. 2. The adaption consists of tuning the position of the matched filter over the frequency band until the output power is minimum or, conversely, until the maximum amount of

FIG. 2. A schematic representation showing the way the matched filter is tuned over the frequency band in order to detect the atmospheric echo. (a) The shape of the matched filter (rectangular). (b) The random periodogram (M = 32, atmospheric echo between frequency components 17 and 21). (c) The filter is positioned on the atmospheric echo. The shaded components are the periodogram of the filtered signal. The open bars in the background show the periodogram of the original signal. The space in between the shaded and open bars represents the amount of filtered energy. (d) The filter is positioned off the atmospheric return.
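The position search illustrated in Fig. 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in the paper: the function name `locate_echo` is hypothetical, the filter width is taken as an integer number of bins, the search is not circular, and the constant multiplicative factor of the filter depth is ignored, so the removed power is simply the sum of the periodogram components under the rectangular window.

```python
from typing import List, Tuple


def locate_echo(periodogram: List[float], width: int) -> Tuple[int, float]:
    """Slide a rectangular window of `width` bins over the periodogram and
    return (start_index, removed_power) for the position removing most power."""
    if not 0 < width <= len(periodogram):
        raise ValueError("width must be between 1 and len(periodogram)")
    window = sum(periodogram[:width])  # power under the filter at position 0
    best_start, best_power = 0, window
    for start in range(1, len(periodogram) - width + 1):
        # Running sum: add the bin entering the window, drop the bin leaving it.
        window += periodogram[start + width - 1] - periodogram[start - 1]
        if window > best_power:
            best_start, best_power = start, window
    return best_start, best_power


# Illustrative periodogram (made-up values): M = 32 bins at unit noise level,
# echo on components 17 to 21 (zero-based indices 16 to 20), as in Fig. 2.
spectrum = [1.0] * 32
spectrum[16:21] = [3.0, 5.0, 6.0, 5.0, 3.0]
print(locate_echo(spectrum, width=5))  # prints (16, 22.0)
```

With a noisy periodogram, a bad estimate in the sense of the paper corresponds to the case where some window away from the echo happens to hold more power than the window on the echo.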


power is removed. It is then assumed that the filter is located on the atmospheric echo. In Fig. 2, the rectangular shape of the matched filter is shown in graph a. Graph b displays the 32 frequency components of a random periodogram. The atmospheric echo extends over the frequency components 17 to 21. Graph c shows the power removed from the signal when the filter is located on the atmospheric echo. The periodograms before and after filtering are displayed with open and shaded bars, respectively. The amount of filtered power is represented by the differences between the bars. Graph d is the same as graph c but the filter has been moved off the atmospheric echo.

FIG. 3. Probability of a bad estimate in the case Ω = 1, M = 64, and n = 1. The circles are the probabilities derived from the numerical simulations. The solid line gives the prediction of Frehlich and Yadlowsky's (1994) model. The subpanel (log-log plot) zooms on the region of small probabilities of about 1%.

When located directly upon the atmospheric echo, the amount of power removed from the signal is equal to

\[ \Delta p_{1} = \sum_{k} p_{k}, \tag{5} \]

where the summation extends over the μ frequency components p_k forming the atmospheric echo (k = 17 to 21 in Fig. 2). For the sake of simplicity, here we have neglected the irrelevant multiplicative term δ⁻¹(Φ). The probability of Δp1 is thus equal to the convolution of μ negative exponential distributions, each with a mean of δ(Φ). The probability is therefore given by

\[ \mathrm{prob}(\Delta p_{1}) = \frac{1}{\Gamma(\mu)} \frac{\Delta p_{1}^{\mu-1}}{\delta(\Phi)^{\mu}} \exp\left(-\frac{\Delta p_{1}}{\delta(\Phi)}\right). \tag{6} \]

As in graph d, when the filter is moved off the atmospheric echo, the amount of power removed from the signal is given by

\[ \Delta p_{2} = \sum_{j} p_{j}, \tag{7} \]

where the summation extends over μ frequency components of noise. The probability of Δp2 is equal to the convolution of μ negative exponential distributions of average 1. The probability is thus given by

\[ \mathrm{prob}(\Delta p_{2}) = \frac{1}{\Gamma(\mu)} \Delta p_{2}^{\mu-1} \exp(-\Delta p_{2}). \tag{8} \]

There is a good frequency estimation when Δp1 > Δp2 whatever the position of the filter off the atmospheric echo. Conversely, there is a bad estimation when there is a position for the filter off the atmospheric echo where Δp2 > Δp1. The probability that Δp1 > Δp2, whatever the position of the filter away from the atmospheric echo, cannot be derived analytically. There are many possible positions for the filter, and most of them are overlapping, so the quantities of energy they remove are partially correlated. Nevertheless, we will assume that the full probability (i.e., the probability for one estimation to be reliable) is of the form

\[ \mathrm{prob} = \int_{0}^{+\infty} \mathrm{prob}(\Delta p_{1}) \left[ \int_{0}^{\Delta p_{1}} \mathrm{prob}(\Delta p_{2})\, d\Delta p_{2} \right]^{\beta} d\Delta p_{1}, \tag{9} \]

where β is a tunable coefficient somehow related to the number of possible positions for the filter outside of the atmospheric echo that are independent from the point of view of the removed power. We, therefore, expect it to be a function of the ratio M/μ.

Considering (6) and (8), the probability of a reliable estimation (9) can be expressed as follows:

\[ \mathrm{prob}(\Phi) = \frac{1}{\Gamma(\mu)} \int_{0}^{+\infty} x^{\mu-1} \exp(-x)\, F^{\beta}[\delta(\Phi)x, \mu]\, dx, \tag{10} \]

where

\[ F(x, \mu) = \frac{1}{\Gamma(\mu)} \int_{0}^{x} y^{\mu-1} \exp(-y)\, dy \tag{11} \]

is the incomplete gamma function [F(x, μ) → 1 when x → ∞].

c. Accumulation

Equation (10) is valid in the single-shot case only, that is, when there is no accumulation. Accumulation does not modify the shape of the periodogram: the simplified periodogram still holds. But it changes the statistics. Assuming accumulated signals are representative of the same measurement (the atmosphere has not changed between lidar shots), each periodogram component results from the summation of n exponentially distributed frequency components (n is the number of accumulated signals). The number of speckles is thus no longer μ but μ × n. Consequently, μ must be replaced by μ × n in (6) and (8), giving


\[ \mathrm{prob}(\Phi) = \frac{1}{\Gamma(\mu n)} \int_{0}^{+\infty} x^{\mu n-1} \exp(-x)\, F^{\beta}[\delta(\Phi)x, \mu n]\, dx. \tag{12} \]

3. Numerical validation

a. Test dataset

For validation, we compare the output of the semiempirical model (12) with estimates of the true probability of bad measurements derived from numerical simulations. Lidar signals are simulated using the procedure described by Frehlich and Yadlowsky (1994, 1229), that is, they are characterized by a Gaussian AC function γ(t) = exp(-2π²w²t²), where w is the signal spectrum width [that is, it is the 1-σ width of the Fourier transform of γ(t)]. For each set of signal parameters (M, Ω = MwTs, n) [we use here the nondimensional parameter Ω introduced by Frehlich and Yadlowsky (1994)], we first simulate 7500 × n noiseless signals. Then, we add random noise sequences so as to obtain a photocount level Φ ranging from 0.1 to 1000 (20 values per decade). After that, the signals are processed using Levin's estimator, which results in 7500 independent frequency estimates. Finally, the probability of bad measurement is estimated with the ratio b̂(Φ) = n̂(Φ)/3750, where n̂(Φ) is the number of frequency estimates falling outside the interval [-Fs/4, Fs/4] (Fs = 1/Ts is the sampling frequency), a region assumed to contain only bad measurements. The associated statistical uncertainty can be evaluated as follows. Denoting b(Φ) the true probability of bad measurement, the 1-σ uncertainty σ_b(Φ) on b̂(Φ) is

\[ \sigma_{b} = \{ b(\Phi)[2 - b(\Phi)]/7500 \}^{1/2}. \tag{13} \]

It is equal to 0.01 when b(Φ) = 0.5, 0.005 when b(Φ) = 0.1, and 0.0016 when b(Φ) = 0.01.

Regarding signal characteristics, we considered three different lengths, M = 32, 64, and 128, and for each M, five different spectrum widths w yielding Ω = 0.2, 0.5, 1, 2, and 5. For accumulation, we considered at first the three factors n = 1, 5, and 10 (higher accumulation factors are studied further herein). These parameters were chosen so as to cover the entire range of signal characteristics treated by Frehlich and Yadlowsky (1994) and to thereby produce results comparable to theirs. Another concern was to cover most current lidar applications. According to Huffaker and Hardesty (1996), two kinds of systems are generally envisaged: 1) high-repetition-rate, low-energy solid-state lidars, and 2) high-energy, low-repetition-rate CO2 lidars. Solid-state lasers deliver short pulses that are approximately 200 ns in duration (Frehlich et al. 1994). Measured signals are processed over small range gates of approximately 30 m, which results in Ω of the order of 0.2 (Frehlich 1996) and M on the order of 32 or less. For CO2 systems, however, the spectrum width w is on the order of 250 kHz, so for a range gate of about 500 m, we have Ω ~ 1. The number of samples is then of the order of 100 or more. Note that for a CO2 lidar the AC function is not Gaussian because the pulse is composed of a peak (the gain-switched spike) and a tail and, furthermore, is chirped (Willets and Harris 1982). The spectral width w considered here is the width (1-σ) of the main peak of the pulse power spectrum.

As mentioned previously, frequency estimates were made with Levin's adaptive filter. The principle of the estimation is shown in section 2b. For the practical implementation, we refer to Frehlich and Yadlowsky (1994). The adaptive filter is matched to the signal; that is, it is equal to the inverse of the expected signal periodogram:

\[ \Psi(f) = \left[ 1 + \frac{S}{N} \sum_{k=1-M}^{M-1} \gamma(kT_{s}) \left(1 - \frac{|k|}{M}\right) \exp(-2i\pi f k T_{s}) \right]^{-1}. \tag{14} \]

Its derivation requires the knowledge of S/N as well as of the AC function γ(t). In the frame of our simulation work, these are known in advance. For practical applications, however, the ratio S/N must be estimated, and the AC function is deduced from the transmitted laser pulse.

b. Frehlich and Yadlowsky's model

Figure 3 is an example of probabilities derived from the numerical simulations (circles) (called "experimental" probabilities in the following). The signal parameters are M = 64, Ω = 1, and n = 1. The solid line is the best fit achieved by Frehlich and Yadlowsky's (1994) model

\[ \mathrm{prob}(\Phi) = 1 - \left[ 1 + \left(\frac{\Phi}{b_{0}}\right)^{\alpha} \right]^{-\gamma}. \tag{15} \]

The three tuning parameters b0, α, and γ have been determined by a least squares (LS) fit to the b̂(Φ) probabilities, the LS fit being restricted to the range of interest [0, 0.5]. Within this range, the maximum deviation from the b̂(Φ) probabilities is 0.0154, which is within the statistical uncertainty. [The worst deviation occurs at Φ = 7.94 for b̂(Φ) = 0.35, so assuming the uncertainty on b̂(Φ) is Gaussian, the 95% confidence interval is ±1.96σ_b(Φ) = ±0.0172.]

The parameters b0, α, and γ for the other sets of signal characteristics (M, Ω, n) are listed in Table 1 (columns 4-6) with, in the farthest right column, the maximum difference between the model and the experimental data (in the probability range [0, 0.5]).

It must be noted that in Table 1 three configurations (M = 32, Ω = 5, and n = 1, 5, and 10) are not filled. The reason is that the procedure described in section 3a for estimating the true probability of bad measurement fails because the spectrum width w = 0.16Fs is so large


that the interval (-Fs/2, -Fs/4) ∪ (Fs/4, Fs/2) contains a significant number of "good" measurements.

TABLE 1. Coefficients b0, α, and γ of Frehlich and Yadlowsky's (1994) model for various lidar configurations. The configurations (numbered in column 4) are characterized by Ω = wMTs, M, and n, where w is the spectrum width, M is the number of samples, Ts is the sampling period, and n is the number of accumulations. The farthest column to the right gives the maximum absolute difference between the model and the reliabilities drawn from the numerical simulations.

Ω     M     n     b0         α        γ          ε (%)
0.2   32    1     10.5487    0.9968   0.8149     1.74
0.2   32    5     3.6656     1.5004   3.1804     1.31
0.2   32    10    4.2768     1.5129   7.4994     0.70
0.2   64    1     8.6438     1.1645   1.3700     1.15
0.2   64    5     4.6366     1.5262   3.5778     1.01
0.2   64    10    2.5294     1.8278   3.5821     1.07
0.2   128   1     14.6030    1.0297   0.8239     1.59
0.2   128   5     5.1184     1.6231   3.6391     1.51
0.2   128   10    2.5300     2.0119   3.1807     1.17
0.5   32    1     12.4548    1.1192   2.5115     1.10
0.5   32    5     4.7257     1.5505   4.3477     2.18
0.5   32    10    7.1844     1.5621   14.8447    0.71
0.5   64    1     13.6491    1.1870   2.3477     0.61
0.5   64    5     6.4243     1.5715   5.4572     0.92
0.5   64    10    3.7718     1.8431   5.9235     0.89
0.5   128   1     9.7333     1.4787   1.4910     1.75
0.5   128   5     6.1920     1.7404   4.8057     0.63
0.5   128   10    3.5495     1.9516   4.6757     0.91
1     32    1     13.0174    1.2439   2.9918     0.90
1     32    5     6.2203     1.5579   5.7404     0.93
1     32    10    18.6578    1.5143   48.6942    1.85
1     64    1     23.5632    1.1830   4.4504     1.54
1     64    5     34.6139    1.4633   45.7161    0.73
1     64    10    8.6974     0.7026   16.5492    0.87
1     128   1     14.1697    1.4236   2.4721     0.98
1     128   5     6.5190     1.8733   4.9069     0.53
1     128   10    9.4162     1.8492   19.2016    0.66
2     32    1     34.9263    0.1656   8.0086     1.32
2     32    5     110.0779   1.2958   152.2013   1.41
2     32    10    82.9645    1.3850   244.5638   1.92
2     64    1     27.0213    1.2894   5.6965     0.61
2     64    5     31.5344    1.4887   37.0258    0.87
2     64    10    49.2995    1.5274   144.8053   1.34
2     128   1     21.5424    1.4392   4.0429     0.77
2     128   5     10.3808    1.7950   8.2792     0.78
2     128   10    19.9189    1.7657   47.2925    1.15
5     32    1
5     32    5
5     32    10
5     64    1     50.0301    1.2937   11.2577    1.72
5     64    5     96.7304    1.3705   104.3107   1.77
5     64    10    59.9064    1.4199   106.3975   1.89
5     128   1     31.3233    1.4679   5.9985     0.86
5     128   5     94.3403    1.5361   128.5127   1.36
5     128   10    34.7813    1.6506   66.1202    1.34

c. Validation

The accuracy of the semiempirical model (12) is investigated in Fig. 4. Predicted (solid line) and experimental (circles) probabilities are compared in the particular case M = 64, Ω = 1, and n = 1 (as in Fig. 3). The best fit (computed over the probability range [0, 0.5] as usual) is obtained for β = 23.7. To facilitate the comparison, the figure contains two dashed lines of equation prob(Φ) ± 1.96{prob(Φ)[2 - prob(Φ)]/7500}^{1/2}. Conditionally to b(Φ) = prob(Φ), they delineate the 95% confidence interval on b̂(Φ).

FIG. 4. Same as Fig. 3 but the solid line shows the probabilities predicted by the semiempirical model. To facilitate the comparison with experimental probabilities (circles), two dashed lines of equation prob(Φ) ± 1.96{prob(Φ)[2 - prob(Φ)]/7500}^{1/2} are drawn. They delineate the 95% confidence interval of experimental probabilities conditionally to the hypothesis that the true probability is prob(Φ).

The parameter β for the other configurations is listed in Table 2 (column 4) with the corresponding maximum error of prediction (column 5). The errors are slightly greater than for Frehlich and Yadlowsky's model (the average error is now 0.0151) but nevertheless remain of the same order of magnitude. A careful analysis shows that in most cases the errors cannot be considered statistically significant. For instance, Fig. 5 compares predicted and experimental probabilities in the (worst) case M = 128, Ω = 0.2, and n = 5. All experimental probabilities b̂(Φ) are inside the confidence interval. The only problem is for the three configurations M = 32, 64, and 128, Ω = 0.2, and n = 1. As illustrated in Fig. 6, the model is then too pessimistic since it predicts probabilities that are greater than real probabilities. A possible explanation for this relates to the assumed noncorrelation of the signal frequency components, which would not be valid here because, as the figure shows, bad measurements arise at rather large Φ levels. This explanation is consistent with the fact that the model works well for the same signal characteristics but larger accumulation factors. Then the photocount required for the occurrence of bad measurements is much lower.

d. Tunable parameter β

In Fig. 7, coefficient β is plotted versus the ratio M/μ (triangles for n = 1, circles for n = 5 and squares


for n = 10). The figure confirms that the single parameter relevant to β is M/μ, since there is no noticeable dependence of β on n (except for the three configurations M = 32, 64, and 128, Ω = 0.2, and n = 1 already identified as problematic). A second-order polynomial relationship is suggested between log(M/μ) and log(β):

\[ \log(\beta_{m}) = 0.0586\,\log^{2}(M/\mu) + 0.7622\,\log(M/\mu) + 0.6428, \tag{16} \]

where log( ) designates the natural logarithm.

TABLE 2. Coefficients β tuning the semiempirical model (12). Columns 1-4 are the same as in Table 1. In column 5, β is derived from a least squares fit of (12) to experimental reliabilities. In column 7, β is derived from (16) (βm). In columns 6 and 8, ε gives the maximum difference between the model and the probabilities of bad estimate drawn from the simulations. The maximum is on the probability range [0, 0.5].

Ω     M     n     β         ε (%)   βm (16)   εm (%)
0.2   32    1     35.81     2.01    41.76     2.22
0.2   32    5     42.15     1.91    41.76     1.81
0.2   32    10    43.02     1.56    41.76     1.76
0.2   64    1     71.15     1.91    94.81     2.28
0.2   64    5     98.56     1.83    94.81     2.06
0.2   64    10    96.2596   1.64    94.81     1.74
0.2   128   1     153.58    2.26    227.71    2.45
0.2   128   5     225.26    2.44    227.71    2.54
0.2   128   10    235.6     1.79    227.71    1.96
0.5   32    1     21.30     1.40    22.72     1.24
0.5   32    5     22.62     1.75    22.72     1.77
0.5   32    10    24.60     1.32    22.72     1.54
0.5   64    1     46.64     0.90    49.32     0.98
0.5   64    5     50.27     1.82    49.32     1.91
0.5   64    10    51.61     1.17    49.32     1.24
0.5   128   1     101.22    1.57    113.28    1.97
0.5   128   5     114.71    1.27    113.28    1.36
0.5   128   10    119.60    1.89    113.28    1.57
1     32    1     12.50     0.89    12.29     0.93
1     32    5     12.63     1.15    12.29     1.05
1     32    10    12.97     2.16    12.29     1.94
1     64    1     23.70     1.76    25.42     2.55
1     64    5     26.92     1.94    25.42     2.17
1     64    10    27.29     1.47    25.42     1.71
1     128   1     48.67     1.09    55.66     2.31
1     128   5     57.47     1.23    55.66     1.36
1     128   10    55.77     1.04    55.66     1.01
2     32    1     6.98      1.57    6.57      1.71
2     32    5     6.32      0.99    6.57      1.42
2     32    10    6.37      1.43    6.57      1.71
2     64    1     12.77     0.85    12.90     0.97
2     64    5     12.83     1.33    12.90     1.32
2     64    10    12.59     1.75    12.90     1.86
2     128   1     25.09     1.07    26.79     1.76
2     128   5     25.97     1.25    26.79     1.23
2     128   10    24.44     1.55    26.79     1.79
5     32    1
5     32    5
5     32    10
5     64    1     5.39      1.90    5.49      2.04
5     64    5     5.56      1.28    5.49      1.34
5     64    10    5.40      1.51    5.49      1.61
5     128   1     10.22     1.06    10.60     1.57
5     128   5     10.83     1.35    10.60     1.02
5     128   10    10.61     1.33    10.60     1.35

FIG. 5. Same as Fig. 4 but M = 128, Ω = 0.2, and n = 5. Although reaching nearly 2.5%, the discrepancies between experimental (circles) and predicted (solid line) probabilities cannot be considered significant: experimental probabilities are inside the confidence interval.

FIG. 6. Same as Fig. 4 for the case M = 32, Ω = 0.2, and n = 1. The semiempirical model fails: it overestimates the true probabilities.

Corresponding βm values are drawn with a solid line in Fig. 7 and listed in column 6 in Table 2. In column 7 of Table 2, we report the errors of prediction (that is, the maximum discrepancy between model and experiment in the probability range [0, 0.5]) achieved by the semiempirical model with βm. The average error is now 0.0167 (1.67%), that is, slightly greater than previously with the fitted β values (the increase is +0.0016). Here again, except for the three cases M = 32, 64, and 128, Ω = 0.2, and n = 1 already discussed, the errors cannot be considered statistically significant. For instance, the configuration leading to the greatest error (0.0255), which is M = 64, Ω = 1, and n = 1, is detailed in


Fig. 8. We see that all experimental probabilities (circles) are inside the 95% confidence interval.

FIG. 7. Coefficient β vs M/μ. Accumulation factors n = 1, 5, and 10 are indicated with triangles, circles, and squares, respectively. The solid line shows (16) empirically relating β to the ratio M/μ.

e. Long-range gates

We investigate here the extended range of application the semiempirical model has over Frehlich and Yadlowsky (1994). We compare modeled and experimental probabilities obtained for a long-range gate. This is the object of Fig. 9, corresponding to M = 380, Ω = 1, and n = 1: that is, a typical spaceborne application with a spectrum width w of about 250 kHz, a range gate of length 570 m (that is, a vertical resolution of 500 m, assuming a pointing angle 30° off nadir), and a sampling period of 10 ns. Equations (4) and (16) give β = 213.4. This good agreement confirms the ability of the model to operate on long signal sequences.

FIG. 9. Application of the semiempirical model to address the configuration M = 380, Ω = 1, and n = 1. The agreement with experimental probabilities (circles) is very good. This configuration could not be addressed with the Frehlich and Yadlowsky (1994) model because M is here much larger than the largest length (128) covered by Frehlich and Yadlowsky in their paper.

f. Large accumulation

Large accumulation factors are now considered. First, in Fig. 10, we compare modeled (solid line) and experimental probabilities (circles) in the case M = 64, Ω = 1, and n = 100. The agreement is good between the two curves, showing that the model still operates accurately.

The validity of the model for variable accumulation

FIG. 8. Same as Fig. 4 (M = 64, Ω = 1, and n = 1), but coefficient β is now given by (16).

FIG. 10. Application of the semiempirical model to address the configuration M = 64, Ω = 1, and n = 100. The agreement with experimental probabilities (circles) is good.
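As a check on how (4) and (16) combine, the following Python sketch computes the number of speckles μ and the coefficient βm for a given (M, Ω). It assumes the Gaussian AC function of section 3a sampled at t = kTs, so that |γ(kTs)|² = exp[-4π²(Ω/M)²k²]; the function names `speckle_count` and `beta_model` are hypothetical and not taken from the paper.

```python
import math


def speckle_count(M: int, omega: float) -> float:
    """Number of speckles mu from Eq. (4), assuming the Gaussian AC function
    of section 3a: |gamma(k Ts)|^2 = exp(-4 pi^2 (omega / M)^2 k^2)."""
    a = 4.0 * math.pi ** 2 * (omega / M) ** 2
    s = sum((1.0 - k / M) * math.exp(-a * k * k) for k in range(1, M))
    return M / (1.0 + 2.0 * s)


def beta_model(M: int, omega: float) -> float:
    """Tuning coefficient beta_m from Eq. (16), using natural logarithms."""
    x = math.log(M / speckle_count(M, omega))
    return math.exp(0.0586 * x * x + 0.7622 * x + 0.6428)


for M, omega in [(32, 0.2), (64, 1.0), (380, 1.0)]:
    print(M, omega, round(speckle_count(M, omega), 3), round(beta_model(M, omega), 2))
```

Under these assumptions the results should land close to the βm values of Table 2 (41.76 for M = 32, Ω = 0.2 and 25.42 for M = 64, Ω = 1) and to the value β = 213.4 quoted in section 3e for M = 380, Ω = 1.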


factors is investigated further in Figs. 11 and 12. Fig. 11 plots the photocount $\Phi_1^K$ that single signals must have in order to guarantee a probability of bad measurement lower than K%. The upper curve is for $K = 10$, and the lower one is for $K = 50$. Circles and triangles are derived from the semiempirical model, whereas dashed and solid lines correspond to the scaling laws provided by Frehlich (1996, Table 1, p. 656) for small (dashed lines) and large (solid lines) accumulation factors. The signal parameters are $\Omega = 1$ and $M = 100$. The semiempirical model agrees well with the scaling laws.

FIG. 11. Minimum single-shot photocount $\Phi_1^K$ required to achieve a probability of bad measurement lower than K% as a function of the accumulation factor. Circles and triangles (for $K = 10$ and 50, respectively) are drawn from the semiempirical model with the signal characteristics $\Omega = 1$ and $M = 100$. The solid and dashed lines show Frehlich's scaling laws (Frehlich 1996, Table 1) for small and large accumulation factors, respectively.

In Fig. 12, the photocount threshold $\Phi_1^K$ is plotted versus $\Omega$, ranging from 0.2 to 100. As in Fig. 11, circles and triangles show the results drawn from the semiempirical model ($K = 10$ for the circles and $K = 50$ for the triangles). The results are compared to the scaling laws proposed by Frehlich (1996, Table 2, p. 657) for $M = 500$ and $n = 1000$. Noticeable discrepancies appear first for $K = 50$ and $\Omega > 20$. The results confirm rather than contradict the model's quality; note that Fig. 19 in Frehlich's paper reveals that the true photocount threshold (shown by triangles) is below the scaling law, which is just what the semiempirical model predicts. The semiempirical model does, however, make small but significant errors at narrow spectrum widths (small $\Omega$). But the errors amount to less than 0.5 dB of SNR, so the ability of the semiempirical model to provide useful information on SNRs (or photocounts) required for given probabilities of bad estimate is confirmed.

FIG. 12. Same as Fig. 11 except $\Phi_1^K$ is drawn as a function of $\Omega$. The solid lines are from Frehlich (1996, Table 2). The processing range gate contains $M = 500$ samples, and the number of accumulations is 1000.

4. Conclusions

Starting from a simplified model of signal periodograms and conceptual considerations on the occurrence of bad estimates made with Levin's adaptive matched filter, we have derived a semiempirical model giving the probability of bad estimate (or conversely the reliability) of lidar measurements as a function of the main signal characteristics $\Phi$ and $\Omega$, the length $M$ of the processing gate, and the number of accumulations $n$. It is tuned by a single coefficient $\beta$ that we have empirically related to the ratio of the number of samples $M$ to the number of speckles $\mu$. Since $\mu$ can be considered a system parameter related to the transmitted laser pulse, the model can be used to predict the reliability of lidar measurements without a priori knowledge of the dynamics of sounded atmospheric volumes.

The model has been validated for a full range of characteristics ($M$, $\Omega$, $n$) that I believe are representative of most lidar applications. The main limitation that was found concerns small spectrum widths ($\Omega = 0.2$) when there is no accumulation. Then, predicted probabilities of bad estimates are slightly greater than real ones.

Although this study is based on Levin's estimator, it offers a good sense of what frequency estimators can do for lidar applications because, as shown by Frehlich and Yadlowsky (1994), Levin's estimator achieves nearly optimal performance.

In this study, as far as accumulation is concerned, I have considered only signals with the same frequency content. In reality, accumulated signals result from the sounding of close but different atmospheric volumes and would therefore have variable frequencies. The impact of the natural variability of the wind on the quality of the measurement is still an open question.

Acknowledgments. The author acknowledges useful discussions with P. H. Flamant and Ph. Drobinski from CNRS. This work was supported by the European Space Agency.
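The qualitative effect of accumulation on the probability of bad estimates can be illustrated with a small Monte Carlo sketch. It is not the semiempirical model of this paper, and it does not implement Levin's matched filter: it assumes a plain peak-picking estimator, a Gaussian-shaped signal spectrum on a unit white-noise floor, and independent exponentially distributed periodogram bins. The function name `bad_estimate_rate` and all of its parameters and default values are hypothetical and chosen for illustration only.

```python
import numpy as np


def bad_estimate_rate(snr_per_bin, n_accum, m_bins=64, width_bins=2.0,
                      trials=2000, seed=0):
    """Fraction of trials in which a peak-picking estimator lands far
    from the true frequency bin (a "bad" estimate).

    Assumptions (illustrative, not taken from the paper):
    - the mean spectrum is a unit noise floor plus a Gaussian signal peak;
    - each periodogram bin is an independent exponential variable;
    - accumulation is the average of n_accum independent periodograms;
    - an estimate is "bad" if it is more than 3 spectral widths away
      from the true peak.
    """
    rng = np.random.default_rng(seed)
    k = np.arange(m_bins)
    k0 = m_bins // 2  # true frequency bin
    mean_spec = 1.0 + snr_per_bin * np.exp(-0.5 * ((k - k0) / width_bins) ** 2)

    # Draw trials x n_accum periodograms and average over the accumulation axis.
    pgrams = rng.exponential(mean_spec, size=(trials, n_accum, m_bins))
    accumulated = pgrams.mean(axis=1)

    # Peak-picking frequency estimate and bad-estimate flag.
    peak = accumulated.argmax(axis=1)
    bad = np.abs(peak - k0) > 3 * width_bins
    return float(bad.mean())


if __name__ == "__main__":
    for n in (1, 5, 20):
        rate = bad_estimate_rate(snr_per_bin=2.0, n_accum=n)
        print(f"n = {n:3d}: fraction of bad estimates = {rate:.3f}")
```

Under these assumptions the fraction of bad estimates falls as the number of accumulated periodograms grows, which is the behaviour the photocount thresholds of Fig. 11 quantify for the actual estimator.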


REFERENCES

Ancellet, G. M., R. T. Menzies, and W. B. Grant, 1989: Atmospheric velocity spectral width measurements using the statistical distribution of pulsed CO2 lidar return intensities. J. Atmos. Oceanic Technol., 6, 50–58.

Baker, W. E., and Coauthors, 1995: Lidar-measured winds from space: A key component for weather and climate prediction. Bull. Amer. Meteor. Soc., 76, 869–888.

Churnside, J. H., and H. T. Yura, 1983: Speckle statistics of atmospherically backscattered laser light. Appl. Opt., 22, 2559–2565.

ESA, 1996: Reports for assessment: The nine candidate Earth Explorer missions: Atmospheric Dynamics Mission. ESA SP-1196(4), 70 pp. [Available from ESA Publications Division, P.O. Box 299, 2200 AG, Noordwijk, the Netherlands.]

Frehlich, R., 1996: Simulation of coherent Doppler lidar performance in the weak-signal regime. J. Atmos. Oceanic Technol., 13, 646–658.

Frehlich, R., and M. J. Yadlowsky, 1994: Performance of mean-frequency estimators for Doppler radar and lidar. J. Atmos. Oceanic Technol., 11, 1217–1230.

Frehlich, R., S. M. Hannon, and S. W. Henderson, 1994: Performance of a 2-μm coherent Doppler lidar for wind measurements. J. Atmos. Oceanic Technol., 11, 1517–1527.

Frehlich, R., S. M. Hannon, and S. W. Henderson, 1997: Coherent Doppler lidar measurements of winds in the weak signal regime. Appl. Opt., 36, 3491–3499.

Goodman, J. W., 1985: Statistical Optics. Wiley, 550 pp.

Huffaker, R. M., and R. M. Hardesty, 1996: Remote sensing of atmospheric wind velocities using solid-state and CO2 coherent laser systems. Proc. IEEE, 84, 181–204.

Kavaya, M. J., 1995: Design of a Doppler lidar for global profiling of tropospheric winds. Coherent Laser Radar, 1995 Tech. Dig. Ser., 19, 50–53.

Levin, J. M., 1965: Power spectrum parameter estimation. IEEE Trans. Inf. Theory, 11, 100–107.

Marini, A., and A. Culoma, 1995: ESA progress in spaceborne Doppler wind lidar activities. Coherent Laser Radar, 1995 Tech. Dig. Ser., 19, 29–29.

Rye, B. J., 1995: Return power estimation for targets spread in range. Proc. 8th Int. Conf. on Coherent Laser Radar, Keystone, CO, OSA, 202–205.

Rye, B. J., and R. M. Hardesty, 1993a: Discrete spectral peak estimation in incoherent backscatter heterodyne lidar. I. Spectral accumulation and the Cramer-Rao lower bound. IEEE Trans. Geosci. Remote Sens., 31, 16–27.

Rye, B. J., and R. M. Hardesty, 1993b: Discrete spectral peak estimation in incoherent backscatter heterodyne lidar. II. Correlogram accumulation. IEEE Trans. Geosci. Remote Sens., 31, 28–35.

Rye, B. J., and R. M. Hardesty, 1997: Detection techniques for validating Doppler estimates in heterodyne lidar. Appl. Opt., 36, 1940–1951.

Willets, D. V., and M. R. Harris, 1982: An investigation into the origin of frequency sweeping in a hybrid TEA CO2 laser. J. Phys. D, 15, 51–67.

Zrnic, D. S., 1980: Spectral statistics for complex colored discrete-time sequence. IEEE Trans. Acoust., Speech, Signal Process., 28, 596–599.
