
Regularized Covariance Matrix Estimation in Complex Elliptically Symmetric Distributions Using the Expected Likelihood Approach—Part 2: The Under-Sampled Case Olivier Besson, Senior Member, IEEE, and Yuri I. Abramovich, Fellow, IEEE

Abstract—In the first part of these two papers, we extended the expected likelihood (EL) approach, originally developed in the Gaussian case, to the broader class of complex elliptically symmetric (CES) distributions and to complex angular central Gaussian (ACG) distributions. More precisely, we demonstrated that the probability density function (p.d.f.) of the likelihood ratio (LR) for the (unknown) actual scatter matrix does not depend on the latter: it only depends on the density generator for CES distributions, and it is distribution-free in the case of ACG distributed data, i.e., it only depends on the matrix dimension M and the number of independent training samples T, assuming that T ≥ M. Additionally, regularized scatter matrix estimates based on the EL methodology were derived. In this second part, we consider the under-sampled scenario T < M, which deserves specific treatment since conventional maximum likelihood estimates do not exist. Indeed, inference about the scatter matrix can only be made in the T-dimensional subspace spanned by the columns of the data matrix. We extend the results derived under the Gaussian assumption to the CES and ACG classes of distributions. Invariance properties of the under-sampled likelihood ratio evaluated at the true scatter matrix are presented. Remarkably enough, in the ACG case, the p.d.f. of this LR can be written in a rather simple form as a product of beta distributed random variables. The regularized schemes derived in the first part, based on the EL principle, are extended to the under-sampled scenario and assessed through numerical simulations.

Index Terms—Covariance matrix estimation, elliptically symmetric distributions, expected likelihood, likelihood ratio, regularization.

O. Besson is with the University of Toulouse, ISAE, Department Electronics Optronics Signal, 31055 Toulouse, France (e-mail: [email protected]). Y. I. Abramovich is with W. R. Systems, Ltd., Fairfax, VA 22030 USA (e-mail: [email protected]).
I. INTRODUCTION

The Gaussian assumption has historically been the dominating framework for adaptive detection problems, partly because of the richness of statistical tools available to derive detection/estimation schemes and to assess their performance in finite sample problems. The most famous examples include the celebrated Reed, Mallett and Brennan rule for characterization of the signal to noise ratio loss of adaptive filters [1] or, for detection problems, the now classic papers by Kelly [2]–[4] about the generalized likelihood ratio test (GLRT) in unknown Gaussian noise, or the adaptive subspace detectors of [5]–[7] in partially homogeneous noise environments. All of them benefit highly from the beautiful and rich theory of multivariate Gaussian distributions and Wishart matrices [8]–[10] and have served as references for decades. At the core of adaptive filtering or adaptive detection is the problem of estimating the disturbance covariance matrix. It is usually addressed through the maximum likelihood (ML) principle, mainly because ML estimates have the desirable property of being asymptotically efficient [11], [12]. However, in low sample support their performance may degrade, and they can be significantly improved upon using regularized covariance matrix estimates (CME), such as diagonal loading [13], [14]. Moreover, the ML estimate results in the ultimate likelihood ratio (LR) equal to one, a property that is questionable, as argued in [15]–[17]. In the latter references, it is proved that the LR, evaluated at the true covariance matrix, has a probability density function (p.d.f.) that does not depend on this matrix but only on the sample volume T and the dimension M of the observation space, i.e., the number of antennas or pulses. More importantly, with high probability the LR takes values much lower than one and, therefore, one may wonder whether an estimate whose LR significantly exceeds that of the true covariance matrix is reasonable. Based on these results, the expected likelihood (EL) principle was developed in [15]–[17] with successful application to adaptive detection and direction of arrival (DoA) estimation. In the former case, regularized estimation schemes were investigated with a view to drive the LR down to values that are compliant with those for the true underlying covariance matrix. As for DoA estimation, the EL approach was instrumental in identifying severely erroneous MUSIC DoA estimates (breakdown prediction) and rectifying the set of these estimates to meet the expected likelihood ratio values (breakdown cure) [15], [16].

However, in a number of applications the Gaussian assumption may be violated, and detection/estimation schemes based on this assumption may suffer from a certain lack of robustness, resulting in significant performance degradation. Therefore, many studies have focused on more accurate radar data modeling along with corresponding detection/estimation schemes. In this respect, the class of compound-Gaussian models, see e.g., [18]–[20], has been extensively studied. The radar return is here modeled as the product of a positive valued random variable (r.va.) called texture and an independent complex Gaussian random vector (r.v.) called speckle, and is referred to as a spherically invariant random vector (SIRV). Since exact knowledge of the p.d.f. of the texture is seldom available, the usual way is to treat the textures as unknown deterministic quantities and to carry out ML estimation of the speckle covariance matrix [21]–[24]. This approach results in an implicit equation which is solved through an iterative procedure. SIRVs belong to a broader class, namely complex elliptically symmetric (CES) distributions [25], [26], which have recently been studied for array processing applications, see [27] and references therein. A CES distributed r.v. has a stochastic representation involving the scatter matrix, a positive r.va. called the modular variate, and an independent complex random vector which is uniformly distributed on the complex M-sphere. In most practical situations the p.d.f. of the modular variate is not known, and therefore there is an interest in estimating the scatter matrix irrespective of it. A mechanism to achieve this goal is to normalize each observation by its norm; the p.d.f. of the normalized vector is described by the complex angular central Gaussian (ACG) distribution and is specified by the scatter matrix only. There is thus a growing interest in deriving scatter matrix estimates (SME) within the framework of CES or ACG distributions, see the comprehensive reviews by Esa Ollila et al. in [27] and Ami Wiesel in [28]. In the first part [29] of this series of papers, we addressed this problem using the EL approach. More precisely, we extended the EL principle from the Gaussian framework to the CES and ACG distributions, and proved invariance properties of the LR for the true scatter matrix. Only the over-sampled scenario T ≥ M was considered in [29]. However, in some applications the number of antenna elements M exceeds the number of i.i.d. training samples T, and therefore the under-sampled scenario T < M is of utmost importance. This case deserves a special treatment, as MLEs no longer exist and inference about the scatter matrix is possible only in the T-dimensional subspace spanned by the columns of the data matrix [30]. The goal of this paper is thus to extend the results of [30], which deals with Gaussian data, to CES and ACG distributed data, and to complement [29] by considering T < M. Accordingly, the regularized estimation schemes developed in [29] will be adapted to this new case. As we hinted at above, CES distributions rely on knowledge of the p.d.f. of the modular variate while ACG distributions do not. Therefore, in the sequel, we will concentrate on the ACG case.

More precisely, in Section II we derive the LR for ACG distributions in the under-sampled case. We demonstrate its invariance properties and show that, for T = M, it coincides with the over-sampled LR of [29]. The case of CES distributions is addressed in the Appendix. In Section III we briefly review the regularized estimates of [29] and indicate how their regularization parameters are chosen in the under-sampled case. Numerical simulations are presented in Section IV and our conclusions are drawn in Section V.

II. LIKELIHOOD RATIO FOR COMPLEX ACG DISTRIBUTIONS IN THE UNDER-SAMPLED CASE

As said previously, the likelihood ratio (and its p.d.f. when evaluated at the true (covariance) scatter matrix) is the fundamental quantity for the EL approach. In this section, we derive this likelihood ratio for under-sampled training conditions (T < M) in the case of complex ACG distributed data. For Gaussian distributed data, the under-sampled scenario has been studied in [30], [31], where the EL approach was used to detect outliers produced by MUSIC DoA estimation, and in [32], [33] for adaptive detection using regularized covariance matrix estimates. As explained in [30], this scenario requires a specific analysis since (unstructured) maximum likelihood estimates no longer exist, and information about the covariance matrix can be retrieved only in the T-dimensional subspace spanned by the data matrix. Moreover, in deriving an under-sampled likelihood ratio, some requirements are in force. Of course, it should lie in the interval [0, 1], and maximization of the likelihood ratio should be associated with maximization of the likelihood, at least over a restricted set. Additionally, the p.d.f. of this LR, when evaluated at the true covariance matrix, should depend only on M and T, so as to implement an EL approach. Finally, when T = M, it should coincide with its over-sampled counterpart. In the sequel, we build upon the theory developed in [30] and extend it to the case of ACG distributions.

A vector z is said to have a complex angular central Gaussian (ACG) distribution if it can be written as z = x/‖x‖, where x follows a complex central Gaussian distribution with scatter matrix Σ. For non-singular Σ, the p.d.f. of z is given by [27], [34], [35]

f(z) \propto |\Sigma|^{-1} (z^{H} \Sigma^{-1} z)^{-M}   (1)

where \propto means "proportional to". In fact, for any vector x which follows a central CES distribution with scatter matrix Σ and an arbitrary density generator, the p.d.f. of z = x/‖x‖ is still given by (1), and therefore (1) is the density for a large class of scaled vectors. Note that Σ in (1) is identifiable only up to a scaling factor and can be seen as a shape matrix. Let us assume that we have a set of T i.i.d. samples z_1, ..., z_T drawn from the p.d.f. in (1). Then the joint distribution of Z = [z_1, ..., z_T] can be written as

f(Z) \propto |\Sigma|^{-T} \prod_{t=1}^{T} (z_t^{H} \Sigma^{-1} z_t)^{-M}   (2)

Let us then consider the likelihood ratio for testing a parametric scatter (or shape) matrix model, where a set of parameters uniquely specifies the scatter matrix model. In [29], we derived the LR for over-sampled training conditions (T ≥ M) and showed that

(3)

where the maximum likelihood estimate of Σ is the unique (up to a scaling factor) solution [36] to Tyler's fixed-point equation

\hat{\Sigma} = \frac{M}{T} \sum_{t=1}^{T} \frac{z_t z_t^{H}}{z_t^{H} \hat{\Sigma}^{-1} z_t}   (4)
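As a concrete illustration of the ACG model and of the fixed-point equation (4), the following Python sketch (ours, not the authors' code) draws unit-norm ACG snapshots and runs Tyler's iteration in the over-sampled regime T ≥ M; the tolerance, trace normalization and AR(1)-type test matrix are illustrative choices of this sketch.

```python
import numpy as np

def tyler_fixed_point(Z, n_iter=100, tol=1e-8):
    """Tyler's fixed-point scatter matrix estimate, cf. (4).

    Z is an (M, T) matrix of unit-norm snapshots with T >= M (over-sampled case).
    The estimate is normalized to trace M, since the scatter matrix is
    identifiable only up to a scale factor.
    """
    M, T = Z.shape
    Sigma = np.eye(M, dtype=complex)
    for _ in range(n_iter):
        # quadratic forms z_t^H Sigma^{-1} z_t, one per snapshot
        q = np.real(np.einsum('it,it->t', Z.conj(), np.linalg.solve(Sigma, Z)))
        Sigma_new = (M / T) * (Z / q) @ Z.conj().T
        Sigma_new *= M / np.trace(Sigma_new).real
        if np.linalg.norm(Sigma_new - Sigma, 'fro') < tol:
            return Sigma_new
        Sigma = Sigma_new
    return Sigma

# Toy example: M = 8, T = 32 ACG snapshots drawn from an AR(1)-type scatter matrix.
rng = np.random.default_rng(0)
M, T, rho = 8, 32, 0.9
Sigma0 = rho ** np.abs(np.subtract.outer(np.arange(M), np.arange(M)))
L = np.linalg.cholesky(Sigma0)
X = L @ (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
Z = X / np.linalg.norm(X, axis=0)   # ACG samples: Gaussian snapshots scaled to unit norm
Sigma_hat = tyler_fixed_point(Z)
```

In the under-sampled case T < M this plain iteration is no longer usable (the sample matrix is rank deficient and no unrestricted MLE exists), which is precisely what motivates the developments below and the regularized estimates of Section III.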

Let us now turn to the under-sampled scenario with T < M. Obviously, with T < M training samples, any inference regarding the scatter matrix Σ may be provided only regarding its projection onto the T-dimensional subspace spanned by the columns of the data matrix Z, or equivalently by the columns of the M × T matrix U_1 of eigenvectors associated with the non-zero eigenvalues of the sample matrix, where Λ_1 stands for the diagonal matrix of these non-zero eigenvalues. As already noted, whether T ≥ M or T < M, we still have the normalized vectors z_t = x_t/‖x_t‖. Therefore, without loss of generality, we may consider the vectors z_t as being generated by complex Gaussian random vectors x_t. For any given candidate Σ, we need to find the full rank T × T Hermitian matrix B (strictly speaking, the matrices below should carry a subscript Σ to emphasize that they are constructed from Σ, but for ease of notation we drop this index), such that the construct R̂_Σ = U_1 B U_1^H is "closest" to the model Σ. In [30] it was demonstrated that B may be specified by the condition that the generalized non-zero eigenvalues of the matrix pencil formed by R̂_Σ and Σ are all equal to one. Since R̂_Σ = U_1 B U_1^H, these generalized eigenvalues coincide with the non-zero eigenvalues of a T-variate Hermitian matrix, which immediately leads to the solution [30]

(5)

and

(6)

Note that for any (arbitrary) matrix Σ, we might construct the corresponding R̂_Σ and B: the latter gathers what can be inferred of Σ from the observation of T snapshots. It is important to note that, for the given generating set of i.i.d. Gaussian data x_t, t = 1, ..., T, the scatter matrix R̂_Σ may be treated as an admissible singular covariance matrix model.

At this stage, we need to define ACG distributions with singular covariance matrices, and we will follow the lines of Siotani et al. [37], who considered singular Gaussian distributions. Let x be Gaussian distributed with a rank-deficient covariance matrix R = U_1 B U_1^H, where U_1 is an M × T orthonormal matrix whose columns span the range space of R and B is a T × T positive definite Hermitian (PDH) matrix. Note that x fully resides in the subspace spanned by U_1 with probability one [10], [37]. Let U_2 denote an orthonormal basis for the complement of U_1, i.e., U_2^H U_1 = 0 and U_2^H U_2 = I, let U = [U_1 U_2], and let us make the change of variables y = U^H x. Then the component of y along U_2 is zero with probability one, while the component along U_1 has a non-singular complex Gaussian p.d.f. (in which etr{·} stands for the exponential of the trace of the matrix between braces). Since the Jacobian from x to y is 1, [37] defines a singular Gaussian density as

(7)

for those vectors x such that U_2^H x = 0. Let us now consider z = x/‖x‖. Then the corresponding transformed vector follows an ACG distribution. Following Siotani et al., one can thus define a singular ACG density as

(8)

for z such that U_2^H z = 0 and ‖z‖ = 1. A set of T independent snapshots can thus be represented as Z = [z_1, ..., z_T], with density

(9)

Let us assume that U_1 is known. Differentiating (9) with respect to B (for fixed U_1), it follows that the MLE of B satisfies, see also (4),

(10)

Furthermore, let us consider the specific case where the rank of R equals the number of available observations, i.e., T. Assuming that the matrix U_1^H Z is non-singular, one has

(11)

Indeed, in this case, one has

(12)

which proves that the estimate in (11) verifies (10), and hence is the MLE in this case. This observation is of utmost importance when we consider the under-sampled case. Indeed, for our specific application with R̂_Σ in (6) being an admissible singular covariance matrix, we get

(13)

The previous equation provides the likelihood function for the parameterized scatter matrix. In order to obtain the likelihood ratio in under-sampled conditions T < M, we need to find the global ML maximum of this likelihood over the PDH matrix B. As proved in (11), this MLE is simply

(14)

where Λ_1 is the diagonal matrix of the non-zero eigenvalues of the sample matrix. Therefore, for an under-sampled scenario, we may use the under-sampled likelihood ratio, which can be written in the following equivalent forms:

(15)

It is noteworthy that this LR is invariant to scaling of Σ.

Let us now investigate the properties of this under-sampled likelihood ratio. Let us first prove that, for T = M, the under-sampled LR (15) coincides with its over-sampled counterpart in (3). To do so, one needs to derive an expression for Tyler's MLE in (4). In fact, using derivations similar to those which led to (10), one can show that, for T = M, Tyler's MLE can be expressed as

(16)

Reporting this value in (3) yields, for T = M,

(17)

which coincides with the under-sampled LR in (15) when T = M.

Let us now prove that, similarly to the over-sampled case, for the true scatter matrix the p.d.f. of this LR does not depend on it. Observing that the generating Gaussian vectors can be written as the true scatter matrix square root times i.i.d. standard complex Gaussian vectors, and using the scale invariance of the LR, it ensues that

(18)

which is obviously distribution-free. More insights into the distribution of the LR can be gained by noticing that the corresponding T × T sample matrix of these standard Gaussian vectors has a complex Wishart distribution with T degrees of freedom. Let us consider its Bartlett decomposition, in which the factor is an upper-triangular matrix whose random entries are all independent [8]: its squared diagonal elements follow complex central chi-square distributions, while its off-diagonal elements are standard complex Gaussian. It then ensues that

(19)

where β(·,·) stands for the beta distribution. The representation (19) makes it very simple to estimate the distribution of the LR. Additionally, the average value of the LR can be obtained in a straightforward manner as

(20)

This average value (or the median value) can serve as a target value for the likelihood ratio associated with any scatter matrix estimate.

To summarize, for under-sampled (T < M) training conditions and ACG data, we introduced the likelihood ratio in (15), which for the true scatter matrix is described by a scenario-invariant p.d.f. fully specified by the parameters M and T. While an analytical expression for this p.d.f. is not available, it can be pre-calculated for given M and T by Monte-Carlo simulations, using either simulated i.i.d. Gaussian random vectors, cf. (18), or beta distributed random variables, cf. (19). In the Appendix, we derive the under-sampled likelihood ratio for CES distributed samples. We show that, when evaluated at the true scatter matrix, its p.d.f. does not depend on the latter but still depends on the density generator, similarly to what was observed in the over-sampled case [29].
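The Monte-Carlo tabulation mentioned above is straightforward to implement. The exact beta shape parameters of (19) are not reproduced in this version of the text, so the pairs used below are hypothetical placeholders; the sketch only illustrates how the scenario-invariant p.d.f. and its median can be pre-computed once for given (M, T).

```python
import numpy as np

def lr_median_via_betas(beta_params, n_mc=100_000, rng=None):
    """Monte-Carlo tabulation of a product-of-betas representation, cf. (19)-(20).

    beta_params is a list of (a_k, b_k) shape pairs for the independent beta
    factors of the product; the pairs passed in the example below are
    hypothetical placeholders, not the ones derived in the paper.
    """
    rng = np.random.default_rng(rng)
    lr = np.ones(n_mc)
    for a, b in beta_params:
        lr *= rng.beta(a, b, size=n_mc)
    return np.median(lr), lr

# Example with placeholder shape pairs, for illustration only.
M, T = 16, 8
params = [(T - k, M - T + k) for k in range(1, T)]   # hypothetical choice
median_lr, samples = lr_median_via_betas(params)
print(f"median of the (placeholder) LR product: {median_lr:.3e}")
```

Once the median (or any other quantile) has been tabulated in this way, it is used as the target value in the EL selection rules of Section III.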

III. REGULARIZED SCATTER MATRIX ESTIMATION USING THE EXPECTED LIKELIHOOD APPROACH

For the sake of clarity, we here briefly review the regularized scatter matrix estimates (SME) which were introduced and studied in Part 1 for T ≥ M. More precisely, we focus on the schemes which were shown to achieve the best performance. The first estimate is the conventional diagonal loading estimate

(21)

We also consider the fixed-point diagonally loaded estimator [28], [38], [39], which is obtained from the following iterative algorithm:

(22a)

(22b)

We refer to it as FP-DL in what follows. Both estimates are governed by the loading factor, which is chosen according to the EL principle, i.e.,

(23)

where the scenario-invariant p.d.f. of the T-th root of the LR in (19) is used, med(·) stands for its median value, and the LR is the under-sampled LR of (15):

(24)

In other words, the loading factor is such that the LR of the regularized estimate is closest to the median value of the LR at the true scatter matrix. For comparison purposes, we will consider the Oracle estimator of [39], defined through the following choice of the loading factor:

(25)

We will also consider regularized TVAR(m) estimates, namely the Dym-Gohberg regularization of (21),

(26)

(27)

where the operator in (27) is the Dym-Gohberg band-inverse transformation of a Hermitian non-negative definite matrix, defined as [40]

(28a)

(28b)

Accordingly, we investigate the fixed-point diagonally loaded TVAR(m) estimate [29], defined as (a formal proof of convergence of this iterative scheme is still an open issue)

(29a)

(29b)

which will be referred to as FP-DG-DL in the sequel. For those (fixed-point) diagonally loaded TVAR(m) estimates, the value of the loading factor is also selected according to the EL principle, i.e.,

(30)
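The exact updates (22) and the rules (23)-(24) are not reproduced in this version of the text, so the following Python sketch should be read as an illustration only: the iteration below is one common form of the diagonally loaded Tyler fixed point in the spirit of [38], [39], and the EL selection is shown as a grid search against a user-supplied LR function (assumed to implement the under-sampled LR (15)) and a pre-tabulated target (median) value.

```python
import numpy as np

def fp_dl(Z, beta, n_iter=50):
    """Fixed-point diagonally loaded (FP-DL) scatter estimate (illustrative form).

    Z is (M, T); beta in (0, 1] is the loading factor.  The parameterization of
    the paper's update (22) may differ from the convex combination used here.
    """
    M, T = Z.shape
    Sigma = np.eye(M, dtype=complex)
    for _ in range(n_iter):
        q = np.real(np.einsum('it,it->t', Z.conj(), np.linalg.solve(Sigma, Z)))
        S = (M / T) * (Z / q) @ Z.conj().T
        Sigma = (1.0 - beta) * S + beta * np.eye(M)
        Sigma *= M / np.trace(Sigma).real   # scale is irrelevant for a shape matrix
    return Sigma

def select_beta_el(Z, lr_fn, lr_target, betas=np.linspace(0.01, 0.99, 50)):
    """EL-type selection of the loading factor, cf. (23)-(24): pick the value
    whose under-sampled LR is closest to the pre-computed target (median).

    lr_fn(Sigma, Z) is assumed to implement the under-sampled LR of (15), which
    is not reproduced here; lr_target is its tabulated median for the given (M, T).
    """
    best = min(betas, key=lambda b: abs(lr_fn(fp_dl(Z, b), Z) - lr_target))
    return best, fp_dl(Z, best)
```

The same grid-search logic applies to the FP-DG-DL estimate of (29)-(30), with the Dym-Gohberg transformation inserted inside the iteration.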

IV. NUMERICAL SIMULATIONS

Similarly to [29], we consider the case of data distributed according to a multivariate Student t-distribution:

(31)

In all simulations below, the number of degrees of freedom is kept fixed. We consider a ULA with M elements, and the true scatter matrix is that of an AR(1) process. The SNR loss factor

(32)

will serve as the figure of merit for quality assessment of the SME; in (32), the estimate is a generic SME and the steering vector corresponds to the looked direction.

We first examine the distribution of the under-sampled LR. Fig. 1 displays its median value versus T; we also plot in this figure the corresponding value in the over-sampled case. This figure confirms that, for T = M, the under-sampled and over-sampled median values coincide. As can be observed, the median value first decreases when T increases, achieves a minimum, and then increases when T increases further. Figs. 2 and 3 display the p.d.f. of the LR for two values of T. As can be seen, the LR can take very small values and, as T increases, the support of this p.d.f. becomes smaller.

Fig. 1. Median value of the under-sampled LR versus T.
Fig. 2. Probability density function of the LR.
Fig. 3. Probability density function of the LR.

Our second simulation deals with the influence of the loading factor on the SNR loss as well as on the LR, see Fig. 4. As can be observed, the diagonally loaded estimates are not very sensitive to variations in the loading factor, at least as far as the SNR loss is concerned. Their LR, however, is seen to vary. In contrast, TVAR(m) estimates (especially DG-DL) have a SNR loss which exhibits large variations when the loading factor is varied: the latter should be chosen rather small in order to obtain a good SNR loss. One can also observe a correlation between SNR loss and LR: when the loading factor increases, both of them decrease. Whatever the estimate, it appears that choosing the loading factor according to the EL principle (23)–(30) results in negligible SNR losses, although for the diagonally loaded estimates the LR could be quite far from the median without penalizing the SNR too much.

Fig. 4. Performance of diagonally loaded estimates versus the loading factor. (a) SNR loss. (b) Mean value of the LR.

Fig. 5 displays the SNR loss versus the number of snapshots. The average value of the loading factor selected by the EL principle is also plotted, as is the average value of the LR for the Oracle estimator. A few remarks are in order here. First, it can be seen that the LR for the Oracle estimator is close to, but slightly different from, the median value: at least, it is not as close as in the over-sampled case. More important is the fact that FP-DL with the EL principle for choosing the loading factor outperforms the Oracle estimator: this is due to the fact that EL selects a higher loading level in order to obtain a lower LR. This is a quite remarkable result, which shows that minimization of the MSE between the estimate and the true scatter matrix does not result in the highest SNR in low sample support. As a second observation, notice that the fixed-point diagonally loaded TVAR(1) estimate provides the highest SNR, which was also observed in the over-sampled case.

Fig. 5. Performance of regularized estimates versus the number of snapshots. (a) SNR loss. (b) Mean value of the LR. (c) Mean value of the loading factor.
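To make the simulation setup concrete, the following Python sketch shows one way to generate multivariate Student-t (compound-Gaussian) training data with an AR(1)-type scatter matrix and to evaluate an SNR loss figure of merit. The dimensions, degrees of freedom, steering direction and loading level are illustrative values of ours, not the paper's, and the SNR loss formula used is the standard one for an adaptive filter of the form w = Σ̂⁻¹s, which may differ from (32) only in its normalization.

```python
import numpy as np

def ar1_scatter(M, rho):
    """AR(1)-type Toeplitz scatter matrix, Sigma[i, j] = rho**|i-j| (real rho assumed)."""
    idx = np.arange(M)
    return rho ** np.abs(np.subtract.outer(idx, idx))

def student_t_samples(Sigma, T, d, rng):
    """Multivariate Student-t snapshots via the compound-Gaussian representation:
    Gaussian speckle divided by an independent chi-square based texture (d d.o.f.)."""
    M = Sigma.shape[0]
    L = np.linalg.cholesky(Sigma)
    G = L @ (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
    tau = rng.chisquare(d, size=T) / d
    return G / np.sqrt(tau)

def snr_loss(Sigma_hat, Sigma0, s):
    """Standard SNR loss of the adaptive filter w = Sigma_hat^{-1} s."""
    w = np.linalg.solve(Sigma_hat, s)
    num = np.abs(w.conj() @ s) ** 2
    den = np.real(w.conj() @ Sigma0 @ w) * np.real(s.conj() @ np.linalg.solve(Sigma0, s))
    return np.real(num / den)

# Illustrative values (not the paper's): M = 16 antennas, T = 8 snapshots, 3 d.o.f.
rng = np.random.default_rng(1)
M, T, d, rho = 16, 8, 3, 0.9
Sigma0 = ar1_scatter(M, rho)
X = student_t_samples(Sigma0, T, d, rng)
Z = X / np.linalg.norm(X, axis=0)                        # normalized (ACG) snapshots
s = np.exp(1j * np.pi * np.arange(M) * np.sin(0.0))      # ULA steering vector, broadside
loaded = (M / T) * (Z @ Z.conj().T) + 0.1 * np.eye(M)    # simple diagonally loaded SME, for illustration
print(f"SNR loss of the illustrative estimate: {10*np.log10(snr_loss(loaded, Sigma0, s)):.2f} dB")
```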

Similarly to Part 1, we now consider estimation of both the loading factor and the model order m for the TVAR(m) estimates. We use the same procedure as in [29]. For fixed m, we follow the rule in (30) to select the loading factor. Then, we estimate m as the minimal order for which the LR is above a threshold:

(33)

where the threshold is a given quantile of the scenario-invariant distribution of the LR at the true scatter matrix, so that the selected order necessarily lies in a finite interval. If none of the orders yields a LR which exceeds the threshold, then we select the model order which results in the LR closest to the median. As in Part 1, we still consider the case of an AR(1) scatter matrix, and we also consider a case where the elements of the scatter matrix correspond to the correlation lags of an ARMA process whose spectrum (correlation) is close to, but different from, that of the AR(1) process. The SNR loss and the average LR for the FP-DL and FP-DG-DL estimates are displayed in Fig. 6 for the AR(1) case and in Fig. 7 for the ARMA case. In these figures, the two solid black lines represent the threshold and the median.

Fig. 6. Performance of fixed-point diagonally loaded estimates versus the number of snapshots in the AR(1) case. (a) SNR loss. (b) Mean value of the LR.
Fig. 7. Performance of fixed-point diagonally loaded estimates versus the number of snapshots in the ARMA case. (a) SNR loss. (b) Mean value of the LR.

First, it is noteworthy that in the AR(1) case, the EL principle selects, in the vast majority of cases, the order corresponding to the true model order. However, in contrast to the over-sampled case, this may not be the best choice, as higher orders result in better SNR at the price of a lower LR. For instance, the order that seems to yield the highest SNR has a corresponding LR below the threshold. Next, note that FP-DG-DL outperforms FP-DL, which is reasonable since the true scatter matrix belongs to the class of TVAR(m) matrices. The ARMA case yields different results. As noted in [29], FP-DL is now better than the fixed-point diagonally loaded TVAR(m) estimates: the latter have lower SNR and LR values which are below the threshold, yielding matrices that are not admissible. These two simulations confirm that FP-DL is a ubiquitous estimate which can accommodate various types of scatter matrices.

V. CONCLUSIONS

In this paper, we extended the EL approach of [30] to the class of CES and ACG distributions in the under-sampled case, where the number of samples T is less than the dimension M of the observation space. Together with the over-sampled case treated in Part 1 [29], this offers a general methodology for regularized scatter matrix estimation for a large and practically important class of distributions. We demonstrated that the LR evaluated at the true scatter matrix still enjoys the same type of invariance properties that were found in the Gaussian case. This invariance makes it possible to assess the quality of any scatter matrix estimate, and provides a useful tool to tune the regularization parameters of regularized SME. This was demonstrated in the case of fixed-point diagonally loaded estimates, where the Oracle estimator was shown to achieve a LR very close to the median value of the LR at the true scatter matrix, which also corresponds to the target LR of the EL-based estimate. Accordingly, we developed regularized estimation schemes based on TVAR(m) modeling and investigated their use in conjunction with diagonal loading. For this shrinkage-to-structure methodology, the EL approach was also efficient in providing estimates of both the model order and the loading factor that yield SNR values very close to those of the optimal (clairvoyant) filter. The framework and methodology of this two-part paper have been demonstrated for adaptive filtering, but they can also serve as a useful framework for other problems that call for fitting a parametrically-controlled covariance or scatter matrix to under-sampled data.

APPENDIX
LIKELIHOOD RATIO FOR CES DISTRIBUTIONS IN THE UNDER-SAMPLED CASE

In this appendix, we derive the likelihood ratio for under-sampled training conditions in the case of CES distributions. Let us start with a r.v. whose scatter matrix is rank deficient and can be decomposed as R = U_1 B U_1^H, where U_1 is an M × T orthonormal matrix whose columns span the range space of R and B is a T × T positive definite Hermitian matrix. Such a vector can be represented as [27]

(34)

Let U_2 denote an orthonormal basis for the complement of U_1, let U = [U_1 U_2], and let us make the change of variables y = U^H x. Then the component of y along U_2 is zero with probability one, while the component along U_1 has a non-singular CES p.d.f. Since the Jacobian from x to y is 1, one may define a singular CES density as

(35)

for vectors x such that U_2^H x = 0. The joint density of a set of T independent snapshots can thus be written as

(36)

Assuming that U_1 is known, the MLE of B satisfies, see [27],

(37)

Let us now consider T < M snapshots. As noted in the ACG case, inference about the scatter matrix is possible only in the T-dimensional subspace spanned by the columns of the matrix U_1 of eigenvectors associated with the non-zero eigenvalues of the sample matrix. Again, for any given candidate scatter matrix, we need to find the rank-T Hermitian matrix such that the corresponding construct is "closest" to the model. From the previous definition of singular CES distributions, we may write the joint p.d.f. of the snapshots as

(38)

In order to obtain the LR, we need to maximize this likelihood over the PDH matrix B. As argued in (37), the MLE of B is the solution to

(39)

It follows that, for T < M, the under-sampled likelihood ratio is given by

(40)

Let us now prove that, for T = M, the under-sampled LR (40) coincides with its over-sampled counterpart, which is given by [29]

(41)

where the MLE of the scatter matrix satisfies

(42)

Similarly to the ACG case, we need to obtain the MLE in this special case T = M. Let us then prove that, for T = M,

(43)

where B is given in (39). First, observe that U_1^H Z is a square non-singular matrix, so that the matrix in (43) is non-singular and its inverse is available in closed form. Now, pre-multiplying (39) by this factor and post-multiplying it by its conjugate transpose, we obtain

(44)

which coincides with (42). Using this expression in (40), we have that, for T = M,

(45)

which is exactly the over-sampled likelihood ratio of (41).

Finally, let us investigate the distribution of the under-sampled LR evaluated at the true scatter matrix. It follows from (40) that

(46)

Moreover, pre-multiplying (39) by the appropriate factor and post-multiplying it by its conjugate transpose, it ensues that

(47)

Note that the matrix so obtained is a generalized inverse of the projected scatter matrix. Unlike the Moore-Penrose pseudo-inverse, a generalized inverse is not unique. In this regard, note that it is the unique Moore-Penrose pseudo-inverse of a particular square-root factor. Therefore, by specifying a particular (Hermitian, say) square root of the scatter matrix, we uniquely specify the matrices involved. Finally, from (47), the properties of these matrices are entirely specified by a set of i.i.d. complex uniform vectors. This means that their distribution does not depend on the true scatter matrix but of course depends on the density generator, similarly to the over-sampled case. It results that the p.d.f. of the under-sampled LR evaluated at the true scatter matrix is independent of the latter.
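The singular-CES setup above can be illustrated numerically. The sketch below generates samples through a stochastic representation of the form used in (34), i.e., a modular variate times a uniform direction on the complex sphere of dimension equal to the rank, mapped into the range space of the scatter matrix; the rank, dimensions and modular variate are hypothetical choices of ours, and the check simply confirms that the samples lie in the range space of U_1 with probability one.

```python
import numpy as np

def sample_singular_ces(U1, B, modular_variate, T, rng):
    """Draw T samples of a singular CES vector with scatter R = U1 B U1^H.

    Each sample is sqrt(Q) * U1 * B^{1/2} * u, where u is uniform on the complex
    unit sphere of dimension rank(R) and Q is the (user-supplied) modular variate.
    """
    M, r = U1.shape
    Bh = np.linalg.cholesky(B)
    # uniform directions on the complex r-sphere: normalized complex Gaussians
    G = (rng.standard_normal((r, T)) + 1j * rng.standard_normal((r, T))) / np.sqrt(2)
    U = G / np.linalg.norm(G, axis=0)
    Q = modular_variate(T, rng)
    return U1 @ (Bh @ U) * np.sqrt(Q)

# Illustration: rank-3 scatter in dimension M = 8, with a hypothetical modular variate.
rng = np.random.default_rng(2)
M, r, T = 8, 3, 1000
U1, _ = np.linalg.qr(rng.standard_normal((M, r)) + 1j * rng.standard_normal((M, r)))
B = np.diag([3.0, 2.0, 1.0])
X = sample_singular_ces(U1, B, lambda n, g: g.chisquare(2 * r, n) / g.chisquare(3, n), T, rng)
# The samples lie in the range space of U1 with probability one:
U2 = np.linalg.svd(np.eye(M) - U1 @ U1.conj().T)[0][:, :M - r]
print("max |U2^H x| over all samples:", np.abs(U2.conj().T @ X).max())
```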
REFERENCES

[1] I. S. Reed, J. D. Mallett, and L. E. Brennan, "Rapid convergence rate in adaptive arrays," IEEE Trans. Aerosp. Electron. Syst., vol. 10, no. 6, pp. 853–863, Nov. 1974.
[2] E. J. Kelly, "An adaptive detection algorithm," IEEE Trans. Aerosp. Electron. Syst., vol. 22, no. 1, pp. 115–127, Mar. 1986.
[3] E. J. Kelly, "Adaptive detection in non-stationary interference, Part III," Massachusetts Inst. of Technol., Lincoln Lab., Lexington, MA, USA, Tech. Rep. 761, 1987.
[4] E. J. Kelly and K. M. Forsythe, "Adaptive detection and estimation for multidimensional signal models," Massachusetts Inst. of Technol., Lincoln Lab., Lexington, MA, USA, Tech. Rep. 848, 1989.
[5] S. Kraut and L. L. Scharf, "The CFAR adaptive subspace detector is a scale-invariant GLRT," IEEE Trans. Signal Process., vol. 47, no. 9, pp. 2538–2541, Sep. 1999.
[6] S. Kraut, L. L. Scharf, and T. McWhorter, "Adaptive subspace detectors," IEEE Trans. Signal Process., vol. 49, no. 1, pp. 1–16, Jan. 2001.
[7] S. Kraut, L. L. Scharf, and R. W. Butler, "The adaptive coherence estimator: A uniformly most powerful invariant adaptive detection statistic," IEEE Trans. Signal Process., vol. 53, no. 2, pp. 427–438, Feb. 2005.
[8] N. R. Goodman, "Statistical analysis based on a certain multivariate complex Gaussian distribution (an introduction)," Ann. Math. Statist., vol. 34, no. 1, pp. 152–177, Mar. 1963.
[9] C. G. Khatri, "Classical statistical analysis based on a certain multivariate complex Gaussian distribution," Ann. Math. Statist., vol. 36, no. 1, pp. 98–114, Feb. 1965.
[10] R. J. Muirhead, Aspects of Multivariate Statistical Theory. New York, NY, USA: Wiley, 1982.
[11] L. L. Scharf, Statistical Signal Processing: Detection, Estimation, and Time Series Analysis. Reading, MA, USA: Addison-Wesley, 1991.
[12] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Englewood Cliffs, NJ, USA: Prentice-Hall, 1993.
[13] Y. I. Abramovich, "Controlled method for adaptive optimization of filters using the criterion of maximum SNR," Radio Eng. Electron. Phys., vol. 26, pp. 87–95, Mar. 1981.
[14] O. P. Cheremisin, "Efficiency of adaptive algorithms with regularised sample covariance matrix," Radio Eng. Electron. Phys., vol. 27, no. 10, pp. 69–77, 1982.
[15] Y. Abramovich, N. Spencer, and A. Gorokhov, "Bounds on maximum likelihood ratio—Part I: Application to antenna array detection-estimation with perfect wavefront coherence," IEEE Trans. Signal Process., vol. 52, no. 6, pp. 1524–1536, Jun. 2004.
[16] Y. I. Abramovich, N. K. Spencer, and A. Y. Gorokhov, "GLRT-based threshold detection-estimation performance improvement and application to uniform circular antenna arrays," IEEE Trans. Signal Process., vol. 55, no. 1, pp. 20–31, Jan. 2007.
[17] Y. I. Abramovich, N. K. Spencer, and A. Y. Gorokhov, "Modified GLRT and AMF framework for adaptive detectors," IEEE Trans. Aerosp. Electron. Syst., vol. 43, no. 3, pp. 1017–1051, Jul. 2007.
[18] K. Yao, "A representation theorem and its application to spherically invariant processes," IEEE Trans. Inf. Theory, vol. 19, no. 5, pp. 600–608, Sep. 1973.
[19] E. Conte and M. Longo, "Characterisation of radar clutter as a spherically invariant process," Proc. Inst. Electr. Eng.—Radar, Sonar, Navig., vol. 134, no. 2, pp. 191–197, Apr. 1987.
[20] E. Conte, M. Lops, and G. Ricci, "Asymptotically optimum radar detection in compound-Gaussian clutter," IEEE Trans. Aerosp. Electron. Syst., vol. 31, no. 2, pp. 617–625, Apr. 1995.
[21] F. Gini and M. Greco, "Covariance matrix estimation for CFAR detection in correlated heavy tailed clutter," Signal Process., vol. 82, no. 12, pp. 1847–1859, Dec. 2002.
[22] E. Conte, A. De Maio, and G. Ricci, "Recursive estimation of the covariance matrix of a compound-Gaussian process and its application to adaptive CFAR detection," IEEE Trans. Signal Process., vol. 50, no. 8, pp. 1908–1915, Aug. 2002.
[23] F. Pascal, Y. Chitour, J.-P. Ovarlez, P. Forster, and P. Larzabal, "Covariance structure maximum-likelihood estimates in compound Gaussian noise: Existence and algorithm analysis," IEEE Trans. Signal Process., vol. 56, no. 1, pp. 34–48, Jan. 2008.
[24] Y. Chitour and F. Pascal, "Exact maximum likelihood estimates for SIRV covariance matrix: Existence and algorithm analysis," IEEE Trans. Signal Process., vol. 56, no. 10, pp. 4563–4573, Oct. 2008.
[25] K. T. Fang and Y. T. Zhang, Generalized Multivariate Analysis. Berlin, Germany: Springer-Verlag, 1990.
[26] T. W. Anderson and K.-T. Fang, "Theory and applications of elliptically contoured and related distributions," Statist. Dept., Stanford Univ., Stanford, CA, USA, Tech. Rep. 24, 1990.
[27] E. Ollila, D. Tyler, V. Koivunen, and H. Poor, "Complex elliptically symmetric distributions: Survey, new results and applications," IEEE Trans. Signal Process., vol. 60, no. 11, pp. 5597–5625, Nov. 2012.
[28] A. Wiesel, "Unified framework to regularized covariance estimation in scaled Gaussian models," IEEE Trans. Signal Process., vol. 60, no. 1, pp. 29–38, Jan. 2012.
[29] Y. I. Abramovich and O. Besson, "Regularized covariance matrix estimation in complex elliptically symmetric distributions using the expected likelihood approach—Part 1: The over-sampled case," IEEE Trans. Signal Process., vol. 61, no. 23, pp. 5807–5818, 2013.
[30] Y. I. Abramovich and B. A. Johnson, "GLRT-based detection-estimation for undersampled training conditions," IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3600–3612, Aug. 2008.
[31] B. Johnson and Y. Abramovich, "GLRT-based outlier prediction and cure in under-sampled training conditions using a singular likelihood ratio," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Honolulu, HI, USA, Apr. 15–20, 2007, pp. 1129–1132.
[32] Y. Abramovich and B. Johnson, "Use of an under-sampled likelihood ratio for GLRT and AMF detection," in Proc. IEEE Conf. Radar, Apr. 24–27, 2006, pp. 539–545.
[33] Y. Abramovich and B. Johnson, "A modified likelihood ratio test for detection-estimation in under-sampled training conditions," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Toulouse, France, May 14–19, 2006, pp. 1105–1108.
[34] D. E. Tyler, "Statistical analysis for the angular central Gaussian distribution on the sphere," Biometrika, vol. 74, no. 3, pp. 579–589, Sep. 1987.
[35] Y. Chikuse, Statistics on Special Manifolds. New York, NY, USA: Springer-Verlag, 2003.
[36] D. E. Tyler, "A distribution-free M-estimator of multivariate scatter," Ann. Statist., vol. 15, no. 1, pp. 234–251, Mar. 1987.
[37] M. Siotani, T. Hayakawa, and Y. Fujikoshi, Modern Multivariate Statistical Analysis. Cleveland, OH, USA: Amer. Sciences Press, 1985.
[38] Y. I. Abramovich and N. K. Spencer, "Diagonally loaded normalised sample matrix inversion (LNSMI) for outlier-resistant adaptive filtering," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Honolulu, HI, USA, Apr. 2007, pp. 1105–1108.
[39] Y. Chen, A. Wiesel, and A. O. Hero, "Robust shrinkage estimation of high-dimensional covariance matrices," IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4097–4107, Sep. 2011.
[40] Y. I. Abramovich, N. K. Spencer, and M. D. E. Turley, "Time-varying autoregressive (TVAR) models for multiple radar observations," IEEE Trans. Signal Process., vol. 55, no. 4, pp. 1298–1311, Apr. 2007.

Olivier Besson (SM'04) received the M.S. and Ph.D. degrees in signal processing in 1988 and 1992, respectively, both from Institut National Polytechnique, Toulouse. He is currently a Professor with the Department of Electronics, Optronics and Signal of ISAE (Institut Supérieur de l'Aéronautique et de l'Espace), Toulouse. His research interests are in the area of robust adaptive array processing, mainly for radar applications. Dr. Besson is a member of the Sensor Array and Multichannel technical committee (SAM TC) of the IEEE Signal Processing Society.

Yuri I. Abramovich (M'96–SM'06–F'08) received the Dipl. Eng. (Honors) degree in radio electronics in 1967 and the Cand. Sci. degree (Ph.D. equivalent) in theoretical radio techniques in 1971, both from the Odessa Polytechnic University, Odessa (Ukraine), U.S.S.R., and in 1981 he received the D.Sc. degree in radar and navigation from the Leningrad Institute for Avionics, Leningrad (Russia), U.S.S.R. From 1968 to 1994, he was with the Odessa State Polytechnic University, Odessa, Ukraine, as a Research Fellow, Professor, and ultimately as Vice-Chancellor of Science and Research. From 1994 to 2006, he was at the Cooperative Research Centre for Sensor Signal and Information Processing (CSSIP), Adelaide, Australia. From 2000, he was with the Australian Defence Science and Technology Organisation (DSTO), Adelaide, as principal research scientist, seconded to CSSIP until its closure. As of January 2012, Dr. Abramovich is with W. R. Systems Ltd., Fairfax, Virginia, USA. His research interests are in signal processing (particularly spatio-temporal adaptive processing, beamforming, signal detection and estimation), its application to radar (particularly over-the-horizon radar), electronic warfare, and communications. Dr. Abramovich was Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING from 2002 to 2005. Since 2007, he has served as Associate Editor of the IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS and is currently a member of the IEEE AESS Board of Governors.