
arXiv:1501.06027v2 [cs.IT] 14 May 2015

Optimal Design of the Adaptive Normalized Matched Filter Detector

Abla Kammoun, Romain Couillet, Frédéric Pascal, Mohamed-Slim Alouini

Abstract: This article addresses improvements on the design of the adaptive normalized matched filter (ANMF) for radar detection. It is well-acknowledged that the estimation of the noise-clutter covariance matrix is a fundamental step in adaptive radar detection. In this paper, we consider regularized estimation methods which force by construction the eigenvalues of the scatter estimates to be greater than a positive regularization parameter $\rho$. This makes them more suitable for high dimensional problems with a limited number of secondary data samples than traditional sample covariance estimates. While an increase of $\rho$ seems to improve the conditioning of the estimate, it might however cause it to significantly deviate from the true covariance matrix. The setting of the optimal regularization parameter is a difficult question for which no convincing answers have thus far been provided. This constitutes the major motivation behind our work. More specifically, we consider the design of the ANMF detector for two kinds of regularized estimators, namely the regularized sample covariance matrix (RSCM), appropriate when the clutter follows a Gaussian distribution, and the regularized Tyler estimator (RTE) for non-Gaussian spherically invariant distributed clutters. The rationale behind this choice is that the RTE is efficient in mitigating the degradation caused by the presence of impulsive noises while inducing little loss when the noise is Gaussian. Based on recent random matrix theory results studying the asymptotic fluctuations of the statistics of the ANMF detector when the number of samples and their dimension grow together to infinity, we propose a design for the regularization parameter that maximizes the detection probability under constant false alarm rates. Simulation results which support the efficiency of the proposed method are provided in order to illustrate the gain of the proposed optimal design over conventional settings of the regularization parameter.

Index Terms: Regularized Tyler's estimator, Adaptive Normalized Matched Filter, robust detection, Random Matrix Theory, Optimal design.

I. INTRODUCTION

The estimation of scatter matrices is of fundamental importance for the design of radar systems, which underlies, for instance, space-time adaptive processing (STAP) [1], [2]. In detection, it is well-acknowledged that a sufficiently accurate scatter matrix estimate is key to enhancing the detection performance (see [3], [4] and references therein). In order to support a possible deficiency of samples (number of samples less than their dimensions), regularized covariance matrix estimation methods have been proposed [5]. One such method is given by the regularized sample covariance matrix (RSCM). The use of the RSCM originates fundamentally from the diagonal loading approach, which can be traced back to the works of Abramovich and Carlson [5], [6]. As a derivative of the sample covariance matrix (SCM), the RSCM inherits its major limitation of exhibiting poor performances when observations contain outliers. This latter scenario is often modeled by assuming that observations are drawn from complex elliptical symmetric (CES) distributions, originally introduced by Kelker [7]. The inherent nature of these distributions to produce atypical observations makes the task of estimating the covariance matrix much more challenging. To tackle this issue, a class of covariance estimators coined robust estimators of scatter matrices has been proposed by Huber, Hampel, and Maronna [8]–[10], and extended more recently to the complex case by Ollila [3], [4], [11]. Similar to the Gaussian case, the regularization technique has been applied to the so-called robust Tyler estimator [12], yielding the regularized Tyler estimator (RTE). While conventional robust covariance methods are undefined for the case of few samples (number of samples less than their dimensions), the existence of the RTE as well as the convergence of the associated recursive algorithm are two major findings which have recently been established in several works [13]–[16]. Unlike the RSCM, the RTE, as a derivative of Tyler's estimator, is resilient to the presence of outliers, thereby making it more suitable to radar applications, for which experimental evidence rules out Gaussian models for the clutter [17]–[20].

As far as regularized estimation methods are concerned, it is essential to determine a clever way of setting the regularization parameter. This question has essentially been investigated in [21], [22] for the RSCM and in [15], [23] for the RTE. Although yielding different expressions, these works have the common denominator of being merely based on a distance minimization between the RTE or the SCM and the true covariance matrix. It is thus not clear whether these choices will allow for good performances when applied to detection problems. We consider in this work the design of the adaptive normalized matched filter (ANMF) for radar detection. First introduced by [24] and analyzed in [25]–[27], this scheme was shown to enjoy the interesting feature of a constant false alarm property with respect to the clutter power and covariance matrix. This detector is obtained by replacing in the statistic of the normalized matched filter (NMF) the covariance matrix by a given estimate [26], which is computed based on secondary data, i.e., signal-free independent and identically distributed (i.i.d.) observations. Of interest in this work are the cases where the RSCM or the RTE are used in place of the unknown covariance matrix. We will first consider the scenario where the detector operates over Gaussian correlated clutters and thus uses the RSCM as a replacement for the unknown covariance matrix, a scheme which will be referred to as ANMF-RSCM. In order to come up with an appropriate design for the ANMF detector, it is essential to characterize the behaviour of its corresponding false alarm and detection probabilities.

Under the assumption of fixed dimensions, such a task seems to be out of reach. This has led us to consider in [28] the regime wherein the number of secondary data samples and their dimensions grow simultaneously to infinity, thereby allowing for the use of advanced tools from random matrix theory. The asymptotic false alarm probability for the ANMF-RTE was in particular derived in [28] (yet only as a mere application example of the main mathematical result). In order to allow for an optimal choice of the regularization parameter, these results need to be augmented with an asymptotic analysis of the detection probability. This constitutes the main contribution of our work. In particular, we extend the results of [28] to the ANMF-RSCM detector by showing that its corresponding statistic fluctuates as a Rayleigh distribution when no target is present and, additionally, establish that it behaves like a Rice distributed random variable otherwise. Based on the asymptotic characterization of these distributions, we propose an optimal setting of the regularization parameter that maximizes the asymptotic probability of detection for any given false alarm probability.

In a second part, we consider the case where the clutter is drawn from heavy-tailed distributions. It is thus natural to assume that the detector uses the RTE, since the RSCM is vulnerable to the presence of outliers and may provide poor performances. This scheme will be coined ANMF-RTE. By exploiting recent results on the asymptotic behavior of the RTE [28], [29], we prove that, up to a certain change of variable, the ANMF-RTE is asymptotically equivalent to the ANMF-RSCM when operating over Gaussian clutters. This argues in favor of the role of the RTE to retrieve the Gaussian performances while operating over heavy-tailed distributed clutters. Finally, we show through simulations the superiority of our design over some of the ad hoc settings of the regularization parameter that have recently been proposed. The gain of our design likely owes to the high accuracy of the derived asymptotic results in predicting the detection probability of the ANMF schemes.

The remainder of the paper is organized as follows. In the next section, we introduce the considered problem. Then, we propose an optimal design approach for the ANMF-RTE and the ANMF-RSCM. Finally, we illustrate using simulations the gain of the proposed design method over conventional settings of the regularization parameter.

Notations: Throughout this paper, we depict vectors in lowercase boldface letters and matrices in uppercase boldface letters. The notation $(\cdot)^*$ stands for the transpose conjugate while ${\rm tr}(\cdot)$ and $(\cdot)^{-1}$ are the trace and inverse operators. The notation $\|\cdot\|$ stands for the Euclidean norm for vectors and for the spectral norm for matrices. The arrow $\xrightarrow{\rm a.s.}$ designates almost sure convergence. The statement $X \triangleq Y$ defines the new notation $X$ as being equal to $Y$.

II. PROBLEM STATEMENT

We consider the problem of detecting a complex signal vector $\mathbf{p}$ corrupted by an additive noise as:
$$\mathbf{y} = \alpha\mathbf{p} + \mathbf{x}$$
where $\mathbf{y} \in \mathbb{C}^N$ represents the vector received by an $N$-dimensional array of sensors, $\mathbf{x}$ stands for the noise clutter and $\alpha$ is a complex scalar modeling the unknown target amplitude. The signal detection problem is phrased as the following binary hypothesis test:
$$\begin{cases} H_1: \ \mathbf{y} = \alpha\mathbf{p} + \mathbf{x} \\ H_0: \ \mathbf{y} = \mathbf{x}. \end{cases} \qquad (1)$$
Several models for the clutter $\mathbf{x}$ have been proposed. Among them, we distinguish the class of CES random variates, which encompass most of the commonly encountered random models, including the standard Gaussian distribution, the K-distribution, the Weibull distribution and many others [3]. A CES distributed random variable is given by:
$$\mathbf{x} = \sqrt{\tau}\,\mathbf{C}_N^{\frac12}\tilde{\mathbf{w}}$$
where $\tau$ is a positive scalar random variable called the texture, $\mathbf{C}_N$ is the covariance matrix (see Footnote 1) and $\tilde{\mathbf{w}}$ is an $N$-dimensional vector independent of $\tau$, zero-mean, unitarily invariant with norm $\|\tilde{\mathbf{w}}\| = \sqrt{N}$. The quantity $\mathbf{C}_N^{\frac12}\tilde{\mathbf{w}}$ is referred to as speckle.

Footnote 1: Note that when the second order statistics exist, the scatter matrix is equal to the covariance matrix (up to a constant).

The design of an appropriate statistic for the above hypothesis test depends on the amount of knowledge that is available to the detector. If the clutter is Gaussian with known covariance matrix $\mathbf{C}_N$ while $\alpha$ is unknown, the Generalized Likelihood Ratio Test (GLRT) for the detection problem in (1) results in the following test statistic:
$$T_N = \frac{\left|\mathbf{y}^*\mathbf{C}_N^{-1}\mathbf{p}\right|}{\sqrt{\mathbf{y}^*\mathbf{C}_N^{-1}\mathbf{y}}\,\sqrt{\mathbf{p}^*\mathbf{C}_N^{-1}\mathbf{p}}}$$
which corresponds to the square-root statistic of the ANMF detector. The statistic $T_N$ has been derived independently by several works, thereby leading the corresponding detector to have many alternative names: the constant false alarm (CFAR) matched subspace detector (MSD) [30], the normalized matched filter (NMF) [31], or the Linear Quadratic GLRT (LQ-GLRT) [32]. If the clutter is elliptically distributed, optimal detection procedures based on the GLRT principle lead to statistics that depend on the distribution of the texture $\tau$. Nevertheless, a complete knowledge of the statistics of the signal and noise cannot be acquired in practice. A reasonable hypothesis, largely used in radar detection, is to assume that only $\mathbf{p}$ is known while $\alpha$ and the statistics of the noise are unknown. To handle this case, the use of $T_N$, whose optimality has only been shown in the Gaussian setting, has been advocated as a good detection technique. Such a choice has particularly been motivated by the result of [24] showing the asymptotic optimality of $T_N$, when $N$ becomes increasingly large, under the setting of compound-Gaussian distributed clutters. From the expression of $T_N$, it can be seen that the detector is only required to know $\mathbf{C}_N$ up to a scale factor, which is much less restrictive than the requirement of optimal detection strategies.

Since the covariance matrix $\mathbf{C}_N$ is unknown in practice, a popular approach consists in replacing in $T_N$ the unknown covariance matrix $\mathbf{C}_N$ by an estimate built on signal-free i.i.d.
observations $\mathbf{x}_1, \dots, \mathbf{x}_n$, termed secondary data. The resulting detector is called the adaptive normalized matched filter (ANMF). Several competing estimators of $\mathbf{C}_N$ can be used. The most popular one is the traditional sample covariance matrix (SCM) given by:
$$\hat{\mathbf{R}}_N = \frac1n\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i^*$$
which corresponds to the Maximum-Likelihood estimator (MLE) if the clutter is Gaussian distributed. However, in some scenarios where the available number of observations $n$ is of the same order as or smaller than $N$, the SCM, being ill-conditioned, will not lead to accurate detection results (see Footnote 2). A practical approach that has received considerable attention is to regularize the SCM, thereby yielding the regularized SCM (RSCM) given by:
$$\hat{\mathbf{R}}_N(\rho) = (1-\rho)\hat{\mathbf{R}}_N + \rho\,\mathbf{I}_N, \qquad (2)$$
where the parameter $\rho \in [0,1]$ serves to give more or less importance to the sample covariance matrix $\hat{\mathbf{R}}_N$ depending on the available number of samples. The ANMF that uses the RSCM as a plug-in estimator of $\mathbf{C}_N$ will be referred to as ANMF-RSCM.

Footnote 2: Traditionally, it is assumed that $2N$ observations are required to ensure good performances of the sub-optimal filtering, i.e., a 3 dB loss of the output SNR compared to optimal filtering [33].

While this regularization artifice has proved efficient in handling the scarcity of the available samples, it has the serious drawback of fundamentally relying on the SCM. In effect, the SCM, even though suitable for Gaussian settings, is known to be vulnerable to outliers and thus leads to highly inefficient estimators when the samples are drawn from heavy-tailed non-Gaussian distributions. A standard alternative to conventional sample covariance estimates is constituted by the class of robust-scatter estimators, known for their resilience to atypical observations. The robust estimator that will be considered in this work was defined in [14] as the unique solution $\hat{\mathbf{C}}_N(\rho)$ to:
$$\hat{\mathbf{C}}_N(\rho) = (1-\rho)\frac1n\sum_{i=1}^n \frac{\mathbf{x}_i\mathbf{x}_i^*}{\frac1N\mathbf{x}_i^*\hat{\mathbf{C}}_N^{-1}(\rho)\mathbf{x}_i} + \rho\,\mathbf{I}_N \qquad (3)$$
with $\rho \in \left(\max\left\{0, 1-\frac{n}{N}\right\}, 1\right]$. This estimator corresponds to a hybrid robust-shrinkage estimator reminiscent of Tyler's M-estimator of scale [12] and of Ledoit-Wolf's shrinkage estimator [21]. We will thus refer to it as the Regularized-Tyler Estimator (RTE). Besides its robustness, the RTE has many interesting features. First, it is well-suited to situations where $c_N \triangleq \frac{N}{n}$ is large, while standard robust scatter estimates are ill-conditioned or even undefined if $N > n$. By varying the regularization parameter $\rho$, one can move from the unbiased Tyler estimator [34] ($\rho = 0$) to the identity matrix ($\rho = 1$), which represents a crude guess for the unknown covariance $\mathbf{C}_N$. Its relation to Tyler's estimator has recently been reported in [15] by viewing it as the solution of a penalized M-estimation cost function. We will denote by ANMF-RTE the ANMF detector that uses the RTE instead of the unknown covariance matrix.

Upon replacing in $T_N$ the unknown covariance matrix by a regularized estimate, be it the SCM or the RTE, the question of how the regularization parameter $\rho$ should be set naturally arises. Recent works dealing with this issue propose to set $\rho$ in such a way as to minimize a certain mean-squared error between $\hat{\mathbf{C}}_N$ and $\mathbf{C}_N$ [?], [15]. While easy-to-compute estimates of these values of $\rho$ were provided, one of the major criticisms of these choices is that they are made regardless of the application under consideration. In particular, a more relevant choice for the application under study consists in selecting the values of $\rho$ that maximize the probability of detection while keeping the false alarm probability fixed. These values will be considered as optimal with regard to radar detection applications.

To this end, one needs to characterize the distribution of $\hat{T}_N^{\rm RSCM}(\rho)$ and $\hat{T}_N^{\rm RTE}(\rho)$ given by:
$$\hat{T}_N^{\rm RSCM}(\rho) = \frac{\left|\mathbf{y}^*\hat{\mathbf{R}}_N^{-1}(\rho)\mathbf{p}\right|}{\sqrt{\mathbf{y}^*\hat{\mathbf{R}}_N^{-1}(\rho)\mathbf{y}}\,\sqrt{\mathbf{p}^*\hat{\mathbf{R}}_N^{-1}(\rho)\mathbf{p}}} \qquad (4)$$
$$\hat{T}_N^{\rm RTE}(\rho) = \frac{\left|\mathbf{y}^*\hat{\mathbf{C}}_N^{-1}(\rho)\mathbf{p}\right|}{\sqrt{\mathbf{y}^*\hat{\mathbf{C}}_N^{-1}(\rho)\mathbf{y}}\,\sqrt{\mathbf{p}^*\hat{\mathbf{C}}_N^{-1}(\rho)\mathbf{p}}} \qquad (5)$$
under hypotheses $H_0$ and $H_1$. For fixed $N$ and $n$, this is not an easy task and in our opinion would not lead, if ever feasible, to easy-to-compute expressions for the optimal values of $\rho$. In this paper, we relax this restrictive assumption by considering the case where $N$ and $n$ go to infinity with $\frac{N}{n} \to c \in (0,\infty)$. This in particular enables leveraging the recent results of [28] that will be reviewed in Section IV-A.

III. OPTIMAL DESIGN OF THE ANMF-RSCM DETECTOR: GAUSSIAN CLUTTER CASE

In this section, we consider the case of a Gaussian clutter. In other words, we assume that all the secondary data $\mathbf{x}_1, \dots, \mathbf{x}_n$ are drawn from a Gaussian distribution with zero mean and covariance $\mathbf{C}_N$. For $\rho \in (0,1]$, we define the RSCM as in (2) and the corresponding statistic $\hat{T}_N^{\rm RSCM}$ as in (4). In order to pave the way towards an optimal setting of the regularization coefficient $\rho$, we need to characterize the asymptotic false alarm and detection probabilities under the assumption that $c_N \triangleq \frac{N}{n} \to c$. That is, provided $H_0$ or $H_1$ is the actual scenario ($\mathbf{y} = \mathbf{x}$ or $\mathbf{y} = \alpha\mathbf{p} + \mathbf{x}$), we shall evaluate the probabilities $\mathbb{P}\left[\hat{T}_N^{\rm RSCM} > \Gamma \,|\, H_0\right]$ and $\mathbb{P}\left[\hat{T}_N^{\rm RSCM} > \Gamma \,|\, H_1\right]$ for $\Gamma > 0$. Before going further, we need to stress that some extra assumptions on the order of magnitude of $\alpha$ and $\Gamma$ with respect to $N$ should be made to avoid getting trivial results. Indeed, it appears that under $H_0$, the random quantities $\frac{1}{\sqrt N}\mathbf{y}^*\hat{\mathbf{R}}_N^{-1}(\rho)\frac{\mathbf{p}}{\|\mathbf{p}\|}$, $\frac1N\mathbf{y}^*\hat{\mathbf{R}}_N^{-1}(\rho)\mathbf{y}$, and $\frac{\mathbf{p}^*\hat{\mathbf{R}}_N^{-1}(\rho)\mathbf{p}}{\|\mathbf{p}\|^2}$ are standard objects in random matrix theory, which converge almost surely to their means when both $N$ and $n$ grow to infinity at the same pace [35]. As a result, since $\frac{1}{\sqrt N}\mathbf{y}^*\hat{\mathbf{R}}_N^{-1}(\rho)\frac{\mathbf{p}}{\|\mathbf{p}\|} \xrightarrow{\rm a.s.} 0$, we have $\hat{T}_N^{\rm RSCM} \xrightarrow{\rm a.s.} 0$, so that the false alarm probability vanishes for all $\Gamma > 0$, which does not allow one to infer much information about it. It turns out that the proper scaling of $\Gamma$ should be $\Gamma = N^{-\frac12}r$ for some fixed $r > 0$, an assumption already considered in [28]. Similarly, one can see that under $H_1$, the presence of a signal component in $\mathbf{y}$ causes $\hat{T}_N^{\rm RSCM}$ to converge almost surely to some positive
constant if $\alpha$ does not vary with $N$. Therefore, for $\Gamma = N^{-\frac12}r$, $\mathbb{P}\left[\hat{T}_N^{\rm RSCM} > \Gamma \,|\, H_1\right] \to 1$. In order to avoid this trivial statement, we shall assume that $\alpha = N^{-\frac12}a$ for some fixed $a > 0$, with $\|\mathbf{p}\|^2 = N$. In practice, this means that the dimension of the array is sufficiently large to enable working in low-SNR regimes.

Prior to introducing the results about the false alarm and detection probabilities, we shall introduce the following assumptions and notations:

Assumption A-1. For $i \in \{1, \dots, n\}$, $\mathbf{x}_i = \mathbf{C}_N^{\frac12}\mathbf{w}_i$ with:
• $\mathbf{w}_1, \dots, \mathbf{w}_n$ are $N \times 1$ independent standard Gaussian random vectors with zero mean and covariance $\mathbf{I}_N$,
• $\mathbf{C}_N \in \mathbb{C}^{N\times N}$ is such that $\limsup \|\mathbf{C}_N\| < \infty$ and $\frac1N{\rm tr}\,\mathbf{C}_N = 1$,
• $\liminf_N \frac1N\mathbf{p}^*\mathbf{C}_N\mathbf{p} > 0$.

Note that the normalization $\frac1N{\rm tr}\,\mathbf{C}_N = 1$ is not a restricting constraint since the statistics under study are invariant to the scaling of $\mathbf{C}_N$. The last item in Assumption 1 is required for technical purposes in order to ensure that the considered statistic exhibits fluctuations under $H_0$ and $H_1$. In practice, this assumption implies that the steering vector does not lie in the null space of the covariance matrix $\mathbf{C}_N$.

Denote for $z \in \mathbb{C}\setminus\mathbb{R}_+$ by $m_N(z)$ the unique complex solution to:
$$m_N(z) = \left(-z + c_N(1-\rho)\frac1N{\rm tr}\,\mathbf{C}_N\left(\mathbf{I}_N + (1-\rho)m_N(z)\mathbf{C}_N\right)^{-1}\right)^{-1}$$
that satisfies $\Im(z)\Im(m_N(z)) \geq 0$, or the unique positive solution if $z < 0$. The existence and uniqueness of $m_N(z)$ follow from standard results of random matrix theory [36]. It is a deterministic quantity, which can be computed easily for each $z$ using fixed-point iterations. In our case, it helps characterize the asymptotic behavior of the empirical spectral measure of the random matrix $(1-\rho)\frac1n\sum_{i=1}^n\mathbf{x}_i\mathbf{x}_i^*$ (see Footnote 3).

Footnote 3: Let $\hat\nu_N = \frac1N\sum_{i=1}^N\delta_{\lambda_i}$ be the empirical spectral measure of the random matrix $\frac{1-\rho}{n}\sum_{i=1}^n\mathbf{x}_i\mathbf{x}_i^*$, with $\lambda_1, \dots, \lambda_N$ the eigenvalues of $\frac{1-\rho}{n}\sum_{i=1}^n\mathbf{x}_i\mathbf{x}_i^*$. Denote by $\hat m_N(z)$ its Stieltjes transform given by $\hat m_N(z) = \int_{\mathbb{R}}(t-z)^{-1}\hat\nu_N(dt) = \frac1N\sum_{i=1}^N\frac{1}{\lambda_i - z}$. Then, the quantity $m_N(z)$ is the Stieltjes transform of a certain deterministic measure $\mu_N$ (i.e., $m_N(z) = \int_{\mathbb{R}}(t-z)^{-1}\mu_N(dt)$) which approximates $\hat m_N(z)$ in the almost sure sense (i.e., $\hat m_N(z) - m_N(z) \xrightarrow{\rm a.s.} 0$).

Define also for $\kappa > 0$, $\mathcal{R}_\kappa^{\rm SCM}$ as:
$$\mathcal{R}_\kappa^{\rm SCM} \triangleq [\kappa, 1].$$
With these notations at hand, we are now ready to analyze the asymptotic behaviour of the false alarm and detection probabilities. The proof of the following theorem will not be provided since, as we shall see in Section IV, it follows directly by applying the same approach used in [28].

Theorem 1 (False alarm probability). As $N, n \to \infty$ with $c_N \to c \in (0,\infty)$,
$$\sup_{\rho\in\mathcal{R}_\kappa^{\rm SCM}}\left|\mathbb{P}\left[\hat{T}_N^{\rm RSCM}(\rho) > \frac{r}{\sqrt N}\,\Big|\,H_0\right] - e^{-\frac{r^2}{2\sigma^2_{N,{\rm SCM}}(\rho)}}\right| \xrightarrow{\rm a.s.} 0$$
where:
$$\sigma^2_{N,{\rm SCM}}(\rho) \triangleq \frac12\,\frac{\mathbf{p}^*\mathbf{C}_N\mathbf{Q}_N^2(\rho)\mathbf{p}}{\mathbf{p}^*\mathbf{Q}_N(\rho)\mathbf{p}\;\frac1N{\rm tr}\,\mathbf{C}_N\mathbf{Q}_N(\rho)} \times \frac{1}{1 - c(1-\rho)^2m_N(-\rho)^2\frac1N{\rm tr}\,\mathbf{C}_N^2\mathbf{Q}_N^2(\rho)}$$
and $\mathbf{Q}_N(\rho) \triangleq \left(\mathbf{I}_N + (1-\rho)m_N(-\rho)\mathbf{C}_N\right)^{-1}$.

The uniformity over $\rho$ of the convergence result in Theorem 1 is essential in the sequel. It obviously implies the pointwise convergence for each $\rho > 0$ but, more importantly, it will allow us to handle the convergence of the false alarm probability when random values of the regularization parameter are considered. This feature becomes all the more interesting knowing that the detector is required to set the regularization parameter based on random received secondary data. Note that, for technical reasons, a set of the form $[0,\kappa)$, where $\kappa > 0$ is as small as desired but fixed, has to be discarded from the uniform convergence region.

The result of Theorem 1 provides an analytical expression for the false alarm probability. Since this expression depends on the unknown covariance matrix, it is of practical interest to provide a consistent estimate for it:

Proposition 2. For $\rho \in (0,1)$, define
$$\hat\sigma^2_{N,{\rm SCM}}(\rho) = \frac12\,\frac{1 - \rho\,\dfrac{\mathbf{p}^*\hat{\mathbf{R}}_N^{-2}(\rho)\mathbf{p}}{\mathbf{p}^*\hat{\mathbf{R}}_N^{-1}(\rho)\mathbf{p}}}{\left(1 - c_N + c_N\,\rho\,\frac1N{\rm tr}\,\hat{\mathbf{R}}_N^{-1}(\rho)\right)\left(1 - \rho\,\frac1N{\rm tr}\,\hat{\mathbf{R}}_N^{-1}(\rho)\right)}$$
and let $\hat\sigma^2_{N,{\rm SCM}}(1) \triangleq \lim_{\rho\uparrow1}\hat\sigma^2_{N,{\rm SCM}}(\rho) = \frac{\mathbf{p}^*\hat{\mathbf{R}}_N\mathbf{p}}{2\,{\rm tr}\,\hat{\mathbf{R}}_N}$. Then, we have, for any $\kappa > 0$,
$$\sup_{\rho\in\mathcal{R}_\kappa^{\rm SCM}}\left|\hat\sigma^2_{N,{\rm SCM}}(\rho) - \sigma^2_{N,{\rm SCM}}(\rho)\right| \xrightarrow{\rm a.s.} 0.$$
The proof of Proposition 2 follows along the same lines as that of Proposition 1 in [28] and is therefore omitted.

We will now derive the asymptotic equivalent for $\mathbb{P}\left[\hat{T}_N^{\rm RSCM}(\rho) > \frac{r}{\sqrt N}\,|\,H_1\right]$, where under $H_1$ the received vector $\mathbf{y}$ is supposed to be given by:
$$H_1: \ \mathbf{y} = \frac{a}{\sqrt N}\mathbf{p} + \mathbf{x}$$
with $\mathbf{x}$ distributed as the $\mathbf{x}_i$'s in Assumption 1. The following results constitute the major contribution of the present work. They will lead, in conjunction with those of Theorem 1 and Proposition 2, to the optimal design of the ANMF-RSCM.

Theorem 3 (Detection probability). As $N, n \to \infty$ with $c_N \to c$, we have for any $\kappa > 0$
$$\sup_{\rho\in\mathcal{R}_\kappa^{\rm SCM}}\left|\mathbb{P}\left[\hat{T}_N^{\rm RSCM}(\rho) > \frac{r}{\sqrt N}\,\Big|\,H_1\right] - Q_1\left(g_{\rm SCM}(\mathbf{p}), \frac{r}{\sigma_{N,{\rm SCM}}(\rho)}\right)\right| \xrightarrow{\rm a.s.} 0,$$
where $Q_1$ is the Marcum Q-function (see Footnote 4) while $\sigma_{N,{\rm SCM}}$ is given in Theorem 1 and $g_{\rm SCM}(\mathbf{p})$ is given by:
$$g_{\rm SCM}(\mathbf{p}) = \frac{\sqrt{1 - c(1-\rho)^2m_N(-\rho)^2\frac1N{\rm tr}\,\mathbf{C}_N^2\mathbf{Q}_N^2(\rho)}}{\sqrt{\mathbf{p}^*\mathbf{C}_N\mathbf{Q}_N^2(\rho)\mathbf{p}}} \times \sqrt{\frac{2}{N}}\,a\left|\mathbf{p}^*\mathbf{Q}_N(\rho)\mathbf{p}\right|.$$
Footnote 4: $Q_1(a,b) = \int_b^{+\infty}x\exp\left(-\frac{x^2+a^2}{2}\right)I_0(ax)\,dx$, where $I_0$ is the zeroth-order modified Bessel function of the first kind.

Proof: See Appendix A.

According to Theorem 1 and Theorem 3, $\hat{T}_N^{\rm RSCM}(\rho)$ behaves differently depending on whether a signal is present or not. In particular, under $H_0$, $\sqrt{N}\hat{T}_N^{\rm RSCM}(\rho)$ behaves like a Rayleigh distributed random variate with parameter $\sigma_{N,{\rm SCM}}(\rho)$, while it becomes well-approximated under $H_1$ by a Rice distributed random variable with parameters $g_{\rm SCM}(\mathbf{p})$ and $\sigma_{N,{\rm SCM}}(\rho)$. It is worth noticing that in the theory of radar detection, getting a false alarm and a detection probability distributed as Rayleigh and Rice random variables is among the simplest cases that one can ever encounter, holding only, to the best of the authors' knowledge, if white Gaussian noises are considered [?, p.188]. We believe that the striking simplicity of the obtained results stems from the double averaging effect that is a consequence of the considered asymptotic regime. This is to be compared to the quite intricate expressions for the false alarm probability obtained under the classical regime of $n$ tending to infinity while $N$ is fixed [40].

We will now discuss the choice of the regularization parameter $\rho$ and the threshold $r$. In accordance with the theory of radar detection, we aim at setting $\rho$ and $r$ in such a way as to keep the asymptotic false alarm probability equal to a fixed value $\eta$ while maximizing the asymptotic probability of detection. From Theorem 1, one can easily see that the values of $r$ and $\rho$ that provide an asymptotic false alarm probability equal to $\eta$ should satisfy:
$$\frac{r}{\sigma_{N,{\rm SCM}}(\rho)} = \sqrt{-2\log\eta}.$$
From these choices, we have to take those values that maximize the asymptotic detection probability which is given, according to Theorem 3, by:
$$Q_1\left(g_{\rm SCM}(\mathbf{p}), \frac{r}{\sigma_{N,{\rm SCM}}(\rho)}\right).$$
The second argument of $Q_1$ should be kept fixed in order to ensure the required asymptotic false alarm probability. As the Marcum Q-function increases with respect to the first argument, the optimization of the detection probability boils down to considering the following values of $\rho$:
$$\rho \in \underset{\rho}{\rm argmax}\left\{f_{\rm SCM}(\rho)\right\}$$
where:
$$f_{\rm SCM}(\rho) \triangleq \frac{1}{2a^2}\,g_{\rm SCM}^2(\mathbf{p}).$$
However, the optimization of $f_{\rm SCM}(\rho)$ is not possible in practice, since the expression of $f_{\rm SCM}(\rho)$ features the covariance matrix $\mathbf{C}_N$, which is unknown to the detector. Acquiring a consistent estimate of $f_{\rm SCM}(\rho)$ based on the available $\hat{\mathbf{R}}_N$ is thus mandatory. This is the goal of the following proposition.

Proposition 4. For $\rho \in (0,1)$, define $\hat f_{\rm SCM}(\rho)$ as:
$$\hat f_{\rm SCM}(\rho) = \frac{\left(\mathbf{p}^*\hat{\mathbf{R}}_N^{-1}(\rho)\mathbf{p}\right)^2(1-\rho)\left(1 - c_N + \frac{c_N}{N}\,\rho\,{\rm tr}\,\hat{\mathbf{R}}_N^{-1}(\rho)\right)^2}{N\left(\mathbf{p}^*\hat{\mathbf{R}}_N^{-1}(\rho)\mathbf{p} - \rho\,\mathbf{p}^*\hat{\mathbf{R}}_N^{-2}(\rho)\mathbf{p}\right)}$$
and let $\hat f_{\rm SCM}(1) \triangleq \lim_{\rho\uparrow1}\hat f_{\rm SCM}(\rho) = \frac{N}{\mathbf{p}^*\hat{\mathbf{R}}_N\mathbf{p}}$. Then, we have:
$$\sup_{\rho\in\mathcal{R}_\kappa^{\rm SCM}}\left|\hat f_{\rm SCM}(\rho) - f_{\rm SCM}(\rho)\right| \xrightarrow{\rm a.s.} 0,$$
where we recall that $\mathcal{R}_\kappa^{\rm SCM} = [\kappa, 1]$.

Proof: See Appendix B.

Since the results in Proposition 4 and Theorem 3 are uniform in $\rho$, we have the following corollary:

Corollary 5. Let $\hat f_{\rm SCM}(\rho)$ be defined as in Proposition 4. Define $\hat\rho_N^*$ as any value satisfying:
$$\hat\rho_N^* \in \underset{\rho\in\mathcal{R}_\kappa^{\rm SCM}}{\rm argmax}\left\{\hat f_{\rm SCM}(\rho)\right\}.$$
Then, for every $r > 0$,
$$\mathbb{P}\left[\sqrt N\hat{T}_N^{\rm RSCM}(\hat\rho_N^*) > r\,|\,H_1\right] - \max_{\rho\in\mathcal{R}_\kappa^{\rm SCM}}\left\{\mathbb{P}\left[\sqrt N\hat{T}_N^{\rm RSCM}(\rho) > r\,|\,H_1\right]\right\} \xrightarrow{\rm a.s.} 0.$$
Proof: The proof is similar to that of Corollary 1 of [28] and is thus omitted.

From Corollary 5, the following design procedure leads to optimal detection performance:
• First, set the regularization parameter to one of the values maximizing $\hat f_{\rm SCM}(\rho)$:
$$\hat\rho_N^* \in \underset{\rho\in\mathcal{R}_\kappa^{\rm SCM}}{\rm argmax}\left\{\hat f_{\rm SCM}(\rho)\right\} \qquad (6)$$
• Second, select the threshold $\hat r$ as:
$$\hat r = \hat\sigma_{N,{\rm SCM}}(\hat\rho_N^*)\sqrt{-2\log\eta} \qquad (7)$$

IV. OPTIMAL DESIGN OF THE ANMF-RTE: NON-GAUSSIAN CLUTTER

This section discusses the design of the ANMF-RTE detector in the case where the clutter is non-Gaussian. In particular, we assume that the secondary observations satisfy the following assumptions:

Assumption A-2. For $i \in \{1, \dots, n\}$, $\mathbf{x}_i = \sqrt{\tau_i}\,\mathbf{C}_N^{\frac12}\mathbf{w}_i = \sqrt{\tau_i}\,\mathbf{z}_i$ where
• $\mathbf{w}_1, \dots, \mathbf{w}_n$ are $N \times 1$ independent unitarily invariant complex zero-mean random vectors with $\|\mathbf{w}_i\|^2 = N$,
• $\mathbf{C}_N \in \mathbb{C}^{N\times N}$ is such that $\limsup\|\mathbf{C}_N\| < \infty$ and $\frac1N{\rm tr}\,\mathbf{C}_N = 1$,
• $\tau_i > 0$ are independent of $\mathbf{w}_i$,
• $\liminf_N\frac1N\mathbf{p}^*\mathbf{C}_N\mathbf{p} > 0$.
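The sampling model of Assumption A-2 can be illustrated numerically. The following is a minimal NumPy sketch, not taken from the paper: the function names (`toeplitz_cov`, `gamma_texture`, `sample_ces`), the Toeplitz covariance with parameter `b`, and the unit-mean gamma texture (which yields K-distributed clutter) are illustrative assumptions chosen only to satisfy the stated conditions.

```python
import numpy as np

def toeplitz_cov(N, b=0.7):
    """Hypothetical covariance [C]_{ij} = b^{|i-j|}; satisfies (1/N) tr C = 1."""
    idx = np.arange(N)
    return b ** np.abs(idx[:, None] - idx[None, :])

def gamma_texture(n, rng, shape=0.5):
    """Illustrative texture: gamma distributed with unit mean (assumption)."""
    return rng.gamma(shape, 1.0 / shape, size=n)

def sample_ces(n, C, texture_sampler, rng):
    """Draw n columns x_i = sqrt(tau_i) * C^{1/2} w_i as in Assumption A-2."""
    N = C.shape[0]
    # Unitarily invariant w_i with ||w_i||^2 = N: normalized complex Gaussians.
    g = rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n))
    w = np.sqrt(N) * g / np.linalg.norm(g, axis=0, keepdims=True)
    # Hermitian square root of C through its eigendecomposition.
    eigval, eigvec = np.linalg.eigh(C)
    C_half = (eigvec * np.sqrt(np.clip(eigval, 0.0, None))) @ eigvec.conj().T
    tau = texture_sampler(n, rng)      # positive textures, independent of w
    z = C_half @ w                     # speckle z_i = C^{1/2} w_i
    return np.sqrt(tau) * z, tau
```

For instance, `X, tau = sample_ces(100, toeplitz_cov(20), gamma_texture, np.random.default_rng(0))` returns a 20 x 100 array of secondary data. Setting all textures to one gives samples that still differ from Assumption A-1, since the vectors $\mathbf{w}_i$ here have fixed norm rather than Gaussian entries.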

The random model described in Assumption 2 is that of CES distributions, which encompass a wide range of observation distributions obtained for different settings of the statistics of $\tau_i$. Prior to stating our main findings, we shall first review some recent results concerning the behaviour of the RTE in the asymptotic regime.

A. Background

This section reviews the recent results in [28] about the asymptotic behaviour of the RTE. Recall that the RTE is defined, for $\rho \in \left(\max\left\{0, 1-\frac{n}{N}\right\}, 1\right]$, as the unique solution to the following equation:
$$\hat{\mathbf{C}}_N(\rho) = (1-\rho)\frac1n\sum_{i=1}^n\frac{\mathbf{x}_i\mathbf{x}_i^*}{\frac1N\mathbf{x}_i^*\hat{\mathbf{C}}_N^{-1}(\rho)\mathbf{x}_i} + \rho\,\mathbf{I}_N.$$
The study of the asymptotic behaviour of robust-scatter estimators is much more challenging than that of the traditional sample covariance matrices. The main reasons are that, first, robust estimators of scatter do not have closed-form expressions and, second, the dependence between the outer products involved in their expressions is non-linear, which does not allow for the use of standard random matrix analysis. In order to study this class of estimators, new technical tools based on a different rewriting of the robust-scatter estimators have been developed by Couillet et al. [29], [37], [38]. The important advantage of these techniques is that they suggest replacing robust estimators by asymptotically equivalent random matrices for which many results from random matrix theory are applicable. In particular, the RTE defined above has been studied in [28] and has been shown to behave, in the regime where $N, n \to \infty$ in such a way that $c_N \to c \in (0,\infty)$, similarly to $\hat{\mathbf{S}}_N(\rho)$ given by:
$$\hat{\mathbf{S}}_N(\rho) = \frac{1}{\gamma_N(\rho)}\,\frac{1-\rho}{1-(1-\rho)c_N}\,\frac1n\sum_{i=1}^n\mathbf{z}_i\mathbf{z}_i^* + \rho\,\mathbf{I}_N, \qquad (8)$$
where $\gamma_N(\rho)$ is the unique solution to:
$$1 = \int\frac{t}{\gamma_N(\rho)\rho + (1-\rho)t}\,\nu_N(dt).$$
More specifically, the following theorem applies:

Theorem 6 ([29]). For any $\kappa > 0$ small, define $\mathcal{R}_\kappa^{\rm RTE} \triangleq \left[\kappa + \max(0, 1-c^{-1}), 1\right]$. Then, as $N, n \to \infty$ with $c_N \to c \in (0,\infty)$, we have:
$$\sup_{\rho\in\mathcal{R}_\kappa^{\rm RTE}}\left\|\hat{\mathbf{C}}_N(\rho) - \hat{\mathbf{S}}_N(\rho)\right\| \xrightarrow{\rm a.s.} 0.$$
Theorem 6 establishes a convergence in the operator norm of the difference $\hat{\mathbf{C}}_N(\rho) - \hat{\mathbf{S}}_N(\rho)$. This result allows one to transfer the asymptotic first order analysis of many functionals of $\hat{\mathbf{C}}_N(\rho)$ to $\hat{\mathbf{S}}_N(\rho)$. However, when it comes to the study of fluctuations, this result is of little help. Indeed, although Theorem 6 can be easily refined as
$$\sup_{\rho\in\mathcal{R}_\kappa^{\rm RTE}}N^{\frac12-\epsilon}\left\|\hat{\mathbf{C}}_N(\rho) - \hat{\mathbf{S}}_N(\rho)\right\| \xrightarrow{\rm a.s.} 0$$
for each $\epsilon > 0$, the above convergence does not suffice to obtain the convergence of most of the commonly used functionals, which involve fluctuations of order $N^{-\frac12}$ or $N^{-1}$ (e.g. quadratic forms of $\hat{\mathbf{C}}_N(\rho)$ or linear statistics of the eigenvalues of $\hat{\mathbf{C}}_N(\rho)$). While a further refinement of the above convergence seems to be out of reach, it has recently been established in [28] that the fluctuations of special functionals can be proved to be much faster, mainly by virtue of an averaging effect which cancels out terms fluctuating at lower speed. In particular, bilinear forms of the type $\mathbf{a}^*\hat{\mathbf{C}}_N^k(\rho)\mathbf{b}$ were studied in [28], where the following proposition was proved:

Proposition 7. Let $\mathbf{a}, \mathbf{b} \in \mathbb{C}^N$ with $\|\mathbf{a}\| = \|\mathbf{b}\| = 1$ be deterministic or random independent of $\mathbf{x}_1, \dots, \mathbf{x}_n$. Then, as $N, n \to \infty$ with $c_N \to c \in (0,\infty)$, for any $\epsilon > 0$ and every $k \in \mathbb{Z}$,
$$\sup_{\rho\in\mathcal{R}_\kappa^{\rm RTE}}N^{1-\epsilon}\left|\mathbf{a}^*\hat{\mathbf{C}}_N^k(\rho)\mathbf{b} - \mathbf{a}^*\hat{\mathbf{S}}_N^k(\rho)\mathbf{b}\right| \xrightarrow{\rm a.s.} 0,$$
where $\mathcal{R}_\kappa^{\rm RTE}$ is defined as in Theorem 6 and $k \in \mathbb{Z}$ denotes any integer power of the matrices $\hat{\mathbf{C}}_N$ and $\hat{\mathbf{S}}_N$.

Some important consequences of Proposition 7 need to be stated. First, we shall recall that, while the direct study of the random variates $\mathbf{a}^*\hat{\mathbf{C}}_N^k(\rho)\mathbf{b}$ seems to be intractable, bilinear forms of the type $\mathbf{a}^*\hat{\mathbf{S}}_N^k(\rho)\mathbf{b}$ are well-understood objects whose behavior can be studied using standard tools from random matrix theory [39]. It is thus interesting to transfer the study of the fluctuations of $\mathbf{a}^*\hat{\mathbf{C}}_N^k(\rho)\mathbf{b}$ to $\mathbf{a}^*\hat{\mathbf{S}}_N^k(\rho)\mathbf{b}$. Proposition 7 achieves this goal by taking $\epsilon < \frac12$. Not only does it entail that $\mathbf{a}^*\hat{\mathbf{C}}_N^k(\rho)\mathbf{b}$ fluctuates at the order of $N^{-\frac12}$ (since so does $\mathbf{a}^*\hat{\mathbf{S}}_N^k(\rho)\mathbf{b}$), but it also allows one to prove that $\mathbf{a}^*\hat{\mathbf{C}}_N^k(\rho)\mathbf{b}$ and $\mathbf{a}^*\hat{\mathbf{S}}_N^k(\rho)\mathbf{b}$ exhibit asymptotically the same fluctuations. Similar to [28], our concern will be rather focused on the case $k = -1$. In the next section, we will show how this result can be exploited in order to derive the receiver operating characteristic (ROC) of the ANMF-RTE detector.

B. Optimal design of the ANMF-RTE detector

As explained above, in order to allow for an optimal design of the ANMF-RTE detector, one needs to characterize the distribution of $\hat{T}_N^{\rm RTE}(\rho)$ under hypotheses $H_0$ and $H_1$. Using Proposition 7, we know that the statistic $\hat{T}_N^{\rm RTE}(\rho)$, which cannot be handled directly, has the same fluctuations as $\tilde{T}_N^{\rm RTE}(\rho)$ obtained by replacing $\hat{\mathbf{C}}_N(\rho)$ by $\hat{\mathbf{S}}_N(\rho)$. That is:
$$\tilde{T}_N^{\rm RTE}(\rho) = \frac{\left|\mathbf{y}^*\hat{\mathbf{S}}_N^{-1}(\rho)\mathbf{p}\right|}{\sqrt{\mathbf{p}^*\hat{\mathbf{S}}_N^{-1}(\rho)\mathbf{p}}\,\sqrt{\mathbf{y}^*\hat{\mathbf{S}}_N^{-1}(\rho)\mathbf{y}}}$$
where $\hat{\mathbf{S}}_N(\rho)$ is given by (8). Let $\underline{\rho} = \rho\left(\rho + \frac{1}{\gamma_N(\rho)}\frac{1-\rho}{1-(1-\rho)c}\right)^{-1}$. Then, $\hat{\mathbf{S}}_N(\rho) = \rho\,\underline{\rho}^{-1}\hat{\mathbf{R}}_N(\underline{\rho})$, where, with a slight abuse of notation, we denote by $\hat{\mathbf{R}}_N(\underline{\rho})$ the matrix $(1-\underline{\rho})\frac1n\sum_{i=1}^n\mathbf{z}_i\mathbf{z}_i^* + \underline{\rho}\,\mathbf{I}_N$. Since $\tilde{T}_N^{\rm RTE}(\rho)$ remains unchanged after scaling of $\hat{\mathbf{S}}_N(\rho)$ and $\mathbf{y}$, we also have:
$$\tilde{T}_N^{\rm RTE}(\rho) = \frac{\left|\frac{1}{\sqrt\tau}\mathbf{y}^*\hat{\mathbf{R}}_N^{-1}(\underline{\rho})\mathbf{p}\right|}{\sqrt{\mathbf{p}^*\hat{\mathbf{R}}_N^{-1}(\underline{\rho})\mathbf{p}}\,\sqrt{\frac1\tau\mathbf{y}^*\hat{\mathbf{R}}_N^{-1}(\underline{\rho})\mathbf{y}}}$$
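where $\tau = 1$ under $H_0$. It turns out that, conditionally on $\tau$, the fluctuations of the robust statistic $\hat{T}_N^{\rm RTE}(\rho)$ under $H_0$ or $H_1$ are the same as those obtained in Theorem 1 and Theorem 3$^5$ once $a$ is replaced by $\frac{a}{\sqrt{\tau}}$ and $\rho$ by $\underline{\rho}$. As a consequence, we have the following results:

Theorem 8 (False alarm probability, [28]). As $N, n \to \infty$ with $c_N \to c \in (0, \infty)$,

\[
\sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left| P\left[ \hat{T}_N^{\rm RTE}(\rho) > \frac{r}{\sqrt{N}} \,\Big|\, H_0 \right] - e^{-\frac{r^2}{2\sigma_{N,{\rm RTE}}^2(\underline{\rho})}} \right| \to 0,
\]

where $\rho \mapsto \underline{\rho}$ is the aforementioned mapping and

\[
\sigma_{N,{\rm RTE}}^2(\underline{\rho}) \triangleq \frac{1}{2} \frac{p^* C_N Q_N^2(\underline{\rho}) p}{p^* Q_N(\underline{\rho}) p \cdot \frac{1}{N} {\rm tr}\, C_N Q_N(\underline{\rho})} \times \frac{1}{1 - c(1-\underline{\rho})^2 m(-\underline{\rho})^2 \frac{1}{N} {\rm tr}\, C_N^2 Q_N^2(\underline{\rho})},
\]

with $Q_N(\underline{\rho}) \triangleq \left( I_N + (1-\underline{\rho}) m(-\underline{\rho}) C_N \right)^{-1}$.

Theorem 9 (Detection probability). As $N, n \to \infty$ with $c_N \to c \in (0, \infty)$,

\[
\sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left| P\left[ \hat{T}_N^{\rm RTE}(\rho) > \frac{r}{\sqrt{N}} \,\Big|\, H_1 \right] - E\left[ Q_1\left( g_{\rm RTE}(p), \frac{r}{\sigma_{N,{\rm RTE}}(\underline{\rho})} \right) \right] \right| \to 0,
\]

where the expectation is taken over the distribution of $\tau$, $\sigma_{N,{\rm RTE}}(\underline{\rho})$ has the same expression as in Theorem 8 and

\[
g_{\rm RTE}(p) = \frac{\sqrt{1 - c(1-\underline{\rho})^2 m(-\underline{\rho})^2 \frac{1}{N} {\rm tr}\, C_N^2 Q_N^2(\underline{\rho})}}{\sqrt{p^* C_N Q_N^2(\underline{\rho}) p}} \times \sqrt{\frac{2}{N\tau}}\, a\, p^* Q_N(\underline{\rho}) p,
\]

and $Q_1$ is the Marcum Q-function.

Proof: Since the fluctuations of the robust statistic $\hat{T}_N^{\rm RTE}(\rho)$ are the same as those of $\hat{T}_N^{\rm RSCM}(\underline{\rho})$ when $a$ is replaced by $\frac{a}{\sqrt{\tau}}$, we have for any fixed $\tau$,

\[
\sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left| P\left[ \hat{T}_N^{\rm RTE}(\rho) > \frac{r}{\sqrt{N}} \,\Big|\, H_1, \tau \right] - Q_1\left( g_{\rm RTE}(p), \frac{r}{\sigma_{N,{\rm RTE}}(\underline{\rho})} \right) \right| \to 0.
\]

The result thus follows by noticing the following inequality

\[
\sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left| P\left[ \hat{T}_N^{\rm RTE}(\rho) > \frac{r}{\sqrt{N}} \,\Big|\, H_1 \right] - E\left[ Q_1\left( g_{\rm RTE}(p), \frac{r}{\sigma_{N,{\rm RTE}}(\underline{\rho})} \right) \right] \right|
\le E\left[ \sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left| P\left[ \hat{T}_N^{\rm RTE}(\rho) > \frac{r}{\sqrt{N}} \,\Big|\, H_1, \tau \right] - Q_1\left( g_{\rm RTE}(p), \frac{r}{\sigma_{N,{\rm RTE}}(\underline{\rho})} \right) \right| \right]
\]

and resorting to the dominated convergence theorem.

Similar to the Gaussian case, we need to build consistent estimates for $\sigma_{N,{\rm RTE}}^2(\underline{\rho})$ and $f_{\rm RTE}(\rho)$ given by:

\[
f_{\rm RTE}(\rho) = \frac{\tau}{2a^2}\, g_{\rm RTE}^2(p).
\]

A consistent estimate for $\sigma_{N,{\rm RTE}}^2(\underline{\rho})$ was provided in [28]:

Proposition 10 (Proposition 1 in [28]). For $\rho \in \left( \max(0, 1 - c^{-1}), 1 \right)$, define

\[
\hat{\sigma}_{N,{\rm RTE}}^2(\underline{\rho}) = \frac{1}{2} \frac{1 - \rho \frac{p^* \hat{C}_N^{-2}(\rho) p}{p^* \hat{C}_N^{-1}(\rho) p}}{(1 - c_N + c_N \rho)(1 - \rho)},
\]

and let $\hat{\sigma}_{N,{\rm RTE}}^2(1) \triangleq \lim_{\rho \uparrow 1} \hat{\sigma}_{N,{\rm RTE}}^2(\underline{\rho})$. Then, we have:

\[
\sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left| \sigma_{N,{\rm RTE}}^2(\underline{\rho}) - \hat{\sigma}_{N,{\rm RTE}}^2(\underline{\rho}) \right| \xrightarrow{\rm a.s.} 0.
\]

Similar to the Gaussian clutter case, acquiring a consistent estimate for $f_{\rm RTE}(\rho)$ is mandatory for our design. We thus prove the following Proposition:

Proposition 11. For $\rho \in \left( \max(0, 1 - c^{-1}), 1 \right)$, let

\[
\hat{f}_{\rm RTE}(\rho) = \frac{1}{N} \left( p^* \hat{C}_N^{-1}(\rho) p \right)^2 \left( \frac{1}{N} {\rm tr}\, \hat{C}_N(\rho) - \rho \right) \times \frac{(1 - c_N + c_N \rho)^2}{p^* \hat{C}_N^{-1}(\rho) p - \rho\, p^* \hat{C}_N^{-2}(\rho) p}
\]

and $\hat{f}_{\rm RTE}(1) \triangleq \lim_{\rho \uparrow 1} \hat{f}_{\rm RTE}(\rho)$. Then, we have:

\[
\sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left| \hat{f}_{\rm RTE}(\rho) - f_{\rm RTE}(\rho) \right| \xrightarrow{\rm a.s.} 0.
\]

Proof: The proof follows by first replacing $\hat{R}_N^{-1}(\rho)$ by $\hat{R}_N^{-1}(\underline{\rho})$ and $\rho$ by $\underline{\rho}$ in the results of Proposition 4 and using the convergences [28]:

\[
\sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left\| \frac{\hat{C}_N(\rho)}{\frac{1}{N} {\rm tr}\, \hat{C}_N(\rho)} - \hat{R}_N(\underline{\rho}) \right\| \xrightarrow{\rm a.s.} 0,
\qquad
\sup_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left\| \underline{\rho}\, \hat{C}_N(\rho) - \rho\, \hat{R}_N(\underline{\rho}) \right\| \xrightarrow{\rm a.s.} 0,
\]

and

\[
\frac{\rho}{\underline{\rho}} - \frac{1}{N} {\rm tr}\, \hat{C}_N(\rho) \xrightarrow{\rm a.s.} 0.
\]

Since the results in Proposition 11 and Theorem 9 are uniform in $\rho$, we have the following corollary:

Corollary 12. Let $\hat{f}_{\rm RTE}(\rho)$ be defined as in Proposition 11. Define $\hat{\rho}_N^*$ as any value satisfying:

\[
\hat{\rho}_N^* \in \arg\max_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left\{ \hat{f}_{\rm RTE}(\rho) \right\}.
\]

Then, for every $r > 0$,

\[
P\left[ \sqrt{N}\, \hat{T}_N^{\rm RTE}(\hat{\rho}_N^*) > r \,\big|\, H_1 \right] - \max_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left\{ P\left[ \sqrt{N}\, \hat{T}_N^{\rm RTE}(\rho) > r \,\big|\, H_1 \right] \right\} \xrightarrow{\rm a.s.} 0.
\]

$^5$ Note that vector $y$ can be assumed to be Gaussian without impacting the asymptotic distributions of $\sqrt{N}\, \hat{T}_N^{\rm RTE}$ under $H_0$ and $H_1$.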
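The limit in Theorem 9 is a Marcum Q-function averaged over the texture $\tau$. The following is a minimal numerical sketch of how such a quantity could be evaluated, using only the Python standard library. The function names (`bessel_i0`, `marcum_q1`, `averaged_detection`), the truncation of the integral, and the idea of averaging over a list of texture samples are assumptions made for illustration; they are not taken from the paper. The sketch relies only on the integral definition $Q_1(a, b) = \int_b^\infty x\, e^{-(x^2 + a^2)/2} I_0(ax)\, dx$ and on the fact, stated above, that $g_{\rm RTE}(p)$ depends on $\tau$ through the factor $1/\sqrt{\tau}$.

```python
import math

def bessel_i0(x, max_terms=200):
    """Modified Bessel function I0 via its power series sum_k (x^2/4)^k / (k!)^2."""
    q = 0.25 * x * x
    term, total = 1.0, 1.0
    for k in range(1, max_terms):
        term *= q / (k * k)
        total += term
        if term < 1e-17 * total:  # remaining terms are negligible
            break
    return total

def marcum_q1(a, b, steps=4000):
    """First-order Marcum Q-function for moderate a >= 0 and b >= 0.

    Trapezoidal rule on [b, max(a, b) + 12]; the neglected tail is tiny
    because the integrand decays like a Gaussian centred near a.
    """
    upper = max(a, b) + 12.0
    h = (upper - b) / steps

    def integrand(x):
        return x * math.exp(-0.5 * (x * x + a * a)) * bessel_i0(a * x)

    total = 0.5 * (integrand(b) + integrand(upper))
    for i in range(1, steps):
        total += integrand(b + i * h)
    return total * h

def averaged_detection(g_unit_texture, r_over_sigma, tau_samples):
    """Empirical average of Q1(g / sqrt(tau), r / sigma) over texture samples.

    g_unit_texture is the (hypothetical) value of g_RTE(p) at tau = 1.
    """
    values = [marcum_q1(g_unit_texture / math.sqrt(t), r_over_sigma)
              for t in tau_samples]
    return sum(values) / len(values)

# With a = 0 the Rice law reduces to a Rayleigh law, so Q1(0, b) = exp(-b^2 / 2),
# which is the false alarm expression of Theorem 8 with b = r / sigma.
q_rayleigh = marcum_q1(0.0, 1.5)
```

With a single sample $\tau = 1$, `averaged_detection` reduces to the Gaussian-clutter expression; for heavy-tailed clutter the caller would pass samples drawn from the assumed texture distribution.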

Using the same reasoning as the one followed in the Gaussian clutter case, we propose the following design strategy:

• First, set the regularization parameter to one of the values maximizing $\hat{f}_{\rm RTE}(\rho)$:

\[
\hat{\rho}_N^* \in \arg\max_{\rho \in \mathcal{R}_\kappa^{\rm RTE}} \left\{ \hat{f}_{\rm RTE}(\rho) \right\};
\]

• Second, set the threshold to $\hat{r}$:

\[
\hat{r} = \hat{\sigma}_{N,{\rm RTE}}(\hat{\rho}_N^*) \sqrt{-2 \log \eta},
\]

where $\eta$ is the required false alarm probability.

V. NUMERICAL RESULTS

A. Gaussian clutter

In a first experiment, we consider the scenario where the clutter is Gaussian with covariance matrix $C_N$ of Toeplitz form:

\[
[C_N]_{i,j} = \begin{cases} b^{\,j-i} & i \le j \\ \left( b^{\,i-j} \right)^* & i > j \end{cases}, \qquad |b| \in\, ]0, 1[\,, \tag{9}
\]

where we set $b = 0.96$, $N = 30$ and $n = 60$. The steering vector $p$ is given by

\[
p = a(\theta) \tag{10}
\]

where $\theta \mapsto [a(\theta)]_k = e^{-\imath \pi k \sin(\theta)}$. In this experiment, $\theta$ is set to $20^\circ$. For each Monte Carlo trial, the simulated data consists of $y = \alpha p + x$ and the secondary data $y_1, \cdots, y_n$, which are used to estimate $\hat{\rho}_N^*$ and to compute $\hat{R}_N(\hat{\rho}_N^*)$. In particular, the shrinkage parameter and the threshold value are determined using (6) and (7). We have observed from the considered numerical results that $\hat{f}_{\rm SCM}(\rho)$ is unimodal and thus the maximum can be obtained using efficient line search methods. For comparison, we consider two other designs: the first one is based on the regularization parameter derived in the work of Chen et al. [22, Equation (19)] (we denote by $\hat{\rho}_{\rm chen}$ the corresponding coefficient) while the second one corresponds to the non-regularized ANMF ($\hat{\rho} = 0$). In order to satisfy the required false alarm probability, we assume for the first design that the threshold $\hat{r}_{\rm chen}$ is given by $\hat{r}_{\rm chen} = \hat{\sigma}_{N,{\rm SCM}}(\hat{\rho}_{\rm chen}) \sqrt{-2 \log \eta}$. For the non-regularized ANMF to satisfy the false alarm probability, the threshold value is set based on Equation 11 in [27]. Figure 1 represents the ROC curves of both designs for different values of $a = \alpha \sqrt{N}$, namely $a = 0.1, 0.25, 0.5$, along with that of the theoretical performances of our design. We note that for all SNR ranges, the proposed algorithm outperforms the design based on the regularization parameter $\hat{\rho}_{\rm chen}$ and the gain becomes higher as $a$ increases. It also outperforms the non-regularized ANMF detector. Moreover, the performances of the proposed design correspond with a good accuracy to what is expected by our theoretical results.

[Fig. 1. ROC curves of ANMF-RSCM designs for $a = 0.1, 0.25, 0.5$, $p = a(\theta)$ with $\theta = 20^\circ$, $N = 30$, $n = 60$: Gaussian setting. Axes: false alarm probability (horizontal), detection probability (vertical). Curves: proposed design; theory; design using $\hat{\rho}_{\rm chen}$ and $\hat{r}_{\rm chen}$ [22]; design without regularization ($\hat{\rho} = 0$).]

In order to highlight the gain of the proposed design over the most interesting range of low false alarm probabilities, we represent in Fig. 2 the obtained ROC curves when $a = 0.8$ and the false alarm probability spanning the interval $[0.001, 0.05]$.

[Fig. 2. ROC curves of ANMF-RSCM designs for $a = 0.9$, $p = a(\theta)$ with $\theta = 20^\circ$, $N = 30$, $n = 60$: Gaussian setting. Axes: false alarm probability (horizontal), detection probability (vertical). Curves: proposed design; theory; design using $\hat{\rho}_{\rm chen}$ and $\hat{r}_{\rm chen}$ [22]; design without regularization ($\hat{\rho} = 0$).]
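The Gaussian-clutter setup described above can be reproduced in outline with NumPy. The sketch below is an illustration under stated assumptions rather than the authors' simulation code: the function names are hypothetical, the index $k$ of the steering vector is assumed to start at 0, the clutter is assumed circularly symmetric complex Gaussian, and the regularization value `rho = 0.5` is an arbitrary placeholder (the paper selects it through its own design rule). The regularized sample covariance follows the form $(1-\rho) \frac{1}{n} \sum_i y_i y_i^* + \rho I_N$ used in the text.

```python
import numpy as np

def toeplitz_covariance(N, b):
    """Covariance of the form (9): b^(j-i) for i <= j, conj(b^(i-j)) for i > j."""
    idx = np.arange(N)
    diff = idx[None, :] - idx[:, None]          # entry (i, j) holds j - i
    powers = b ** np.abs(diff)
    return np.where(diff >= 0, powers, np.conj(powers))

def steering_vector(N, theta_deg):
    """Steering vector with entries exp(-1j * pi * k * sin(theta)), k = 0..N-1 (assumed indexing)."""
    k = np.arange(N)
    return np.exp(-1j * np.pi * k * np.sin(np.deg2rad(theta_deg)))

def rscm(Y, rho):
    """Regularized sample covariance (1 - rho) * (1/n) Y Y^* + rho * I."""
    N, n = Y.shape
    return (1.0 - rho) * (Y @ Y.conj().T) / n + rho * np.eye(N)

def anmf_statistic(y, p, R):
    """ANMF statistic |y^* R^{-1} p| / sqrt((p^* R^{-1} p)(y^* R^{-1} y))."""
    Rinv_p = np.linalg.solve(R, p)
    Rinv_y = np.linalg.solve(R, y)
    num = abs(np.vdot(y, Rinv_p))               # vdot conjugates its first argument
    den = np.sqrt(np.vdot(p, Rinv_p).real * np.vdot(y, Rinv_y).real)
    return num / den

rng = np.random.default_rng(0)
N, n, b = 30, 60, 0.96
C = toeplitz_covariance(N, b)
L = np.linalg.cholesky(C)                        # C = L L^*
p = steering_vector(N, 20.0)

def draw_clutter(m):
    """m independent circular complex Gaussian vectors with covariance C."""
    W = (rng.standard_normal((N, m)) + 1j * rng.standard_normal((N, m))) / np.sqrt(2.0)
    return L @ W

Y = draw_clutter(n)                              # secondary data y_1, ..., y_n
a = 0.5
alpha = a / np.sqrt(N)                           # a = alpha * sqrt(N)
y = alpha * p + draw_clutter(1)[:, 0]            # received vector under H1
T = anmf_statistic(y, p, rscm(Y, rho=0.5))       # rho = 0.5 is a placeholder value
```

Repeating the last four lines over many trials, with and without the signal term, would give empirical detection and false alarm rates for a chosen threshold.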

B. Non-Gaussian clutter

In a second experiment, we proceed to investigate the performance of the proposed design in the case where a non-Gaussian clutter is considered. In particular, we consider the case where the clutter is drawn from a K-distribution with zero mean, covariance $C_N$, and shape $\nu = 0.5$ [3]. The covariance matrix has the same form as in (9) with $b = 0.96$ but with $N = 30$ and $n = 60$.

Similar to the Gaussian clutter case, we consider for the sake of comparison the concurrent design based on the regularization parameter derived in the work of Ollila and Tyler in [15, Equation (19)]. We denote by $\hat{\rho}_{\rm ollila}$ and $\hat{r}_{\rm ollila}$ the corresponding regularization coefficient and threshold. Note that, according to our theoretical analysis, the threshold $\hat{r}_{\rm ollila}$ should be set to $\hat{r}_{\rm ollila} = \hat{\sigma}_{N,{\rm RTE}}(\hat{\rho}_{\rm ollila}) \sqrt{-2 \log \eta}$ in order to satisfy the required false alarm probability. The results are depicted in Figure 3. We note that for all SNR ranges, the proposed method achieves a gain over the design based on the regularization coefficient proposed by Ollila et al. We also observe that, similar to the first experiment, the gain increases as $a$ grows but with a lower slope$^6$.

[Fig. 3. ROC curves of ANMF-RTE designs for $a = 0.1, 0.25, 0.5$, $p = a(\theta)$ with $\theta = 20^\circ$, $N = 30$, $n = 60$: K-distributed clutter setting. Axes: false alarm probability (horizontal), detection probability (vertical). Curves: proposed design; design using $\hat{\rho}_{\rm ollila}$, $\hat{r}_{\rm ollila}$ [15].]

In a last experiment, we investigate the impacts of $a$ and the distribution shape $\nu$. Figure 4 represents the detection probability with respect to $a$ when the false alarm probability is fixed to 0.05. We note that for small values of $a$, higher detection probabilities are achieved when the distribution of the clutter is heavy-tailed (small $\nu$), whereas the opposite occurs for large values of $a$. In order to explain this change in behavior, we must recall that heavy-tailed clutters (small $\nu$) are characterized by a higher number of occurrences of $\tau$ in the vicinity of zero and at the same time more frequent realizations of large values of $\tau$. If $a$ is small, the improvement in detection performances achieved by heavy-tailed clutters is attributed to the artificial increase in SNR over realizations of small values of $\tau$. As $a$ increases, the power of the signal of interest is high enough so that the effect of realizations with large values of $\tau$ becomes dominant. The latter, which are more frequent for small values of $\nu$, are characterized by high levels of noises, thereby entailing a degradation of detection performances.

[Fig. 4. Detection probability with respect to $a$, $p = a(\theta)$ with $\theta = 20^\circ$, $N = 30$, $n = 60$, $\nu = 0.1, 0.9, 30$: K-distributed clutter, false alarm probability $= 5\%$. Axes: $a$ (horizontal), detection probability (vertical).]

$^6$ Note that we do not compare with the zero-regularization case as in the first experiment, since, contrary to the Gaussian case, we do not have at our disposal theoretical results allowing the tuning of the threshold to the value that achieves the required false alarm probability.

VI. CONCLUSION

In this paper, we address the setting of the regularization parameter when the RSCM or the RTE are used in the ANMF detector statistic as a replacement of the unknown covariance matrix, thereby yielding the schemes ANMF-RSCM and ANMF-RTE. One major bottleneck toward determining the regularization parameter that optimizes the performances of the ANMF detector is linked to the difficulty to clearly characterize the distribution of the ANMF statistics under the cases of presence or absence of a signal of interest ($H_1$ and $H_0$). In order to deal with this issue, we considered the regime under which the number of samples and their dimensions grow large simultaneously. Based on tools from random matrix theory along with recent asymptotic results on the behaviour of the RTE, we derived the asymptotic distribution of the ANMF detector under hypothesis $H_0$ and $H_1$. The obtained results have allowed us to propose an optimal design of the regularization parameter that maximizes the detection probability while keeping fixed the false alarm probability through an appropriate tuning of the threshold value. Simulation results clearly illustrated the gain of our method over previously proposed empirical settings of the regularization coefficient. One major advantage of our approach is that, contrary to first intuitions, it leads to simple closed-form expressions that can be easily implemented in practice. This is quite surprising given that the handling of the classical regime where $n$ grows to infinity with $N$ fixed has been shown to be delicate. As a matter of fact, it has thus far been considered only for the non-regularized Tyler estimator where intricate expressions in the form of integrals were obtained [40]. Building the bridge between both approaches is an open question that deserves investigation.

VII. ACKNOWLEDGMENTS

Couillet's work is jointly supported by the French ANR DIONISOS project (ANR-12-MONU-OOO3) and the GDR ISIS–GRETSI "Jeunes Chercheurs" Project. Pascal's work has been partially supported by the ICODE institute, research project of the Idex Paris-Saclay.

APPENDIX A
PROOF OF THEOREM 3

The proof of Theorem 3 consists of two steps. First, we study the asymptotic behaviour of the detection probability for fixed $\rho$. Then, by a similar argument to the one considered in [28], we establish the uniformity of the result over the considered set of $\rho$. Assume that the received signal vector $y$ is given by:

\[
y = \frac{a}{\sqrt{N}}\, p + x
\]

with $\|p\|^2 = N$ and let us write $\sqrt{N}\, \hat{T}_N^{\rm RSCM}(\rho)$ as:

\[
\sqrt{N}\, \hat{T}_N^{\rm RSCM}(\rho) = \frac{\left| \frac{1}{\sqrt{N}}\, y^* \hat{R}_N^{-1}(\rho) p \right|}{\sqrt{\frac{1}{N}\, p^* \hat{R}_N^{-1}(\rho) p}\, \sqrt{\frac{1}{N}\, y^* \hat{R}_N^{-1}(\rho) y}}.
\]

A close inspection of the expression of $\sqrt{N}\, \hat{T}_N^{\rm RSCM}(\rho)$ reveals that the fluctuations will be governed by the numerator $\frac{1}{\sqrt{N}}\, y^* \hat{R}_N^{-1}(\rho) p$ since, from classical results of random matrix theory, we know that quantities in the denominator exhibit a deterministic behaviour, being well-approximated by some deterministic quantities. In effect,

\[
\frac{1}{N}\, p^* \hat{R}_N^{-1}(\rho) p - \frac{1}{\rho N}\, p^* Q_N(\rho) p \xrightarrow{\rm a.s.} 0, \tag{11}
\]

while:

\[
\frac{1}{N}\, y^* \hat{R}_N^{-1}(\rho) y - \frac{1}{N \rho}\, {\rm tr}\, C_N Q_N(\rho) \xrightarrow{\rm a.s.} 0. \tag{12}
\]

The first convergence (11) follows from Theorem 1.1 of [41] whereas the second one is obtained by observing that, because of the low-SNR hypothesis:

\[
\frac{1}{N}\, y^* \hat{R}_N^{-1}(\rho) y - \frac{1}{N}\, x^* \hat{R}_N^{-1}(\rho) x \xrightarrow{\rm a.s.} 0
\]

and then using the well-known convergence result [42]:

\[
\frac{1}{N}\, x^* \hat{R}_N^{-1}(\rho) x - \frac{1}{N \rho}\, {\rm tr}\, C_N Q_N(\rho) \xrightarrow{\rm a.s.} 0.
\]

We will now deal with the fluctuations of the numerator. We have:

\[
\frac{1}{\sqrt{N}}\, y^* \hat{R}_N^{-1} p = \frac{a}{N}\, p^* \hat{R}_N^{-1} p + \frac{1}{\sqrt{N}}\, x^* \hat{R}_N^{-1} p.
\]

Arguing in a similar way to that in (11), we know that the quantity $\frac{1}{N}\, a\, p^* \hat{R}_N^{-1} p$ does not fluctuate and converges to:

\[
\frac{1}{N}\, a\, p^* \hat{R}_N^{-1} p - \frac{a}{N \rho}\, p^* Q_N(\rho) p \xrightarrow{\rm a.s.} 0,
\]

while, from [28]:

\[
\left[ \frac{1}{\sqrt{N}} \Re\left( x^* \hat{R}_N^{-1} p \right),\ \frac{1}{\sqrt{N}} \Im\left( x^* \hat{R}_N^{-1} p \right) \right]^T = \sqrt{\frac{p^* C_N Q_N^2(\rho) p}{2 \rho^2 N \left( 1 - c\, m(-\rho)^2 (1-\rho)^2 \frac{1}{N} {\rm tr}\, C_N^2 Q_N^2(\rho) \right)}}\; Z' + o_p(1)
\]

for some $Z' \sim \mathcal{N}(0, I_2)$.

Let $r = \left[ \frac{1}{\sqrt{N}} \Re\left( x^* \hat{R}_N^{-1} p \right),\ \frac{1}{\sqrt{N}} \Im\left( x^* \hat{R}_N^{-1} p \right) \right]^T$. Denote by $\Upsilon_N$ and $\omega_N$ the quantities:

\[
\Upsilon_N = \sqrt{\frac{p^* C_N Q_N^2(\rho) p}{2 \rho^2 N \left( 1 - c\, m(-\rho)^2 (1-\rho)^2 \frac{1}{N} {\rm tr}\, C_N^2 Q_N^2(\rho) \right)}},
\qquad
\omega_N = \frac{a}{N \rho}\, p^* Q_N(\rho) p.
\]

Recall the following distance between probability distributions:

\[
\beta(P, \tilde{P}) = \sup_{\|f\|_{BL} \le 1} \left| \int f\, dP - \int f\, d\tilde{P} \right|,
\]

where $\|f\|_{BL} = \|f\|_{\rm Lip} + \|f\|_\infty$, $\|f\|_{\rm Lip}$ being the Lipschitz norm and $\|\cdot\|_\infty$ the supremum norm [?]. Assume for the moment that $\limsup \Upsilon_N < \infty$ and $\limsup \frac{a}{N \rho}\, p^* Q_N(\rho) p < \infty$. The proof for these statements will be provided later. Then, from Theorem 11.7.1 in [?],

\[
\beta\left( \mathcal{L}\left( \left[ r,\ \frac{1}{N}\, a\, p^* \hat{R}_N^{-1} p \right] \right),\ \mathcal{L}\left( \left[ \Upsilon_N Z',\ \omega_N \right] \right) \right) \to 0,
\]

where $\mathcal{L}(X)$ stands for the probability distribution of $X$. This in particular establishes that the random variable $\left[ r,\ \frac{1}{N}\, a\, p^* \hat{R}_N^{-1} p \right]$ converges uniformly in distribution to $\left[ \Upsilon_N Z',\ \omega_N \right]$. From the uniform continuous mapping Theorem in [?, Theorem 1], we thus prove that $\left| \frac{1}{\sqrt{N}}\, y^* \hat{R}_N^{-1} p \right|$ behaves asymptotically as a Rice random variable with location $\frac{a}{N \rho}\, p^* Q_N(\rho) p$ and scale $\sqrt{\frac{1}{2 \rho^2 N} \frac{p^* C_N Q_N^2(\rho) p}{1 - c\, m(-\rho)^2 (1-\rho)^2 \frac{1}{N} {\rm tr}\, C_N^2 Q_N^2(\rho)}}$. Using this result along with Slutsky Lemma, we conclude that under $H_1$, $\sqrt{N}\, \hat{T}_N^{\rm RSCM}(\rho)$ is also asymptotically equivalent to a Rice random variate but with location $\frac{a \sqrt{p^* Q_N(\rho) p}}{\sqrt{N} \sqrt{\frac{1}{N} {\rm tr}\, C_N Q_N(\rho)}}$ and scale $\sigma_{N,{\rm SCM}}$. We therefore get, for a fixed $\rho$,

\[
P\left[ \hat{T}_N^{\rm SCM} > \frac{r}{\sqrt{N}} \,\Big|\, H_1 \right] - Q_1\left( g_{\rm SCM}(p), \frac{r}{\sigma_{N,{\rm SCM}}} \right) \to 0.
\]

The generalization of this result to uniform convergence across $\rho \in \mathcal{R}_\kappa^{\rm SCM}$ can be derived along the same steps as in [28]. We now provide details about the control of $\limsup \Upsilon_N$ and $\limsup \omega_N$. The fact that $\limsup \omega_N < \infty$ follows directly from the last item in Assumption 1, while the control of $\limsup \Upsilon_N < \infty$ requires one to check that:

\[
\liminf_N \left( 1 - c\, m(-\rho)^2 (1-\rho)^2 \frac{1}{N} {\rm tr}\, C_N^2 Q_N^2(\rho) \right) > 0.
\]


The proof hinges on the observation that this term naturally Proof: The proof relies on the Talor expansion of hN and appears when computing the derivative of m(z) with respect gN in the vicinity of 1, which asserts that for any ρ [0, 1] ∈ to z at z = ρ. Simple calculations reveal that: there exist ξ1 and ξ2 satisfying: − ′ ′′ 2 2 c(1 ρ) − hN (ρ)= hN (1)(ρ 1)+(ρ 1) hN (ξ1) ′ − − ′′ m′(z)= z+ − tr CN QN (z) 2 − N gN (ρ)= gN (1)(ρ 1)+(ρ 1) gN (ξ2)   1 − − 1 − 1 m(z)2(1 ρ)2 tr C2 Q2 (z) . × − − N N N   We therefore have, ′ ′ ′′ ′ It suffices thus to show that m′( ρ) is bounded. As m is a − hN (ρ) hN (1) hN (1)+(ρ 1)hN(ξ1) hN (1) Stieltjes transform of some positive probability measure µ, it lim sup ′ = lim sup ′ − ′′ ′ N gN (ρ) − gN (1) N gN (1)+(ρ 1)gN(ξ2) − gN (1) can be written as: − ′′ ′ ′ ′′ µ(dx) 1 (ρ 1)hN (ξ1)gN (1) (ρ 1)hN (1) gN (ξ2) m′( ρ)= = − ′ ′ − − ′′ − (x+ρ)2 ≤ κ2 gN (1) gN (1)+(ρ 1)gN(ξ2) − Z ′′ ′ ′ ′′ which ends the proof. lim supN hN (ξ1)gN (1) +lim supN h N (ξ1)gN (ξ2) ρ 1 | | ′ | | ≤ | − | 2 lim inf gN (1) APPENDIX B Tending ρ to 1 establishes the desired result. PROOF OF PROPOSITION 4 Obviously functions hˆ(ρ) and gˆ(ρ) satisfiy the assump- ˆ tions of Lemma 13. Applying l’Hopital’s rule and using the For ease of notation, we denote by f(ρ) and f(ρ), the d 1 2 ˆ ˆ differentiation rules RN− (ρ) = RN− (ρ) RN +I and quantities fSCM (ρ) and fSCM (ρ). It is easy to see that f(ρ) dρ − − and f(ρ) converges to an undetermined form as ρ 1. Set d 2 3   dρ RN− (ρ)= 2RN− (ρ) RN +I , we finally prove: hˆ ρ ↑ − b − b b fˆ(ρ) , ( ) with gˆ and hˆ being given by: gˆ(ρ)  N b limblim sup fˆ(ρb) =0. (14) 2 2 ρ 1 c 1 1 N − p RN p gˆ(ρ)=(1 ρ) p∗R− (ρ)p (1 ρ) 1 c+ ρ tr R− (ρ) ↑ ∗ N N N − − − 1 1 a.s. 1 2   Now, using the fact p∗RN p p∗CN p 0 in conjunc- hˆ(ρ)= p∗R− (ρ)p ρp∗R− (ρ)p N − N b −→ N b− N b tion to the last item in Assumption 1, we get: b The handling of the values of ρ approaching 1 can be per- N . . b b lim lim sup fˆ(ρ) a s 0. 
(15) formed using the l’Hopital’s rule. ρ 1 − p C p −→ ↑ N ∗ N The idea of the proof is to treat seperately the values of ρ in On the other hand, a careful analysis of the behaviour of f(ρ) the interval [κ, 1 κ] and those in [1 κ, 1] for some κ small enough. In order− to allow for a setting− of κ that is independent near 1 reveals similarly that: from N, we need to prove that: N lim lim sup f(ρ) 0 (16) ′ ρ 1 − p C p → ↑ N ∗ N hN (1) lim lim sup fˆ(ρ) ′ =0. (13) Combining (15) with (16), we finally obtain: ρ 1 N − gN (1) ↑ lim lim sup fˆ(ρ) f(ρ) 0 To this end, a uniform variant of the l’Hopital’s rule is ρ 1 N − → ↑ essential. This variant is stated in the following Lemma: , It then suffices to prove Proposition 4 on κ [κ, 1 ℓ]. hN (ρ) R − Lemma 13. Let fN (ρ) = with hN and gN being de- To this end, we need to recall the following relations satisfied gN (ρ) fined in the interval ρ [0, 1]. Assume that hN (1) = gN (1) = by mN ( ρ): ∈ − dgN dgN 1 c c 1 0 while lim infN dρ > 0, lim supN dρ < + ρ=1 ρ=1 ∞ mN ( ρ)= − + tr QN (ρ) − ρ ρ N dhN and lim supN dρ < + . Assume also that the second ρ=1 ∞ and derivatives of h and g are uniformly bounded in N, that N N 1 is: 1 − mN ( ρ)= ρ+c(1 ρ) tr CN QN (ρ) − − N ′′   sup lim sup hN (ρ) < + ρ [0,1] N ∞ Combining these relations, we therefore get: ∈ ′′ ρ 1 Q ρ sup lim sup g (ρ) < + 1 1 N tr N ( ) N tr CN QN (ρ)= − ρ [0,1] N ∞ N (1 c)(1 ρ)(1 c+ c tr Q (ρ)) ∈ N  N − − − Then, The result thus follows by using Proposition 2 and noticing, ′ in the same way as in [28], that: hN (ρ) hN (1) lim lim sup ′ 0. ρ 1 g (ρ) g → N N − N (1) → 1 1 1 a.s. sup tr QN tr R− (ρ) 0, N − ρ N −→ ρ [κ,1 ℓ] ∈ − b


\[
\sup_{\rho \in [\kappa, 1-\ell]} \left| \frac{1}{\rho N}\, p^* Q_N(\rho) p - \frac{1}{N}\, p^* \hat{R}_N^{-1}(\rho) p \right| \xrightarrow{\rm a.s.} 0
\]

and

\[
\sup_{\rho \in [\kappa, 1-\ell]} \left| \frac{\frac{1}{N}\, p^* C_N Q_N^2(\rho) p}{1 - c\, m(-\rho)^2 (1-\rho)^2 \frac{1}{N} {\rm tr}\, C_N^2 Q_N^2(\rho)} \right. \tag{17}
\]
\[
\left. -\ \frac{\frac{1}{N} \left( p^* \hat{R}_N^{-1} p - \rho\, p^* \hat{R}_N^{-2} p \right)}{(1-\rho) \left( 1 - c + c\, \frac{\rho}{N} {\rm tr}\, \hat{R}_N^{-1}(\rho) \right)} \right| \xrightarrow{\rm a.s.} 0. \tag{18}
\]

REFERENCES

[1] J. Ward, "Space-time Adaptive Processing for Airborne Radar," Tech. Rep., Lincoln Lab, MIT, Dec. 1994.
[2] R. Klemm, Principles of Space-Time Adaptive Processing, IET, 2002.
[3] E. Ollila, D. E. Tyler, V. Koivunen, and H. V. Poor, "Complex Elliptically Symmetric Distributions: Survey, New Results and Applications," IEEE Trans. Signal Process., vol. 60, no. 11, Nov. 2012.
[4] M. Mahot, F. Pascal, P. Forster, and J. P. Ovarlez, "Asymptotic Properties of Robust Complex Covariance Matrix Estimates," IEEE Trans. Signal Process., vol. 61, no. 13, pp. 3348–3356, July 2013.
[5] Y. Abramovich, "Controlled Method for Adaptive Optimization of Filters Using the Criterion of Maximum SNR," Radio Eng. Electron. Phys, vol. 26, no. 3, pp. 87–95, 1981.
[6] B. D. Carlson, "Covariance Matrix Estimation Errors and Diagonal Loading in Adaptive Arrays," IEEE Trans. Aerosp. Electron. Syst., vol. 24, no. 4, pp. 397–401, 1988.
[7] D. Kelker, "Distribution Theory of Spherical Distributions and a Location Scale Parameter Generalization," Sankhyā: The Indian Journal of Statistics, Series A, vol. 32, no. 4, pp. 419–430, Dec. 1970.
[8] P. J. Huber, "Robust estimation of a location parameter," The Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73–101, 1964.
[9] P. J. Huber, "The 1972 Wald lecture. Robust statistics: A review," Annals of Mathematical Statistics, vol. 43, no. 4, pp. 1041–1067, August 1972.
[10] R. A. Maronna, "Robust M-estimators of multivariate location and scatter," Annals of Statistics, vol. 4, no. 1, pp. 51–67, January 1976.
[11] F. Pascal, Y. Chitour, J. P. Ovarlez, P. Forster, and P. Larzabal, "Covariance structure maximum-likelihood estimates in compound Gaussian noise: existence and algorithm analysis," IEEE Trans. Signal Process., vol. 56, no. 1, pp. 34–48, Jan. 2008.
[12] D. E. Tyler, "A Distribution-Free M-estimator of Multivariate Scatter," The Annals of Statistics, vol. 15, no. 1, pp. 234–251, 1987.
[13] Y. Chitour and F. Pascal, "Exact maximum likelihood estimates for SIRV covariance matrix: existence and algorithm analysis," IEEE Trans. Signal Process., vol. 56, no. 10, pp. 4563–4573, Oct. 2008.
[14] F. Pascal, Y. Chitour, and Y. Quek, "Generalized Robust Shrinkage Estimator and Its Application to STAP Detection Problem," IEEE Trans. Signal Process., vol. 62, no. 21, pp. 5640–5651, 2014.
[15] E. Ollila and D. E. Tyler, "Regularized M-estimators of Scatter Matrix," IEEE Trans. Signal Process., vol. 62, no. 22, Nov. 2014.
[16] Y. Sun, P. Babu, and D. P. Palomar, "Regularized Tyler's Scatter Estimator: Existence, Uniqueness and Algorithms," IEEE Trans. Signal Process., vol. 62, no. 19, pp. 5143–5156, Oct. 2014.
[17] K. D. Ward, "Compound representation of high resolution sea clutter," Electronics Letters, vol. 17, no. 16, pp. 561–563, August 1981.
[18] S. Watts, "Radar detection prediction in sea clutter using the compound k-distribution model," IEE Proceeding, Part. F, vol. 132, no. 7, pp. 613–620, December 1985.
[19] T. Nohara and S. Haykin, "Canada east coast trials and the k-distribution," IEE Proceeding, Part. F, vol. 138, no. 2, pp. 82–88, 1991.
[20] J. B. Billingsley, "Ground clutter measurements for surface-sited radar," Tech. Rep. 780, MIT, February 1993.
[21] O. Ledoit and M. Wolf, "A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices," Journal of Multivariate Analysis, vol. 88, no. 2, pp. 365–411, 2004.
[22] Y. Chen, A. Wiesel, Y. C. Eldar, and A. O. Hero, "Shrinkage Algorithms for MMSE Covariance Estimation," IEEE Trans. Signal Process., vol. 58, no. 10, Oct. 2010.
[23] Y. Chen, A. Wiesel, and A. O. Hero, "Robust Shrinkage Estimation of High Dimensional Covariance Matrices," IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4097–4107, 2011.
[24] E. Conte, M. Lops, and G. Ricci, "Asymptotically Optimum Radar Detection in Compound-Gaussian Clutter," IEEE Trans. Aerosp. Electron. Syst., pp. 617–625, Apr. 1995.
[25] J. Liu, Z-J. Zhang, Y. Yang, and H. Liu, "A CFAR Adaptive Subspace Detector for First-Order or Second-Order Gaussian Signals Based on a Single Observation," IEEE Trans. Signal Process., vol. 59, no. 11, pp. 5126–5140, 2011.
[26] S. Kraut and L. L. Scharf, "The CFAR adaptive subspace detector is a scale-invariant GLRT," IEEE Trans. Signal Process., vol. 47, no. 9, pp. 2538–2541, 1999.
[27] F. Pascal, J-P. Ovarlez, P. Forster, and P. Larzabal, "Constant false alarm rate detection in spherically invariant random processes," in Proc. of the European Signal Processing Conf., EUSIPCO-04, Vienna, Sept. 2004, pp. 2143–2146.
[28] R. Couillet, A. Kammoun, and F. Pascal, "Second Order Statistics of Robust Estimators of Scatter. Application to GLRT Detection for Elliptical Signals," Submitted to Journal of Multivariate Analysis, 2014, http://arxiv.org/abs/1410.0817.
[29] R. Couillet and M. McKay, "Large Dimensional Analysis and Optimization of Robust Shrinkage Covariance Matrix Estimators," Journal of Multivariate Analysis, vol. 131, pp. 99–120, 2014.
[30] L. L. Scharf and B. Friedlander, "Matched Subspace Detectors," IEEE Trans. Signal Process., vol. 42, no. 8, pp. 2146–2157, 1994.
[31] E. Conte, A. D. Maio, and G. Ricci, "Recursive Estimation of the Covariance Matrix of a Compound-Gaussian Process and Its Application to Adaptive CFAR Detection," IEEE Trans. Signal Process., vol. 50, no. 8, Aug. 2002.
[32] F. Gini and M. Greco, "Covariance Matrix Estimation for CFAR Detection in Correlated Heavy-Tailed Clutter," Signal Processing, vol. 82, no. 12, pp. 1847–1859, Dec. 2002.
[33] I. Reed, J. Mallett, and L. Brennan, "Rapid Convergence Rate in Adaptive Arrays," IEEE Trans. Aerosp. Electron. Syst., vol. 10, no. 6, pp. 853–863, Nov. 1974.
[34] F. Pascal, P. Forster, J. P. Ovarlez, and P. Larzabal, "Performance Analysis of Covariance Matrix Estimate in Impulsive Noise," IEEE Trans. Signal Process., vol. 56, no. 6, 2008.
[35] S. Wagner, R. Couillet, M. Debbah, and D. T. M. Slock, "Large system analysis of linear precoding in MISO broadcast channels with limited feedback," IEEE Trans. Inf. Theory, vol. 58, no. 7, pp. 4509–4537, July 2012.
[36] J. W. Silverstein and Z. D. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices," Journal of Multivariate Analysis, vol. 54, no. 2, pp. 175–192, 1995.
[37] R. Couillet, F. Pascal, and J. W. Silverstein, "The random matrix regime of Maronna's M-estimator with elliptically distributed samples," (to appear in) Elsevier JMVA, 2014, http://arxiv.org/abs/1311.7034.
[38] R. Couillet, F. Pascal, and J. W. Silverstein, "Robust Estimates of Covariance Matrices in the Large Dimensional Regime," IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 7269–7278, Oct. 2014.
[39] A. Kammoun, M. Kharouf, W. Hachem, and J. Najim, "A Central Limit Theorem for the SINR at the LMMSE Estimator Output for Large-Dimensional Signals," IEEE Trans. Inf. Theory, vol. 55, no. 11, pp. 5048–5063, Nov. 2009.
[40] F. Pascal and J. P. Ovarlez, "Asymptotic Properties of the Robust ANMF," ICASSP, 2015.
[41] W. Hachem, P. Loubaton, J. Najim, and P. Vallet, "On bilinear forms based on the resolvent of large random matrices," Annales de l'Institut Henri Poincaré - Probabilités et Statistiques, vol. 49, no. 1, pp. 36–63, 2013.
[42] W. Hachem, Ph. Loubaton, and J. Najim, "Deterministic Equivalents for Certain Functionals of Large Random Matrices," Annals of Applied Probability, vol. 17, no. 3, pp. 875–930, 2007.