arXiv:1209.0897v2 [stat.AP] 6 Nov 2012

Asymptotic properties of robust complex covariance matrix estimates

Mélanie Mahot, Student Member, IEEE, Frédéric Pascal, Member, IEEE, Philippe Forster, Member, IEEE, and Jean-Philippe Ovarlez, Member, IEEE

M. Mahot is with SONDRA, Supelec, Plateau du Moulon, 3 rue Joliot-Curie, F-91190 Gif-sur-Yvette, France (e-mail: [email protected]).
F. Pascal is with SONDRA, Supelec, Plateau du Moulon, 3 rue Joliot-Curie, F-91190 Gif-sur-Yvette, France (e-mail: [email protected]).
P. Forster is with SATIE, ENS Cachan, CNRS, UniverSud, 61, Av du President Wilson, F-94230 Cachan, France (e-mail: [email protected]).
J.-P. Ovarlez is with ONERA, DEMR/TSI, Chemin de la Hunière, F-91120 Palaiseau, France (e-mail: [email protected]).

Abstract—In many statistical signal processing applications, the estimation of nuisance parameters and parameters of interest is strongly linked to the resulting performance. Generally, these applications deal with complex data. This paper focuses on covariance matrix estimation problems in non-Gaussian environments and, particularly, on M-estimators in the context of elliptical distributions. Firstly, this paper extends to the complex case the results of Tyler in [1]. More precisely, the asymptotic distribution of these estimators, as well as the asymptotic distribution of any homogeneous function of degree 0 of the M-estimators, are derived. On the other hand, we show the improvement brought by such results on two applications: DOA (direction of arrival) estimation using the MUSIC (MUltiple SIgnal Classification) algorithm and adaptive radar detection based on the ANMF (Adaptive Normalized Matched Filter) test.

Index Terms—Covariance matrix estimation, robust estimation, elliptical distributions, complex M-estimators.

I. INTRODUCTION

Many signal processing applications require the knowledge of the data covariance matrix. The most well-known and most often used estimator is the Sample Covariance Matrix (SCM), which is the Maximum Likelihood (ML) estimator for Gaussian data. However, the SCM suffers from major drawbacks. When the data turn out to be non-Gaussian, as is for instance the case in adaptive radar and sonar processing [2], the performance of the SCM can be strongly degraded, in particular in the presence of outliers or impulsive noise, as shown for instance in [3]. To overcome these problems, there has been an intense research activity in robust estimation theory in the statistical community these last decades [4], [5], [6]. Among these solutions, the so-called M-estimators, originally introduced by the seminal work of Huber [7] and investigated in the landmark paper of Maronna [8], have imposed themselves as an appealing alternative to the classical SCM. They have been introduced within the framework of elliptical distributions. Elliptical distributions, originally introduced by Kelker in [9], encompass a large number of well-known distributions, as for instance the Gaussian distribution or the multivariate Student (or t) distribution. They may also be used to model heavy-tailed distributions by means of the K-distribution, met for instance in adaptive radar with impulsive clutter [10], [11], [12].

M-estimators of the covariance matrix are however seldom used in the signal processing community. Only a limited case, Tyler's estimator [13], also called the Fixed Point Estimator [14], has been widely used as an alternative to the SCM for radar applications. Concerning the other M-estimators, notable exceptions are the recent papers [15], [16], [17], [18], [19] by Ollila, who advocates their use in several applications such as array processing. The case of large datasets, where the dimension of the data is of the same order as the dimension of the sample, is studied in [20]. One possible reason for this lack of interest is that the statistical properties of M-estimators are not well-known in the signal processing community, as opposed to the Wishart distribution of the SCM in the Gaussian context. They have been studied by Tyler in the real case [21]. However, data are usually complex in signal processing applications, and the purpose of this paper is to derive the asymptotic distribution of M-estimators in the framework of complex elliptically distributed data. This result is also provided in [15], but without proof. We will also show in the complex case a property initially derived by Tyler in [1]: we extend to complex elliptical distributions the asymptotic distribution of any positive homogeneous functional of degree 0 of M-estimates. This result is useful for signal processing applications which only need the covariance matrix up to a scale factor, for example Direction-of-Arrival (DOA) estimation or adaptive radar detection. For such applications, the estimated parameter has the same mean square error when estimated with an M-estimator as when estimated with the SCM with a few more data (depending on a factor σ1) in the Gaussian context. Moreover, when the context is non-Gaussian or contains outliers, the performance obtained with the SCM is unreliable and possibly completely damaged, as shown for instance in [3] with the MUSIC method. We also illustrate this effect using the Adaptive Normalized Matched Filter (ANMF) test introduced by Kraut and Scharf [22], [23], and also illustrated by Ollila in [16] for MVDR beamforming.

This paper is organized as follows. Section II introduces the required background and Section III presents known properties about the asymptotic distribution of real M-estimators. Then, Section IV provides our contribution: the asymptotic properties of complex M-estimators. Section V gives simulations validating the theoretical analysis and, eventually, Section VI concludes this work.

Vectors (resp. matrices) are denoted by bold-faced lowercase (resp. uppercase) letters. ∗, T and H respectively represent the conjugate, the transpose and the Hermitian operator. ∼ means "distributed as", =d stands for "shares the same distribution as", →d denotes convergence in distribution and ⊗ denotes the Kronecker product. vec is the operator which transforms an m × n matrix into a vector of length mn, concatenating its n columns into a single column. Moreover, Im is the m × m identity matrix, 0m,p is the m × p matrix of zeros, J_{m²} = Σ_i Jii ⊗ Jii where Jii is the m × m matrix with a one in position (i, i) and zeros elsewhere, and K is the commutation matrix, which transforms vec(A) into vec(A^T). Eventually, Im(y) represents the imaginary part of the complex vector y and Re(y) its real part.

II. BACKGROUND

A. Elliptical symmetric distributions

Let z be an m-dimensional real (resp. complex circular) random vector. The vector z has a real (resp. complex) elliptical symmetric distribution if its probability density function (PDF) can be written as

  gz(z) = |Λ|^{-1/2} hz((z − µ)' Λ^{-1} (z − µ)),   (1)

where hz : [0, ∞) → [0, ∞) is any function such that (1) defines a PDF, µ is the statistical mean and Λ is a scatter matrix. The scatter matrix Λ reflects the structure of the covariance matrix of z, i.e. the covariance matrix is equal to Λ up to a scale factor. This real (resp. complex) elliptically symmetric distribution will be denoted by E(µ, Λ, hz) (resp. CE(µ, Λ, hz)). One can notice that the Gaussian distribution is a particular case of elliptical distributions. A survey on complex elliptical distributions can be found in [15].

In this paper, we will assume that µ = 0m,1. Without loss of generality, the scatter matrix will be taken to be equal to the covariance matrix when the latter exists. Indeed, when the second moment of the distribution is finite, the function hz in (1) can always be defined such that this equality holds. If the distribution of the data has a non-finite second-order moment, then we will simply consider the scatter matrix estimator.

B. Generalized Complex Normal distribution

As written before, the Gaussian distribution is a particular case of elliptical symmetric distributions. However, in the complex framework, this is true only for circular Gaussian random vectors. We now present the generalization of this distribution as presented by Van den Bos in [24].

Let z = x + jy be an m-dimensional complex random vector. The vector z is said to have a generalized complex normal distribution if and only if v = (x^T, y^T)^T ∈ R^{2m} has a normal distribution. This generalized complex normal distribution will be denoted by GCN(µ, Σ, Ω), where µ is the mean, Σ = E[(z − µ)(z − µ)^H] the covariance matrix and Ω = E[(z − µ)(z − µ)^T] the pseudo-covariance matrix.

C. M-estimators of the scatter matrix

Let (z1, ..., zN) be an N-sample of m-dimensional real (resp. complex circular) independent vectors with zi ∼ E(0m,1, Λ, hz) (resp. zi ∼ CE(0m,1, Λ, hz)), i = 1, ..., N. The real (resp. complex) M-estimator of Λ is defined as the solution of the following equation:

  M̂ = (1/N) Σ_{n=1}^{N} u(z_n' M̂^{-1} z_n) z_n z_n',   (2)

where the symbol ' stands for T in the real case and for H in the complex one.

M-estimators have first been studied in the real case, defined as solutions of (2) with real samples. Existence and uniqueness of the solution of (2) have been shown in the real case, provided the function u satisfies a set of general assumptions stated by Maronna in [8]. These conditions have been extended to the complex case by Ollila in [17]. They are recalled here below in the case where µ = 0m,1:

- u is non-negative, non-increasing and continuous on [0, ∞).
- Let ψ(s) = s u(s) and K = sup_{s≥0} ψ(s); then m < K < ∞.
- There exists a > 0 such that for every hyperplane S with dim(S) ≤ m − 1, PN(S) ≤ 1 − m/K − a. This assumption can be strongly relaxed, as shown in [25], [26].

Let us now consider the following equation, which is, roughly speaking, the limit of (2) when N tends to infinity:

  M = E[u(z' M^{-1} z) z z'],   (3)

where z ∼ E(0m,1, Λ, hz) (resp. z ∼ CE(0m,1, Λ, hz)) and where the symbol ' stands for T in the real case and for H in the complex one.

Then, under the above conditions, it has been shown for the real case in [26], [8] that:
- Equation (3) (resp. (2)) admits a unique solution M (resp. M̂) and

  M = σ^{-1} Λ,   (4)

  where σ is the solution of E[ψ(σ ||t||²)] = m with t ∼ E(0m,1, Im, hz) (resp. t ∼ CE(0m,1, Im, hz)), see e.g. [6].
- A simple iterative procedure provides M̂.
- M̂ is a consistent estimate of M.

The extension of these results to the complex case has been done in [15].

D. Wishart distribution

The real (resp. complex) Wishart distribution W(N, Λ) (resp. CW(N, Λ)) is the distribution of Σ_{n=1}^{N} z_n z_n', where the z_n are real (resp. complex circular), independent identically distributed (i.i.d.) Gaussian vectors with zero mean and covariance matrix Λ.

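The "simple iterative procedure" mentioned above can be sketched numerically. The following Python sketch is our own illustration: the Student-t-type weight u(s) = (m + ν)/(ν + s) and all names are assumptions chosen for the example, not taken from this paper.

```python
import numpy as np

def m_estimator(Z, u, n_iter=200, tol=1e-12):
    """Fixed-point iteration for the complex M-estimator of equation (2):
    M = (1/N) sum_n u(z_n^H M^{-1} z_n) z_n z_n^H."""
    N, m = Z.shape
    M = np.eye(m, dtype=complex)  # initial guess
    for _ in range(n_iter):
        Minv = np.linalg.inv(M)
        # quadratic forms q_n = z_n^H M^{-1} z_n, for all samples at once
        q = np.real(np.einsum('ni,ij,nj->n', Z.conj(), Minv, Z))
        M_new = np.mean(u(q)[:, None, None] * (Z[:, :, None] * Z.conj()[:, None, :]),
                        axis=0)
        if np.linalg.norm(M_new - M) < tol * np.linalg.norm(M):
            return M_new
        M = M_new
    return M

# Hypothetical example weight: Student-t ML weight u(s) = (m + nu)/(nu + s).
rng = np.random.default_rng(0)
m, N, nu = 3, 1000, 2.0
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Lam = A @ A.conj().T + m * np.eye(m)  # true scatter matrix (Hermitian PD)
L = np.linalg.cholesky(Lam)
Z = (L @ ((rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N)))
          / np.sqrt(2))).T
M_hat = m_estimator(Z, lambda s: (m + nu) / (nu + s))
```

The returned M_hat is Hermitian and satisfies the fixed-point equation (2) for this weight; in accordance with (4), it estimates the scatter of the data only up to the scale factor σ.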
Let WN = N^{-1} Σ_{n=1}^{N} z_n z_n' be the related SCM, which will also be referred to as a Wishart matrix. The asymptotic distribution of the Wishart matrix WN is (see e.g. [27])

  √N vec(WN − Λ) →d N(0_{m²,1}, (Λ ⊗ Λ)(I_{m²} + K)) in the real case,
  √N vec(WN − Λ) →d GCN(0_{m²,1}, Λ^T ⊗ Λ, (Λ^T ⊗ Λ)K) in the complex case.   (5)

We now introduce the asymptotic properties of real M-estimators, since they are used as a basis for the extension to the complex case.

III. REAL M-ESTIMATORS PROPERTIES

A. Asymptotic distribution of the real M-estimators

Let M̂ be a real M-estimator satisfying Maronna's conditions [8], recalled in Section II-C. The asymptotic distribution of M̂ is given by Tyler in [21]:

  √N vec(M̂ − M) →d N(0_{m²,1}, Π),   (6)

where Π = σ1 (I_{m²} + K)(M ⊗ M) + σ2 vec(M) vec(M)^T, and σ1 and σ2 are given by ([21]):

  σ1 = a1 (m + 2)² (2a2 + m)^{-2},
  σ2 = a2^{-2} [ (a1 − 1) − 2a1 (a2 − 1) [m + (m + 4) a2] (2a2 + m)^{-2} ],   (7)

with

  a1 = [m(m + 2)]^{-1} E[ψ²(σ ||t||²)],
  a2 = m^{-1} E[σ ||t||² ψ'(σ ||t||²)],

and σ given in equation (4).

B. An important property of real M-estimators

Let V be a fixed symmetric positive-definite matrix and VN a sequence of symmetric positive-definite random matrices of order m which satisfies

  √N vec(VN − V) →d N(0_{m²,1}, S),   (8)

where S = µ1 (I_{m²} + K)(V ⊗ V) + µ2 vec(V) vec(V)^T, and µ1 and µ2 are any real numbers such that S is a positive matrix. Let H(V) be an r-dimensional multivariate function on the set of m × m positive-definite symmetric matrices, with continuous first partial derivatives and such that H(V) = H(αV) for all α > 0. Then, under conditions (8), Tyler has shown in [1], Theorem 1, that

  √N (H(VN) − H(V)) →d N(0_{r,1}, 2 µ1 H'(V)(V ⊗ V) H'(V)^T),   (9)

where H'(V) = (dH(V)/dvec(V)) (1/2)(I_{m²} + J_{m²}).

By noticing that, in a Gaussian context, the SCM satisfies (8) with µ1 = 1 and µ2 = 0 (equation (5)), and that real M-estimators verify µ1 = σ1 and µ2 = σ2 (equation (6)), Tyler's theorem shows that √N (H(WN) − H(Λ)) and √(N/σ1) (H(M̂) − H(Λ)) share the same asymptotic distribution.

In practice, H(.) may be a function which associates a parameter of interest to a covariance matrix. This scale-invariant property has also been exploited in [?]. The concerned signal processing applications are those in which multiplying the covariance matrix by a positive scalar does not change the result. This is for instance the case for the MUSIC method, in which the estimated parameters are the signal DOAs. Another example is given by adaptive radar processing, in which the parameter is the ANMF test statistic [22], [23]. Here, H is defined by

  H(M̂) = |p^H M̂^{-1} y|² / ((p^H M̂^{-1} p)(y^H M̂^{-1} y)).

The aim of the next section is to extend those results to the complex case, which is the frequently met framework for most signal processing applications.

IV. MAIN RESULTS IN THE COMPLEX CASE

A. Asymptotic distribution of the complex M-estimator

Let (z1, ..., zN) be an N-sample of m-dimensional complex independent vectors with z_n ∼ CE(0m,1, Λ, hz), n = 1, ..., N. We consider the complex M-estimator M̂_C which verifies equation (2), and we denote by M_C the solution of (3).

Theorem IV.1 The asymptotic distribution of M̂_C is given by

  √N vec(M̂_C − M_C) →d GCN(0_{m²,1}, Σ, Ω),   (10)

where Σ and Ω are defined by

  Σ = σ1 (M_C^T ⊗ M_C) + σ2 vec(M_C) vec(M_C)^H,
  Ω = σ1 (M_C^T ⊗ M_C) K + σ2 vec(M_C) vec(M_C)^T,   (11)

with

  σ1 = a1 (m + 1)² (a2 + m)^{-2},
  σ2 = a2^{-2} [ (a1 − 1) − 2a1 (a2 − 1) [2m + (2m + 4) a2] (2a2 + 2m)^{-2} ],   (12)

and

  a1 = [m(m + 1)]^{-1} E[ψ²(σ ||t||²)],
  a2 = m^{-1} E[σ ||t||² ψ'(σ ||t||²)],

where σ is the solution of E[ψ(σ ||t||²)] = m, with t ∼ CE(0m,1, Im, hz).

This result is also given in [15], with other assumptions, but without proof.

B. Proof of Theorem IV.1

1) Notations: Let us first introduce the following linear one-to-one transformation of a Hermitian m × m matrix A into a real symmetric 2m × 2m matrix:

  f(A) = (1/2) [ Re(A)  −Im(A) ; Im(A)  Re(A) ].   (13)

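The transformation (13) and the identities stated for it below can be checked numerically. A small sketch (the helper names f and g are ours; g^T = (Im, −j Im) is the matrix appearing in the inverse transformation):

```python
import numpy as np

def f(A):
    """Real symmetric 2m x 2m representation (13) of a Hermitian matrix A."""
    return 0.5 * np.block([[A.real, -A.imag],
                           [A.imag,  A.real]])

rng = np.random.default_rng(1)
m = 4
B = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
A = B @ B.conj().T + m * np.eye(m)                # Hermitian positive-definite
z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
u = np.concatenate([z.real, z.imag])              # u = (Re(z)^T, Im(z)^T)^T
v = np.concatenate([-z.imag, z.real])             # v = (-Im(z)^T, Re(z)^T)^T
g = np.concatenate([np.eye(m), -1j * np.eye(m)])  # g^T = (I_m, -j I_m)

# f(A^{-1}) = (1/4) f(A)^{-1}
ok_inv = np.allclose(f(np.linalg.inv(A)), 0.25 * np.linalg.inv(f(A)))
# z^H A^{-1} z = (1/2) u^T f(A)^{-1} u = (1/2) v^T f(A)^{-1} v
quad = np.real(z.conj() @ np.linalg.inv(A) @ z)
ok_quad = (np.isclose(quad, 0.5 * u @ np.linalg.inv(f(A)) @ u)
           and np.isclose(quad, 0.5 * v @ np.linalg.inv(f(A)) @ v))
# inverse transformation A = g^H f(A) g
ok_back = np.allclose(A, g.conj().T @ f(A) @ g)
```

All three checks hold for any Hermitian positive-definite A, which is exactly what the proof below relies on.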
The inverse transformation is given by A = g^H f(A) g, where g^T = (Im, −j Im). The function f has some useful properties. Let u_n and v_n be the following 2m-vectors:

  u_n = (Re(z_n)^T, Im(z_n)^T)^T,
  v_n = (−Im(z_n)^T, Re(z_n)^T)^T,   (14)

which are both distributed according to E(0_{2m,1}, Λ_R, gz), where Λ_R = f(Λ). Then, it may be shown that

  f(A^{-1}) = (1/4) f(A)^{-1},
  f(z_n z_n^H) = (1/2)(u_n u_n^T + v_n v_n^T),
  z_n^H A^{-1} z_n = (1/2) u_n^T f(A)^{-1} u_n = (1/2) v_n^T f(A)^{-1} v_n.

Let Tr = [ 0_{m,m}  −Im ; Im  0_{m,m} ]; one has v_n = Tr u_n and u_n = −Tr v_n.

Let us also introduce

  M_R = f(M_C),  M̂_R = f(M̂_C).   (15)

It is easy to show that equation (2), defining the complex M-estimator M̂_C, is equivalent to the following equation involving M̂_R:

  M̂_R = (1/(2N)) Σ_{n=1}^{N} [ u_r(u_n^T M̂_R^{-1} u_n) u_n u_n^T + u_r(v_n^T M̂_R^{-1} v_n) v_n v_n^T ],   (16)

where u_r(s) = u(s/2). Roughly speaking, equation (16) defines a real M-estimator involving the 2N real samples u_n and v_n.

Let M̂_u and M̂_v be respectively the two M-estimators defined by

  M̂_u = (1/N) Σ_{n=1}^{N} u_r(u_n^T M̂_u^{-1} u_n) u_n u_n^T,
  M̂_v = (1/N) Σ_{n=1}^{N} u_r(v_n^T M̂_v^{-1} v_n) v_n v_n^T,   (17)

and let M_u, M_v be the associated solutions of

  M_u = E[u_r(u_1^T M_u^{-1} u_1) u_1 u_1^T],
  M_v = E[u_r(v_1^T M_v^{-1} v_1) v_1 v_1^T].

By applying Tr to equation (17), one obtains

  M̂_v = Tr M̂_u Tr^T.   (18)

Moreover, since v_n has the same distribution as u_n,

  M_u = M_v = Tr M_u Tr^T.   (19)

2) An intermediate result:

Lemma IV.1 M̂_R and (1/2)(M̂_u + M̂_v) have the same Gaussian asymptotic distribution.

Proof: See Appendix A.

3) End of proof of Theorem IV.1: By using equation (15) and the inverse of f, one obtains M̂_C = g^H M̂_R g. From Lemma IV.1, vec(M̂_R) has an asymptotically normal distribution. It follows that vec(M̂_C) has an asymptotically generalized complex normal distribution. Given the property vec(ABC) = (C^T ⊗ A) vec(B), where A, B, C are three matrices, and using the fact that M̂_C = g^H M̂_R g, one has

  Σ = N E[vec(M̂_C − M_C) vec(M̂_C − M_C)^H]
    = (g^T ⊗ g^H) E[N vec(M̂_R − M_R) vec(M̂_R − M_R)^H] (g^T ⊗ g^H)^H.   (20)

Using Lemma IV.1 and the equalities (18) and (19), equation (20) gives

  Σ = (g^T ⊗ g^H) N E[ vec((1/2)(M̂_u + Tr M̂_u Tr^T) − M_u) vec((1/2)(M̂_u + Tr M̂_u Tr^T) − M_u)^H ] (g^T ⊗ g^H)^H
    = (1/4)(g^T ⊗ g^H)(I_{4m²} + Tr ⊗ Tr) Π_u (I_{4m²} + Tr ⊗ Tr)^H (g^T ⊗ g^H)^H
    = (g^T ⊗ g^H) Π_u (g^T ⊗ g^H)^H,   (21)

by using Π_u, the asymptotic covariance of M̂_u, and the equalities g^T Tr = −j g^T and g^H Tr = j g^H.

Using the expression given in (6), and taking into account that the u_n are 2m-dimensional vectors, we have

  Π_u = σ1 (I_{4m²} + K)(M_u ⊗ M_u) + σ2 vec(M_u) vec(M_u)^T,   (22)

where σ1 and σ2 will be specified later.

A consequence of Lemma IV.1 is that M_R = M_u. Indeed, from the definition of M_R, one has

  M_R = (1/2) E[u_r(u^T M_R^{-1} u) u u^T] + (1/2) E[u_r(v^T M_R^{-1} v) v v^T].

The first term of the right-hand side corresponds to the definition of M_u while the second one corresponds to that of M_v. Then, as M_u = M_v, one has M_R = M_u.

Therefore, M_C = g^H M_u g, which leads to

  Σ = σ1 (M_C^T ⊗ M_C) + σ2 vec(M_C) vec(M_C)^H.   (23)

Now let us turn to the σ1 and σ2 coefficients. Using (6), one has

  σ1 = a1 (2m + 2)² (2a2 + 2m)^{-2},
  σ2 = a2^{-2} [ (a1 − 1) − 2a1 (a2 − 1) [2m + (2m + 4) a2] (2a2 + 2m)^{-2} ],
  a1 = [2m(2m + 2)]^{-1} E[ψ_r²(σ ||s||²)],
  a2 = (2m)^{-1} E[σ ||s||² ψ_r'(σ ||s||²)],   (24)

where ψ_r(s) = s u_r(s), s ∼ E(0_{2m,1}, I_{2m}, hz) and σ is the solution of

  E[ψ_r(σ ||s||²)] = 2m.   (25)

Since ψ_r(s) = 2 ψ(s/2), equation (25) is equivalent to

  E[ψ((σ/2) ||s||²)] = m.   (26)

Moreover, let t ∼ CE(0m,1, Im, hz). Then ||t||² has the same distribution as ||s||²/2, so that (25) and (26) are also equivalent to

  E[ψ(σ ||t||²)] = m.   (27)

We finally obtain the expression of Σ.

a) Asymptotic pseudo-covariance matrix: Ω is defined as

  Ω = N E[vec(M̂_C − M_C) vec(M̂_C − M_C)^T].   (28)

Using the commutation matrix K, one has

  K vec(M̂_C − M_C) = vec(M̂_C^T − M_C^T) = vec(M̂_C^∗ − M_C^∗),   (29)

since M_C is Hermitian. Thus, one can write

  vec(M̂_C − M_C)^T = [vec(M̂_C − M_C)^∗]^H = vec(M̂_C − M_C)^H K,   (30)

where K^H = K. Therefore, Ω = Σ K, which leads to the result of Theorem IV.1 after a few derivations, and concludes the proof.

In the following part, we extend the result of Section III-B to the complex case.

C. An important property of complex M-estimators

Theorem IV.2 Let M be a fixed Hermitian positive-definite matrix and M̂ a sequence of Hermitian positive-definite random matrix estimates of order m which satisfies

  √N vec(M̂ − M) →d GCN(0_{m²,1}, Σ_M, Ω_M),   (31)

with

  Σ_M = ν1 (M^T ⊗ M) + ν2 vec(M) vec(M)^H,
  Ω_M = ν1 (M^T ⊗ M) K + ν2 vec(M) vec(M)^T.   (32)

Let H(M) be an r-dimensional multivariate function on the set of m × m Hermitian positive-definite matrices, with continuous first partial derivatives and such that H(M) = H(αM) for all α > 0. Then,

  √N (H(M̂) − H(M)) →d GCN(0_{r,1}, Σ_H, Ω_H),   (33)

where Σ_H and Ω_H are defined as

  Σ_H = ν1 H'(M)(M^T ⊗ M) H'(M)^H,
  Ω_H = ν1 H'(M)(M^T ⊗ M) K H'(M)^T,   (34)

and H'(M) = dH(M)/dvec(M) = (h'_{ij}) with h'_{ij} = ∂h_i/∂m_j, where vec(M) = (m_i).

Proof: One can first notice that H'(M) vec(M) = 0_{r,1}. Indeed, since H(M) = H(αM) for all α > 0, the subspace generated by the vector vec(M) is an iso-H region. Therefore, H'(M), which can be seen as a gradient of H, is orthogonal to vec(M).

A first-order approximation of H(M̂) gives

  H(M̂) ≃ H(M) + H'(M) vec(M̂ − M).   (35)

Thus, one has

  Σ_H = N E[(H(M̂) − H(M))(H(M̂) − H(M))^H]
      = H'(M) E[N vec(M̂ − M) vec(M̂ − M)^H] H'(M)^H
      = H'(M) Σ_M H'(M)^H
      = H'(M) [ν1 (M^T ⊗ M) + ν2 vec(M) vec(M)^H] H'(M)^H
      = ν1 H'(M)(M^T ⊗ M) H'(M)^H.   (36)

The proof is similar for Ω_H.

Similarly to the real case, when the data have a complex Gaussian distribution, the SCM is a complex Wishart matrix. Moreover, the SCM verifies the conditions of the theorem and its coefficients (ν1, ν2) are equal to (1, 0). Complex normalized M-estimators also verify the conditions of the theorem, with (ν1, ν2) = (σ1, σ2). Thus, they have the same asymptotic distribution as the complex normalized Wishart matrix, up to a scale factor σ1 depending on the considered M-estimator. The same conclusion holds for the Fixed Point Estimator [13], [?] since it verifies the assumptions of Theorem IV.2 (see [28] for its asymptotic distribution).

V. SIMULATIONS

The results of this paper are illustrated using the complex analogue of Huber's M-estimator, as described in [17]. The corresponding weight function u(.) of equation (2) is defined by

  u(s) = (1/β) min(1, k²/s),   (37)

where k² and β depend on a single parameter q, 0 < q < 1, through the cumulative distribution function of the χ² distribution (equations (38) and (39)). Thus, Huber's estimate is the solution of

  M̂_Hub = (1/(Nβ)) Σ_{i=1}^{N} [ z_i z_i^H 1{z_i^H M̂_Hub^{-1} z_i ≤ k²} + k² (z_i z_i^H / (z_i^H M̂_Hub^{-1} z_i)) 1{z_i^H M̂_Hub^{-1} z_i > k²} ],   (40)

which can be rewritten as

  M̂_Hub = (1/(Nβ)) Σ_{i=1}^{N} z_i z_i^H 1{z_i^H M̂_Hub^{-1} z_i ≤ k²} + (k²/(Nβ)) Σ_{i=1}^{N} (z_i z_i^H / (z_i^H M̂_Hub^{-1} z_i)) 1{z_i^H M̂_Hub^{-1} z_i > k²},   (41)

where 1{.} is the indicator function. The first summation corresponds to unweighted data, which are treated as in the SCM; the second one is associated with normalized data, treated as outliers. In a complex Gaussian context and when N tends to infinity, it may be shown that the proportion of data treated as in the SCM is equal to q. Moreover, the choice of k² and β according to (38) and (39) leads to a consistent M-estimator of the covariance matrix (σ = 1 in equation (4)). In the following simulations, q = 0.75.

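As a rough numerical illustration of the robustness discussed above, the weight (37) can be plugged into a fixed-point iteration. This is a sketch only: the tuning constants k² and β below are hand-picked so that Maronna's condition sup ψ = k²/β > m holds; they are not the paper's (38)-(39) values, and the data are a toy example.

```python
import numpy as np

def huber_scatter(Z, k2, beta, n_iter=200):
    """Fixed-point iteration for Huber's complex M-estimator (41):
    samples with z^H M^{-1} z <= k2 enter as in the SCM, the others
    are normalized by their quadratic form and scaled by k2."""
    N, m = Z.shape
    M = np.eye(m, dtype=complex)
    for _ in range(n_iter):
        q = np.real(np.einsum('ni,ij,nj->n', Z.conj(), np.linalg.inv(M), Z))
        w = np.minimum(1.0, k2 / q) / beta  # weight u(q) of equation (37)
        M = np.mean(w[:, None, None] * (Z[:, :, None] * Z.conj()[:, None, :]),
                    axis=0)
    return M

# Toy data: complex Gaussian samples with identity covariance plus gross outliers.
rng = np.random.default_rng(2)
m, N = 3, 2000
Z = (rng.standard_normal((N, m)) + 1j * rng.standard_normal((N, m))) / np.sqrt(2)
Z[:10] *= 100.0                             # 10 gross outliers
M_scm = np.mean(Z[:, :, None] * Z.conj()[:, None, :], axis=0)
M_hub = huber_scatter(Z, k2=4.0, beta=0.9)  # illustrative constants, k2/beta > m
```

After normalizing M_hub to trace m, it stays close to the identity despite the outliers, while the SCM is dominated by them — the behavior the simulations below quantify.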
A. Asymptotic performance of DOAs estimated by the MUSIC method, with the SCM and Huber's M-estimator

Now let us turn to Theorem IV.2. To illustrate our result, we consider a simulation using the MUltiple SIgnal Classification (MUSIC) method, which estimates the Directions Of Arrival (DOAs) of a signal. We consider in this paper a single signal to detect. However, the multi-source case can be analyzed similarly. Under this assumption, let us define H(M̂), the estimated DOA obtained from the MUSIC pseudo-spectrum:

  H(M̂) = θ̂.

An m = 3 uniform linear array (ULA) with half-wavelength sensor spacing is used, which receives a Gaussian stationary narrowband signal with DOA 20°. The array output is corrupted by an additive noise which is firstly spatially white Gaussian, and secondly K-distributed with shape parameter 0.1. Moreover, the SNR per sensor is 5 dB and the N snapshots are assumed to be independent. The MUSIC method uses the estimate of the covariance matrix built from the N snapshots; here, the employed covariance matrix estimators are the SCM and the complex analogue of Huber's M-estimator, as defined in equation (41).

Figure 1 depicts the Root Mean Square Error (RMSE), in degrees, of the DOA estimated with N data for the SCM and for Huber's estimate, when the additive noise is white Gaussian. The RMSE of the DOA estimated with σ1 N data with Huber's estimate is also represented. We observe that for N large enough (N ≥ 40), this curve and the SCM one overlap, as expected from Theorem IV.1. In this example, σ1 = 1.067.

[Fig. 1. One source DOA RMSE (m = 3 antennas) for Huber's estimate and the SCM, for spatially white Gaussian additive noise.]

Figure 2 depicts the RMSE of the DOA estimated with N data for the SCM and for Huber's estimate, when the additive noise is K-distributed with shape parameter 0.1. A shape parameter close to 1 (≳ 0.9) indicates a distribution close to the Gaussian one, whereas a parameter close to 0 (≲ 0.1) indicates an impulsive noise. The noise being quite impulsive in our example, we observe that the RMSE of Huber's M-estimator is smaller than that of the SCM, the latter giving worse results than in the Gaussian case. This points out the fact that the SCM gives poor results as soon as the context is far from a Gaussian environment, whereas Huber's M-estimator is more robust and much more interesting in that case.

[Fig. 2. One source DOA RMSE (m = 3 antennas) for Huber's estimate and the SCM, for K-distributed additive noise with a shape parameter ν = 0.1.]

B. Asymptotic performance of the ANMF test with the SCM and Huber's M-estimator

Let us give a second illustration of Theorem IV.2. We consider an adaptive radar receiving a vector y of length m. The estimated covariance matrix of the environment is M̂ and we try to detect signals of steering vector p. This steering vector defines the DOA and speed of the target, using the Doppler frequency. The ANMF test [29] is

  Λ(M̂ | y) = |p^H M̂^{-1} y|² / ((p^H M̂^{-1} p)(y^H M̂^{-1} y)).   (42)

Firstly, we have considered a Gaussian context and computed Λ(M̂ | y). In Figure 3, the vertical scale represents the variance of Λ̂ obtained with the SCM and with the complex analogue of Huber's M-estimator defined in (2); the horizontal scale represents the number of samples used to estimate the covariance matrix. A third curve represents the variance of Λ̂ for σ1 N data. As one can see, it overlaps the SCM curve, illustrating Theorem IV.2. The coefficient σ1 is equal to 1.067.

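The ANMF statistic (42) is straightforward to compute. A sketch follows; the half-wavelength-ULA steering-vector parametrization and all numeric values are our own illustrative assumptions, not taken from the paper's simulation setup.

```python
import numpy as np

def anmf(M_hat, p, y):
    """ANMF statistic (42): |p^H M^{-1} y|^2 / ((p^H M^{-1} p)(y^H M^{-1} y))."""
    Mi = np.linalg.inv(M_hat)
    num = np.abs(p.conj() @ Mi @ y) ** 2
    den = np.real(p.conj() @ Mi @ p) * np.real(y.conj() @ Mi @ y)
    return num / den

def steering(m, theta):
    """Steering vector of an m-sensor half-wavelength ULA, DOA theta (radians)."""
    return np.exp(1j * np.pi * np.arange(m) * np.sin(theta))

rng = np.random.default_rng(3)
m = 3
p = steering(m, np.deg2rad(20.0))
noise = (rng.standard_normal(m) + 1j * rng.standard_normal(m)) / np.sqrt(2)
lam_target = anmf(np.eye(m), p, 5.0 * p + noise)  # target present
lam_noise = anmf(np.eye(m), p, noise)             # noise only
```

By the Cauchy-Schwarz inequality the statistic lies in [0, 1] and is close to 1 when y is aligned with p; multiplying M̂ by a positive scalar leaves it unchanged, which is the homogeneity of degree 0 that Theorem IV.2 exploits.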
[Fig. 3. Variance of the ANMF detector for Huber's estimate and the SCM estimate, with spatially white Gaussian additive noise.]

Secondly, we have considered a K-distributed environment, with shape parameter firstly equal to 0.1 and then to 0.01 for a more impulsive noise. Figure 4, whose scales are the same as in Figure 3, shows once again that the SCM is not robust in a non-Gaussian context, contrary to Huber's M-estimator. Indeed, the more the noise differs from a Gaussian noise, the more the detector's variance is deteriorated with the SCM, while Huber's M-estimator still gives good results.

[Fig. 4. Variance of the ANMF detector with Huber's estimate and the SCM, for K-distributed additive noise with shape parameters ν = 0.01 and ν = 0.1.]

VI. CONCLUSION

In this paper, we have analyzed the statistical properties of complex M-estimators of the scatter matrix in the framework of complex elliptically distributed data. Firstly, using existing results for real M-estimators, we have derived their asymptotic covariance in the complex case. Simulations have checked that, when the number of samples increases, the M-estimator covariance tends to its theoretical asymptotic value. Secondly, we have extended an interesting property of real M-estimators to the complex case. This property states that the asymptotic distributions of any homogeneous function of degree zero of M-estimates and of Wishart matrices are the same, up to a scale factor. This result has many potential applications in the performance analysis of array processing algorithms based on M-estimates of the covariance matrix.

APPENDIX
PROOF OF LEMMA IV.1

A. Asymptotic behavior of M̂_u and M̂_v

Let us set
- M_u = M_u^{1/2} M_u^{1/2},
- R̂_u = M_u^{-1/2} M̂_u M_u^{-1/2}, and
- k_n = M_u^{-1/2} u_n.

Since R̂_u is a consistent estimate of I_{2m}, when N → ∞ we have R̂_u = I_{2m} + ∆R_u, considering ∆R_u small. Thus, we have

  u_r(u_n^T M̂_u^{-1} u_n) = u_r(k_n^T R̂_u^{-1} k_n) = u_r(k_n^T (I_{2m} + ∆R_u)^{-1} k_n).

A first-order expansion of R̂_u^{-1} gives (I_{2m} + ∆R_u)^{-1} ≈ I_{2m} − ∆R_u, which leads to

  u_r(k_n^T R̂_u^{-1} k_n) = u_r(||k_n||² − k_n^T ∆R_u k_n)
                          = u_r(||k_n||²) − u_r'(||k_n||²) k_n^T ∆R_u k_n
                          = a_n + b_n k_n^T ∆R_u k_n,   (43)

with a_n = u_r(||k_n||²) and b_n = −u_r'(||k_n||²). From equation (17), we obtain

  I_{2m} + ∆R_u = (1/N) Σ_{n=1}^{N} a_n k_n k_n^T + (1/N) Σ_{n=1}^{N} b_n (k_n^T ∆R_u k_n) k_n k_n^T.

Since vec(k_n k_n^T) = k_n ⊗ k_n and k_n^T ∆R_u k_n = vec(k_n^T ∆R_u k_n) = (k_n ⊗ k_n)^T vec(∆R_u), one has the following equation:

  vec(I_{2m}) + vec(∆R_u) = (1/N) Σ_{n=1}^{N} a_n (k_n ⊗ k_n) + (1/N) Σ_{n=1}^{N} b_n (k_n ⊗ k_n)^T vec(∆R_u) (k_n ⊗ k_n).   (44)

This leads to

  [ I_{4m²} − (1/N) Σ_{n=1}^{N} b_n (k_n ⊗ k_n)(k_n ⊗ k_n)^T ] vec(∆R_u) = (1/N) Σ_{n=1}^{N} a_n (k_n ⊗ k_n) − vec(I_{2m}).   (45)

Let us denote α_N = I_{4m²} − (1/N) Σ_{n=1}^{N} b_n (k_n ⊗ k_n)(k_n ⊗ k_n)^T. Then, the previous equation is equivalent to

  vec(∆R_u) = α_N^{-1} [ (1/N) Σ_{n=1}^{N} a_n (k_n ⊗ k_n) − vec(I_{2m}) ].   (46)

In (46) we have:
- α_N →a.s. α = I_{4m²} − E[b (k ⊗ k)(k ⊗ k)^T], where k ∼ E(0_{2m,1}, σ I_{2m}, gz) and b = −u_r'(||k||²);
- (1/N) Σ_{n=1}^{N} a_n (k_n ⊗ k_n) →a.s. E[a (k ⊗ k)] = E[u_r(||k||²) (k ⊗ k)] = vec(E[u_r(||k||²) k k^T]) = vec(I_{2m}), using (4) with M_u and replacing u_n by M_u^{1/2} k_n.

Now, let us denote w_N = √N [ (1/N) Σ_{n=1}^{N} a_n (k_n ⊗ k_n) − vec(I_{2m}) ]. We have w_N →d w, where w follows a zero-mean Gaussian distribution. Consequently, the Slutsky theorem gives

  √N vec(∆R_u) = α_N^{-1} w_N →d α^{-1} w.   (47)

Moreover, one can notice that

  (M_u^{1/2} ⊗ M_u^{1/2}) vec(∆R_u) = vec(M_u^{1/2} ∆R_u M_u^{1/2}) = vec(M̂_u − M_u),

which gives, taking into account equation (47),

  √N vec(M̂_u − M_u) →d (M_u^{1/2} ⊗ M_u^{1/2}) α^{-1} w.   (48)

Using equation (18), we also have

  √N vec(M̂_v − M_v) = √N (Tr ⊗ Tr) vec(M̂_u − M_u) →d (Tr ⊗ Tr)(M_u^{1/2} ⊗ M_u^{1/2}) α^{-1} w,   (49)

where Tr = [ 0_{m,m}  −Im ; Im  0_{m,m} ].

B. Asymptotic behavior of M̂_R

Let us denote R̂_R = M_R^{-1/2} M̂_R M_R^{-1/2}. Since M_R = M_u, one has k_n = M_R^{-1/2} u_n.

For all matrices of the form f(A), Tr f(A) = f(A) Tr. Therefore, since M_R = f(M_C), Tr M_R = M_R Tr. One has M_R = Tr^T M_R Tr = (Tr^T M_R^{1/2} Tr)(Tr^T M_R^{1/2} Tr). Therefore, M_R^{1/2} = Tr^T M_R^{1/2} Tr and Tr M_R^{1/2} = M_R^{1/2} Tr. This leads to Tr k_n = Tr M_R^{-1/2} u_n = M_R^{-1/2} v_n.

When N → ∞, since R̂_R is a consistent estimate of I_{2m}, R̂_R = I_{2m} + ∆R_R, with ∆R_R small. Similarly to the first part of the proof, one has

  u_r(u_n^T M̂_R^{-1} u_n) = u_r(k_n^T R̂_R^{-1} k_n) = a_n + b_n k_n^T ∆R_R k_n,
  u_r(v_n^T M̂_R^{-1} v_n) = a_n + b_n k_n^T Tr^T ∆R_R Tr k_n.

Thus, deriving from equation (16), we obtain

  I_{2m} + ∆R_R = (1/(2N)) Σ_{n=1}^{N} a_n k_n k_n^T + (1/(2N)) Σ_{n=1}^{N} a_n Tr k_n k_n^T Tr^T
                + (1/(2N)) Σ_{n=1}^{N} b_n (k_n^T ∆R_R k_n) k_n k_n^T + (1/(2N)) Σ_{n=1}^{N} b_n (k_n^T Tr^T ∆R_R Tr k_n) Tr k_n k_n^T Tr^T.

Then, using the vec operator, this equation leads to

  vec(I_{2m}) + vec(∆R_R) = (1/(2N)) Σ_{n=1}^{N} a_n (k_n ⊗ k_n) + (1/(2N)) Σ_{n=1}^{N} a_n (Tr k_n ⊗ Tr k_n)
                          + (1/(2N)) Σ_{n=1}^{N} b_n (k_n ⊗ k_n)^T vec(∆R_R) (k_n ⊗ k_n)
                          + (1/(2N)) Σ_{n=1}^{N} b_n (Tr k_n ⊗ Tr k_n)^T vec(∆R_R) (Tr k_n ⊗ Tr k_n).

This is equivalent to

  [ I_{4m²} − (1/(2N)) Σ_{n=1}^{N} b_n (k_n ⊗ k_n)(k_n ⊗ k_n)^T − (1/(2N)) Σ_{n=1}^{N} b_n (Tr ⊗ Tr)(k_n ⊗ k_n)(k_n ⊗ k_n)^T (Tr ⊗ Tr)^T ] vec(∆R_R)
  = (1/(2N)) Σ_{n=1}^{N} a_n (k_n ⊗ k_n) + (1/(2N)) Σ_{n=1}^{N} a_n (Tr ⊗ Tr)(k_n ⊗ k_n) − vec(I_{2m}),

which leads to

  vec(∆R_R) = α̃_N^{-1} [ (1/(2N)) Σ_{n=1}^{N} a_n (k_n ⊗ k_n) + (1/(2N)) Σ_{n=1}^{N} a_n (Tr ⊗ Tr)(k_n ⊗ k_n) − vec(I_{2m}) ],

where α̃_N = I_{4m²} − (1/(2N)) Σ_{n=1}^{N} b_n (k_n ⊗ k_n)(k_n ⊗ k_n)^T − (1/(2N)) Σ_{n=1}^{N} b_n (Tr ⊗ Tr)(k_n ⊗ k_n)(k_n ⊗ k_n)^T (Tr ⊗ Tr)^T.

Using the previous notation w_N, we obtain

  √N vec(∆R_R) = (1/2) α̃_N^{-1} (I_{4m²} + Tr ⊗ Tr) w_N.

One can notice that α̃_N →a.s. α, since the Tr k_n have the same distribution as the k_n. Moreover, α̃_N^{-1} (Tr ⊗ Tr) = (Tr ⊗ Tr) α̃_N^{-1} and (M_u^{1/2} ⊗ M_u^{1/2})(Tr ⊗ Tr) = (Tr ⊗ Tr)(M_u^{1/2} ⊗ M_u^{1/2}). Therefore, we obtain

  √N vec(M̂_R − M_R) →d (1/2) [ (M_u^{1/2} ⊗ M_u^{1/2}) α^{-1} + (Tr ⊗ Tr)(M_u^{1/2} ⊗ M_u^{1/2}) α^{-1} ] w.

This leads to the conclusion that M̂_R shares the same asymptotic distribution as (1/2)(M̂_u + M̂_v).

REFERENCES

[1] D. Tyler, "Robustness and efficiency properties of scatter matrices," Biometrika, vol. 70, no. 2, p. 411, 1983.
[2] S. M. Kay, Fundamentals of Statistical Signal Processing - Detection Theory. Prentice-Hall PTR, 1998, vol. 2.
[3] M. Mahot, P. Forster, J.-P. Ovarlez, and F. Pascal, "Robustness analysis of covariance matrix estimates," in Proc. European Signal Processing Conference (EUSIPCO), Aalborg, Denmark, August 2010.
[4] P. J. Huber and E. M. Ronchetti, Robust Statistics. John Wiley & Sons Inc, 2009.
[5] F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel, Robust Statistics: The Approach Based on Influence Functions, ser. Wiley Series in Probability and Statistics. John Wiley & Sons, 1986.
[6] R. A. Maronna, R. D. Martin, and V. J. Yohai, Robust Statistics: Theory and Methods, ser. Wiley Series in Probability and Statistics. John Wiley & Sons, 2006.
[7] P. J. Huber, "Robust estimation of a location parameter," The Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73–101, 1964.
[8] R. A. Maronna, "Robust M-estimators of multivariate location and scatter," The Annals of Statistics, vol. 4, no. 1, pp. 51–67, January 1976.
[9] D. Kelker, "Distribution theory of spherical distributions and a location-scale parameter generalization," Sankhyā: The Indian Journal of Statistics, Series A, vol. 32, no. 4, pp. 419–430, 1970.
[10] S. Watts, "Radar detection prediction in sea clutter using the compound K-distribution model," IEE Proceedings, Part F, vol. 132, no. 7, pp. 613–620, December 1985.
[11] E. Conte, M. Longo, M. Lops, and S. Ullo, "Radar detection of signals with unknown parameters in K-distributed clutter," Radar and Signal Processing, IEE Proceedings F, vol. 138, no. 2, pp. 131–138, April 1991.
[12] F. Gini, M. V. Greco, A. Farina, and P. Lombardo, "Optimum and mismatched detection against K-distributed plus Gaussian clutter," IEEE Trans.-AES, vol. 34, no. 3, pp. 860–876, July 1998.
[13] D. Tyler, "A distribution-free M-estimator of multivariate scatter," The Annals of Statistics, vol. 15, no. 1, pp. 234–251, 1987.
[14] F. Pascal, Y. Chitour, J.-P. Ovarlez, P. Forster, and P. Larzabal, "Covariance structure maximum likelihood estimates in compound Gaussian noise: Existence and algorithm analysis," IEEE Trans.-SP, vol. 56, no. 1, pp. 34–48, January 2008.
[15] E. Ollila, D. E. Tyler, V. Koivunen, and H. V. Poor, "Complex elliptically symmetric distributions: Survey, new results and applications," IEEE Trans.-SP, vol. 60, no. 11, pp. 5597–5625, November 2012.
[16] E. Ollila and V. Koivunen, "Influence function and asymptotic efficiency of scatter matrix based array processors: Case MVDR beamformer," IEEE Trans.-SP, vol. 57, no. 1, pp. 247–259, 2009.
[17] ——, "Robust antenna array processing using M-estimators of pseudo-covariance," in Proc. 14th IEEE Int. Symp. Personal, Indoor, Mobile Radio Commun. (PIMRC), 2003, pp. 7–10.
[18] ——, "Influence functions for array covariance matrix estimators," in Proc. IEEE Workshop on Statistical Signal Processing (SSP), St. Louis, MO, October 2003, pp. 445–448.
[19] E. Ollila, L. Quattropani, and V. Koivunen, "Robust space-time scatter matrix estimator for broadband antenna arrays," in Proc. IEEE 58th Vehicular Technology Conference (VTC 2003-Fall), vol. 1, October 2003, pp. 55–59.
[20] R. Couillet, F. Pascal, and J. Silverstein, "Robust M-estimation for array processing: A random matrix approach," arXiv preprint arXiv:1204.5320, 2012.
[21] D. Tyler, "Radial estimates and the test for sphericity," Biometrika, vol. 69, no. 2, p. 429, 1982.
[22] S. Kraut, L. L. Scharf, and L. T. McWhorter, "Adaptive subspace detectors," IEEE Trans.-SP, vol. 49, no. 1, pp. 1–16, January 2001.
[23] S. Kraut and L. Scharf, "The CFAR adaptive subspace detector is a scale-invariant GLRT," IEEE Trans.-SP, vol. 47, no. 9, pp. 2538–2541, 1999.
[24] A. Van den Bos, "The multivariate complex normal distribution - a generalization," IEEE Trans.-IT, vol. 41, no. 2, pp. 537–539, 1995.
[25] D. Tyler, "Some results on the existence, uniqueness, and computation of the M-estimates of multivariate location and scatter," SIAM Journal on Scientific and Statistical Computing, vol. 9, p. 354, 1988.
[26] J. T. Kent and D. E. Tyler, "Redescending M-estimates of multivariate location and scatter," The Annals of Statistics, vol. 19, no. 4, pp. 2102–2119, December 1991.
[27] M. Bilodeau and D. Brenner, Theory of Multivariate Statistics. Springer Verlag, 1999.
[28] F. Pascal, P. Forster, J.-P. Ovarlez, and P. Larzabal, "Performance analysis of covariance matrix estimates in impulsive noise," IEEE Trans.-SP, vol. 56, no. 6, pp. 2206–2217, June 2008.
[29] E. Conte, A. De Maio, and G. Ricci, "Covariance matrix estimation for adaptive CFAR detection in compound-Gaussian clutter," IEEE Trans.-AES, vol. 38, no. 2, pp. 415–426, April 2002.