Coarrays, MUSIC, and the Cramér-Rao Bound
Mianzhi Wang, Student Member, IEEE, and Arye Nehorai, Life Fellow, IEEE

Abstract—Sparse linear arrays, such as co-prime arrays and nested arrays, have the attractive capability of providing enhanced degrees of freedom. By exploiting the coarray structure, an augmented sample covariance matrix can be constructed and MUSIC (MUltiple SIgnal Classification) can be applied to identify more sources than the number of sensors. While such a MUSIC algorithm works quite well, its performance has not been theoretically analyzed. In this paper, we derive a simplified asymptotic mean square error (MSE) expression for the MUSIC algorithm applied to the coarray model, which is applicable even if the source number exceeds the sensor number. We show that the directly augmented sample covariance matrix and the spatial smoothed sample covariance matrix yield the same asymptotic MSE for MUSIC. We also show that when there are more sources than the number of sensors, the MSE converges to a positive value instead of zero when the signal-to-noise ratio (SNR) goes to infinity. This finding explains the "saturation" behavior of the coarray-based MUSIC algorithms in the high SNR region observed in previous studies. Finally, we derive the Cramér-Rao bound (CRB) for sparse linear arrays, and conduct a numerical study of the statistical efficiency of the coarray-based estimator. Experimental results verify the theoretical derivations and reveal the complex efficiency pattern of coarray-based MUSIC algorithms.

Index Terms—MUSIC, Cramér-Rao bound, coarray, sparse linear arrays, statistical efficiency

arXiv:1605.03620v2 [stat.AP] 13 Dec 2016

This work was supported by the AFOSR under Grant FA9550-11-1-0210 and the ONR Grant N000141310050.
M. Wang and A. Nehorai are with the Preston M. Green Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, MO 63130 USA (e-mail: [email protected]; [email protected]).

I. INTRODUCTION

Estimating source directions of arrival (DOAs) using sensor arrays plays an important role in the field of array signal processing. For uniform linear arrays (ULAs), it is widely known that traditional subspace-based methods, such as MUSIC, can resolve up to N − 1 uncorrelated sources with N sensors [1]–[3]. However, for sparse linear arrays, such as minimal redundancy arrays (MRAs) [4], it is possible to construct an augmented covariance matrix by exploiting the coarray structure. We can then apply MUSIC to the augmented covariance matrix, and up to O(N²) sources can be resolved with only N sensors [4].

Recently, the development of co-prime arrays [5]–[8] and nested arrays [9]–[11] has generated renewed interest in sparse linear arrays, and it remains to investigate the performance of these arrays. The performance of the MUSIC estimator and its variants (e.g., root-MUSIC [12], [13]) was thoroughly analyzed by Stoica et al. in [2], [14] and [15]. The same authors also derived the asymptotic MSE expression of the MUSIC estimator, and rigorously studied its statistical efficiency. In [16], Li et al. derived a unified MSE expression for common subspace-based estimators (e.g., MUSIC and ESPRIT [17]) via first-order perturbation analysis. However, these results are based on the physical array model and make use of the statistical properties of the original sample covariance matrix, so they cannot be applied when the coarray model is utilized. In [18], Gorokhov et al. first derived a general MSE expression for the MUSIC algorithm applied to matrix-valued transforms of the sample covariance matrix. While this expression is applicable to coarray-based MUSIC, its explicit form is rather complicated, making it difficult to conduct analytical performance studies. Therefore, a simpler and more revealing MSE expression is desired.

In this paper, we first review the coarray signal model commonly used for sparse linear arrays. We investigate two common approaches to constructing the augmented sample covariance matrix, namely, the direct augmentation approach (DAA) [19], [20] and the spatial smoothing approach [9]. We show that MUSIC yields the same asymptotic estimation error for both approaches. We are then able to derive an explicit MSE expression that is applicable to both approaches. Our MSE expression has a simpler form, which may facilitate the performance analysis of coarray-based MUSIC algorithms. We observe that the MSE of coarray-based MUSIC depends on both the physical array geometry and the coarray geometry. We show that, when there are more sources than the number of sensors, the MSE does not drop to zero even if the SNR approaches infinity, which agrees with the experimental results in previous studies. Next, we derive the CRB of the DOAs that is applicable to sparse linear arrays. We notice that when there are more sources than the number of sensors, the CRB is strictly nonzero as the SNR goes to infinity, which is consistent with our observation on the MSE expression. It should be mentioned that during the review process of this paper, Liu et al. and Koochakzadeh et al. also independently derived the CRB for sparse linear arrays in [21], [22]. In this paper, we provide a more rigorous proof of the CRB's limiting properties in high SNR regions. We also include various statistical efficiency analyses by utilizing our results on the MSE, which are not present in [21], [22]. Finally, we verify our analytical MSE expression and analyze the statistical efficiency of different sparse linear arrays via numerical simulations. We observe good agreement between the empirical MSE and the analytical MSE, as well as complex efficiency patterns of coarray-based MUSIC.

Throughout this paper, we make use of the following notations. Given a matrix A, we use A^T, A^H, and A^* to denote the transpose, the Hermitian transpose, and the conjugate of A, respectively. We use A_ij to denote the (i, j)-th element of A, and a_i to denote the i-th column of A. If A is full column rank, we define its pseudo inverse as A^† = (A^H A)^{-1} A^H. We also define the projection matrix onto the null space of A as Π⊥_A = I − A A^†. Let A =

[a_1 a_2 ... a_N] ∈ C^{M×N}, and we define the vectorization operation as vec(A) = [a_1^T a_2^T ... a_N^T]^T, and mat_{M,N}(·) as its inverse operation. We use ⊗ and ⊙ to denote the Kronecker product and the Khatri-Rao product (i.e., the column-wise Kronecker product), respectively. We denote by R(A) and I(A) the real and the imaginary parts of A. If A is a square matrix, we denote its trace by tr(A). In addition, we use T_M to denote the M × M permutation matrix whose anti-diagonal elements are one, and whose remaining elements are zero. We say a complex vector z ∈ C^M is conjugate symmetric if T_M z = z^*. We also use e_i to denote the i-th natural basis vector in Euclidean space. For instance, A e_i yields the i-th column of A, and e_i^T A yields the i-th row of A.

II. THE COARRAY SIGNAL MODEL

We consider a linear sparse array consisting of M sensors whose locations are given by D = {d_1, d_2, ..., d_M}. Each sensor location d_i is chosen to be an integer multiple of the smallest distance between any two sensors, denoted by d_0. Therefore we can also represent the sensor locations using the integer set D̄ = {d̄_1, d̄_2, ..., d̄_M}, where d̄_i = d_i/d_0 for i = 1, 2, ..., M. Without loss of generality, we assume that the first sensor is placed at the origin. We consider K narrowband sources θ_1, θ_2, ..., θ_K impinging on the array from the far field. Denoting by λ the wavelength of the carrier frequency, we can express the steering vector for the k-th source as

a(θ_k) = [1  e^{j d̄_2 φ_k}  ···  e^{j d̄_M φ_k}]^T,    (1)

where φ_k = (2π d_0 sin θ_k)/λ. Hence the received signal vectors are given by

y(t) = A(θ) x(t) + n(t),  t = 1, 2, ..., N,    (2)

where A = [a(θ_1) a(θ_2) ... a(θ_K)] denotes the array steering matrix, x(t) denotes the source signal vector, n(t) denotes additive noise, and N denotes the number of snapshots. In the following discussion, we make the following assumptions:
A1 The source signals follow the unconditional model [15] and are uncorrelated white circularly-symmetric Gaussian.
A2 The source DOAs are distinct (i.e., θ_k ≠ θ_l for all k ≠ l).
A3 The additive noise is white circularly-symmetric Gaussian and uncorrelated from the sources.
A4 There is no temporal correlation between the snapshots.
Under A1–A4, the covariance matrix is given by

R = A P A^H + σ_n² I,    (3)

where P = diag(p_1, p_2, ..., p_K) denotes the source covariance matrix, and σ_n² denotes the variance of the additive noise. By vectorizing R, we can obtain the following coarray model:

r = A_d p + σ_n² i,    (4)

where A_d = A^* ⊙ A, p = [p_1, p_2, ..., p_K]^T, and i = vec(I). Because the difference set {d_m − d_n} is symmetric, a virtual ULA centered at the origin can be extracted from the coarray. The sensor locations of this virtual ULA are given by [−M_v + 1, −M_v + 2, ..., 0, ..., M_v − 1] d_0, where M_v is defined such that 2M_v − 1 is the size of the virtual ULA. Fig. 1 provides an illustrative example of the relationship between the physical array and the corresponding virtual ULA.

Fig. 1. A coprime array with sensors located at [0, 2, 3, 4, 6, 9]λ/2 and its coarray: (a) physical array; (b) coarray; (c) central ULA part of the coarray.

The observation vector of the virtual ULA is given by

z = F r = A_c p + σ_n² F i,    (5)

where F is the coarray selection matrix, whose detailed definition is provided in Appendix A, and A_c represents the steering matrix of the virtual array. The virtual ULA can be divided into M_v overlapping uniform subarrays of size M_v. The output of the i-th subarray is given by z_i = Γ_i z for i = 1, 2, ..., M_v, where Γ_i = [0_{M_v×(i−1)}  I_{M_v×M_v}  0_{M_v×(M_v−i)}] represents the selection matrix for the i-th subarray.

Given the outputs of the M_v subarrays, the augmented covariance matrix of the virtual array R_v is commonly constructed via one of the following methods [9], [20]:

R_v1 = [z_{M_v}  z_{M_v−1}  ···  z_1],    (6a)

R_v2 = (1/M_v) Σ_{i=1}^{M_v} z_i z_i^H,    (6b)

where method (6a) corresponds to the direct augmentation approach (DAA), while method (6b) corresponds to the spatial smoothing approach. Following the results in [9] and [20], R_v1 and R_v2 are related via the following equality:

R_v2 = (1/M_v) R_v1² = (1/M_v) (A_v P A_v^H + σ_n² I)²,    (7)

where A_v corresponds to the steering matrix of a ULA whose sensors are located at [0, 1, ..., M_v − 1] d_0. If we design a sparse linear array such that M_v > M, we immediately gain enhanced degrees of freedom by applying MUSIC to either R_v1 or R_v2 instead of R in (3). For example, in Fig. 1, we have a co-prime array with M_v = 8 > 6 = M. Because MUSIC is applicable only when the number of sources is less than the number of sensors, we assume that K < M_v throughout the paper. This assumption, combined with A2, ensures that A_v is full column rank.
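These constructions are easy to check numerically. The sketch below (with assumed source phases and powers, not values from the paper) computes the difference coarray and M_v for the Fig. 1 array, builds R_v1 and R_v2 from the exact virtual-ULA samples as in (6a) and (6b), and verifies the equality (7):

```python
import numpy as np

# Positions of the Fig. 1 co-prime array, in units of d0.
pos = np.array([0, 2, 3, 4, 6, 9])

# Difference coarray and the size parameter Mv: the largest contiguous
# central ULA of the coarray covers lags -(Mv-1), ..., Mv-1.
dco = set(int(a - b) for a in pos for b in pos)
Mv = 1
while Mv in dco:
    Mv += 1
print(Mv)  # -> 8, i.e. Mv = 8 > 6 = M, as noted in the text

# Exact virtual-ULA samples r(l), i.e. the vector z of (5), for assumed sources.
phi = np.array([-0.8, 0.1, 0.9])   # phase parameters (example values)
p = np.array([1.0, 2.0, 0.5])      # source powers (example values)
sigma2 = 0.1                       # noise power (example value)
lags = np.arange(-(Mv - 1), Mv)
z = np.exp(1j * np.outer(lags, phi)) @ p + sigma2 * (lags == 0)

# (6a) direct augmentation: Rv1[a, b] = r(a - b); columns are z_Mv, ..., z_1.
Rv1 = np.column_stack([z[Mv - 1 - j : 2 * Mv - 1 - j] for j in range(Mv)])
# (6b) spatial smoothing: average the outer products of the Mv subarrays.
Rv2 = sum(np.outer(z[i : i + Mv], z[i : i + Mv].conj()) for i in range(Mv)) / Mv

# Check (7): Rv2 = (1/Mv) Rv1^2, with Rv1 = Av P Av^H + sigma_n^2 I.
Av = np.exp(1j * np.outer(np.arange(Mv), phi))
assert np.allclose(Rv1, Av @ np.diag(p) @ Av.conj().T + sigma2 * np.eye(Mv))
assert np.allclose(Rv2, Rv1 @ Rv1 / Mv)
```

Note that (7) holds exactly here because z is built from the exact covariance lags; with finite snapshots the two constructions differ, which is precisely the point of the statistical comparison below.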
Recall that, as shown in [9], A_d corresponds to the steering matrix of the coarray, whose sensor locations are given by D_co = {d_m − d_n | 1 ≤ m, n ≤ M}; the virtual ULA above is obtained by carefully selecting rows of (A^* ⊙ A). It should be noted that the elements in (6a) are obtained via linear operations on the elements of R, while those in (6b) are obtained via quadratic operations. Therefore the statistical properties of R_v1 and R_v2 are different from those of R. Consequently, traditional performance analysis for the MUSIC algorithm based on R cannot be applied to the coarray-based

MUSIC. For brevity, we use the term direct augmentation based MUSIC (DA-MUSIC), and the term spatial smoothing based MUSIC (SS-MUSIC), to denote the MUSIC algorithm applied to R_v1 and R_v2, respectively. In the following section, we derive a unified analytical MSE expression for both DA-MUSIC and SS-MUSIC.

III. THE MSE OF COARRAY-BASED MUSIC

In practice, the true covariance matrix R is unobtainable, and its maximum-likelihood estimate R̂ = (1/N) Σ_{t=1}^N y(t) y(t)^H is used. Therefore z, R_v1, and R_v2 are also replaced with their estimated versions ẑ, R̂_v1, and R̂_v2. Due to the estimation error ∆R = R̂ − R, the estimated noise eigenvectors will deviate from the true ones, leading to DOA estimation errors.

In general, the eigenvectors of a perturbed matrix are not well-determined [23]. For instance, in the very low SNR scenario, ∆R may cause a subspace swap, and the estimated noise eigenvectors will deviate drastically from the true ones [24]. Nevertheless, as shown in [16], [18] and [25], given enough samples and sufficient SNR, it is possible to obtain closed-form expressions for the DOA estimation errors via first-order analysis. Following similar ideas, we derive the closed-form error expressions for DA-MUSIC and SS-MUSIC, as stated in Theorem 1.

Theorem 1. Let θ̂_k^(1) and θ̂_k^(2) denote the estimated values of the k-th DOA by DA-MUSIC and SS-MUSIC, respectively. Let ∆r = vec(R̂ − R). Assume the signal subspace and the noise subspace are well-separated, so that ∆r does not cause a subspace swap. Then

θ̂_k^(1) − θ_k ≐ θ̂_k^(2) − θ_k ≐ −(γ_k p_k)^{-1} R(ξ_k^T ∆r),    (8)

where ≐ denotes asymptotic equality, and

ξ_k = F^T Γ^T (β_k ⊗ α_k),    (9a)
α_k = −e_k^T A_v^†,    (9b)
β_k = Π⊥_{A_v} ȧ_v(θ_k),    (9c)
γ_k = ȧ_v^H(θ_k) Π⊥_{A_v} ȧ_v(θ_k),    (9d)
Γ = [Γ_{M_v}^T  Γ_{M_v−1}^T  ···  Γ_1^T]^T,    (9e)
ȧ_v(θ_k) = ∂a_v(θ_k)/∂θ_k.    (9f)

Proof: See Appendix B.

Theorem 1 is reinforced by Proposition 1: β_k ≠ 0 ensures that γ_k^{-1} exists and (8) is well-defined, while ξ_k ≠ 0 ensures that (8) depends on ∆r and cannot be trivially zero.

Proposition 1. β_k, ξ_k ≠ 0 for k = 1, 2, ..., K.

Proof: We first show that β_k ≠ 0 by contradiction. Assume β_k = 0. Then Π⊥_{A_v} D a_v(θ_k) = 0, where D = diag(0, 1, ..., M_v − 1). This implies that D a_v(θ_k) lies in the column space of A_v. Let h = e^{−jφ_k} D a_v(θ_k). We immediately obtain that [A_v h] is not full column rank. We now add M_v − K − 1 distinct DOAs in (−π/2, π/2) that are different from θ_1, ..., θ_K, and construct an extended steering matrix Ā_v of the M_v − 1 distinct DOAs θ_1, ..., θ_{M_v−1}. Let B = [Ā_v h]. It follows that B is also not full column rank. Because B is a square matrix, it is also not full row rank. Therefore there exists some non-zero c ∈ C^{M_v} such that c^T B = 0. Let t_l = e^{jφ_l} for l = 1, 2, ..., M_v − 1, where φ_l = (2π d_0 sin θ_l)/λ. We can express B as

B = [ 1              1              ···   1                      0
      t_1            t_2            ···   t_{M_v−1}              1
      t_1²           t_2²           ···   t_{M_v−1}²             2 t_k
      ⋮              ⋮                    ⋮                      ⋮
      t_1^{M_v−1}    t_2^{M_v−1}    ···   t_{M_v−1}^{M_v−1}      (M_v−1) t_k^{M_v−2} ].

We define the complex polynomial f(x) = Σ_{l=1}^{M_v} c_l x^{l−1}. It can be observed that c^T B = 0 is equivalent to f(t_l) = 0 for l = 1, 2, ..., M_v − 1, and f′(t_k) = 0. By construction, the θ_l are distinct, so t_1, ..., t_{M_v−1} are M_v − 1 different roots of f(x). Because c ≠ 0, f(x) is not the zero polynomial, and has at most M_v − 1 roots. Therefore each root t_l has multiplicity at most one. However, f′(t_k) = 0 implies that t_k has multiplicity at least two, which contradicts the previous conclusion and completes the proof that β_k ≠ 0.

We now show that ξ_k ≠ 0. By the definition of F in Appendix A, each row of F has at least one non-zero element, and each column of F has at most one non-zero element. Hence F^T x = 0 for some x ∈ C^{2M_v−1} if and only if x = 0. It thus suffices to show that Γ^T (β_k ⊗ α_k) ≠ 0. By the definition of Γ, we can rewrite Γ^T (β_k ⊗ α_k) as B̃_k α_k, where

B̃_k = [ β_{k M_v}       0               ···   0
        β_{k(M_v−1)}    β_{k M_v}       ···   0
        ⋮               ⋮               ⋱     ⋮
        β_{k1}          β_{k2}          ···   β_{k M_v}
        0               β_{k1}          ···   β_{k(M_v−1)}
        ⋮               ⋮               ⋱     ⋮
        0               0               ···   β_{k1} ],

and β_{kl} is the l-th element of β_k. Because β_k ≠ 0 and K < M_v, B̃_k is full column rank. By the definition of the pseudo inverse, we know that α_k ≠ 0. Therefore B̃_k α_k ≠ 0, which completes the proof that ξ_k ≠ 0. ∎

One important implication of Theorem 1 is that DA-MUSIC and SS-MUSIC share the same first-order error expression, despite the fact that R_v1 is constructed from second-order statistics, while R_v2 is constructed from fourth-order statistics. Theorem 1 enables a unified analysis of the MSEs of DA-MUSIC and SS-MUSIC, which we present in Theorem 2.

Theorem 2. Under the same assumptions as in Theorem 1, the asymptotic second-order statistics of the DOA estimation errors of DA-MUSIC and SS-MUSIC share the same form:

E[(θ̂_{k1} − θ_{k1})(θ̂_{k2} − θ_{k2})] ≐ R[ξ_{k1}^H (R ⊗ R^T) ξ_{k2}] / (N p_{k1} p_{k2} γ_{k1} γ_{k2}).    (10)

Proof: See Appendix C.

By Theorem 2, it is straightforward to write the unified asymptotic MSE expression as

ε(θ_k) = ξ_k^H (R ⊗ R^T) ξ_k / (N p_k² γ_k²).    (11)
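Expression (11) can be evaluated directly. The sketch below assembles F, Γ, and the quantities in (9a)–(9d) for the Fig. 1 co-prime array; the DOAs and powers are illustrative assumptions, the averaging normalization of F is one consistent choice satisfying z = F vec(R), and α_k of (9b) is treated as a vector:

```python
import numpy as np

pos = np.array([0, 2, 3, 4, 6, 9])            # Fig. 1 co-prime array, units of d0
M, Mv, d0 = len(pos), 8, 0.5                  # d0 in wavelengths (assumed)
theta = np.deg2rad([-20.0, 10.0, 35.0])       # example DOAs (assumed)
K, N = len(theta), 1000
p, sigma2 = np.ones(K), 1.0

phi = 2 * np.pi * d0 * np.sin(theta)
A = np.exp(1j * np.outer(pos, phi))
R = A @ np.diag(p) @ A.conj().T + sigma2 * np.eye(M)

# Coarray selection matrix F: averages the entries of vec(R) that share a lag.
diffs = (pos[:, None] - pos[None, :]).flatten(order="F")  # vec() is column-major
F = np.zeros((2 * Mv - 1, M * M))
for i, l in enumerate(range(-(Mv - 1), Mv)):
    mask = diffs == l
    F[i, mask] = 1.0 / mask.sum()

# Gamma of (9e): stack of the subarray selection matrices Gamma_Mv, ..., Gamma_1.
Gamma = np.vstack([np.eye(Mv, 2 * Mv - 1, k=i - 1) for i in range(Mv, 0, -1)])

n = np.arange(Mv)
Av = np.exp(1j * np.outer(n, phi))
dAv = 1j * n[:, None] * (2 * np.pi * d0 * np.cos(theta)) * Av  # (9f)
Av_pinv = np.linalg.pinv(Av)
Proj = np.eye(Mv) - Av @ Av_pinv                               # projection in (9c)

mses = []
for k in range(K):
    alpha = -Av_pinv[k, :]                    # (9b), taken as a vector
    beta = Proj @ dAv[:, k]                   # (9c)
    gamma = np.real(dAv[:, k].conj() @ Proj @ dAv[:, k])       # (9d)
    xi = F.T @ Gamma.T @ np.kron(beta, alpha)                  # (9a)
    mse = np.real(xi.conj() @ np.kron(R, R.T) @ xi) / (N * p[k] ** 2 * gamma ** 2)
    assert np.isfinite(mse) and mse > 0       # consistent with Proposition 1
    mses.append(mse)
```

The strict positivity asserted in the loop is exactly what Proposition 1 guarantees: since R ⊗ R^T is positive definite, ε(θ_k) can vanish only if ξ_k = 0.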

Therefore the MSE¹ depends on both the physical array geometry and the coarray geometry. The physical array geometry is captured by A, which appears in R ⊗ R^T. The coarray geometry is captured by A_v, which appears in ξ_k and γ_k. Therefore, even if two arrays share the same coarray geometry, they may not share the same MSE, because their physical array geometries may be different.

¹For brevity, when we use the acronym "MSE" in the following discussion, we refer to the asymptotic MSE ε(θ_k), unless explicitly stated.

It can be easily observed from (11) that ε(θ_k) → 0 as N → ∞. However, because p_k appears in both the denominator and the numerator of (11), it is not obvious how the MSE varies with respect to the source power p_k and the noise power σ_n². Let p̄_k = p_k/σ_n² denote the signal-to-noise ratio of the k-th source. Let P̄ = diag(p̄_1, p̄_2, ..., p̄_K), and R̄ = A P̄ A^H + I. We can then rewrite ε(θ_k) as

ε(θ_k) = ξ_k^H (R̄ ⊗ R̄^T) ξ_k / (N p̄_k² γ_k²).    (12)

Hence the MSE depends on the SNRs instead of the absolute values of p_k or σ_n². To provide an intuitive understanding of how the SNR affects the MSE, we consider the case when all sources have the same power. In this case, we show in Corollary 1 that the MSE asymptotically decreases as the SNR increases.

Corollary 1. Assume all sources have the same power p. Let p̄ = p/σ_n² denote the common SNR. Given sufficiently large N, the MSE ε(θ_k) decreases monotonically as p̄ increases, and

lim_{p̄→∞} ε(θ_k) = (1/(N γ_k²)) ‖ξ_k^H (A ⊗ A^*)‖₂².    (13)

Proof: The limiting expression can be derived straightforwardly from (12). For monotonicity, without loss of generality, let p = 1, so p̄ = 1/σ_n². Because f(x) = 1/x is monotonically decreasing on (0, ∞), it suffices to show that ε(θ_k) increases monotonically as σ_n² increases. Assume 0 < s_1 < s_2. We have

ε(θ_k)|_{σ_n²=s_2} − ε(θ_k)|_{σ_n²=s_1} = (1/(N γ_k²)) ξ_k^H Q ξ_k,

where Q = (s_2 − s_1)[(A A^H) ⊗ I + I ⊗ (A A^H)^T + (s_2 + s_1) I]. Because A A^H is positive semidefinite, both (A A^H) ⊗ I and I ⊗ (A A^H)^T are positive semidefinite. Combined with our assumption that 0 < s_1 < s_2, we conclude that Q is positive definite. By Proposition 1 we know that ξ_k ≠ 0. Therefore ξ_k^H Q ξ_k is strictly greater than zero, which implies that the MSE increases monotonically as σ_n² increases. ∎

Because both DA-MUSIC and SS-MUSIC also work in cases when the number of sources exceeds the number of sensors, we are particularly interested in their limiting performance in such cases. As shown in Corollary 2, when K ≥ M, the corresponding MSE is strictly greater than zero, even as the SNR approaches infinity. This corollary explains the "saturation" behavior of SS-MUSIC in the high SNR region observed in [8] and [9]. Another interesting implication of Corollary 2 is that when 2 ≤ K < M, the limiting MSE is not necessarily zero. Recall that in [2], it was shown that the MSE of the traditional MUSIC algorithm converges to zero as the SNR approaches infinity. We thus know that both DA-MUSIC and SS-MUSIC will be outperformed by traditional MUSIC in high SNR regions when 2 ≤ K < M. Therefore, we suggest using DA-MUSIC or SS-MUSIC only when K ≥ M.

Corollary 2. Following the same assumptions as in Corollary 1,
1) when K = 1, lim_{p̄→∞} ε(θ_k) = 0;
2) when 2 ≤ K < M, lim_{p̄→∞} ε(θ_k) ≥ 0;
3) when K ≥ M, lim_{p̄→∞} ε(θ_k) > 0.

Proof: The right-hand side of (13) can be expanded into

(1/(N γ_k²)) Σ_{m=1}^K Σ_{n=1}^K ‖ξ_k^H [a(θ_m) ⊗ a^*(θ_n)]‖₂².

By the definition of F, F[a(θ_m) ⊗ a^*(θ_m)] becomes

[e^{j(M_v−1)φ_m}, e^{j(M_v−2)φ_m}, ..., e^{−j(M_v−1)φ_m}]^T.

Hence Γ F[a(θ_m) ⊗ a^*(θ_m)] = a_v(θ_m) ⊗ a_v^*(θ_m). Observe that

ξ_k^H [a(θ_m) ⊗ a^*(θ_m)] = (β_k ⊗ α_k)^H (a_v(θ_m) ⊗ a_v^*(θ_m))
                          = (β_k^H a_v(θ_m)) (α_k^H a_v^*(θ_m))
                          = (ȧ_v^H(θ_k) Π⊥_{A_v} a_v(θ_m)) (α_k^H a_v^*(θ_m))
                          = 0.

We can therefore reduce the right-hand side of (13) to

(1/(N γ_k²)) Σ_{1≤m,n≤K, m≠n} ‖ξ_k^H [a(θ_m) ⊗ a^*(θ_n)]‖₂².

Therefore, when K = 1, the limiting expression is exactly zero. When 2 ≤ K < M, the limiting expression is not necessarily zero because, when m ≠ n, ξ_k^H [a(θ_m) ⊗ a^*(θ_n)] is not necessarily zero. When K ≥ M, A is full row rank, and hence A ⊗ A^* is also full row rank. By Proposition 1 we know that ξ_k ≠ 0, which implies that ε(θ_k) is strictly greater than zero. ∎

IV. THE CRAMÉR-RAO BOUND

The CRB for the unconditional model (2) has been well studied in [15], but only for the case in which the number of sources is less than the number of sensors and no prior knowledge of P is given. For the coarray model, the number of sources can exceed the number of sensors, and P is assumed to be diagonal. Therefore, the CRB derived in [15] cannot be directly applied. Based on [26, Appendix 15C], we provide an alternative CRB based on the signal model (2), under assumptions A1–A4.

For the signal model (2), the parameter vector is defined by

η = [θ_1, ..., θ_K, p_1, ..., p_K, σ_n²]^T,    (14)

and the (m, n)-th element of the Fisher information matrix (FIM) is given by [15], [26]

FIM_mn = N tr[ (∂R/∂η_m) R^{-1} (∂R/∂η_n) R^{-1} ].    (15)

Observe that tr(AB) = vec(A^T)^T vec(B), and that vec(AXB) = (B^T ⊗ A) vec(X). We can rewrite (15) as

FIM_mn = N [∂r/∂η_m]^H (R^T ⊗ R)^{-1} [∂r/∂η_n].

Denote the derivatives of r with respect to η by

∂r/∂η = [∂r/∂θ_1 ··· ∂r/∂θ_K  ∂r/∂p_1 ··· ∂r/∂p_K  ∂r/∂σ_n²].    (16)

The FIM can then be compactly expressed as

FIM = N [∂r/∂η]^H (R^T ⊗ R)^{-1} [∂r/∂η].    (17)

According to (4), we can compute the derivatives in (16) and obtain

∂r/∂η = [Ȧ_d P   A_d   i],    (18)

where Ȧ_d = Ȧ^* ⊙ A + A^* ⊙ Ȧ, A_d and i follow the same definitions as in (4), and

Ȧ = [∂a(θ_1)/∂θ_1  ∂a(θ_2)/∂θ_2  ···  ∂a(θ_K)/∂θ_K].

Note that (18) can be partitioned into two parts, namely the part corresponding to the DOAs and the part corresponding to the source and noise powers. We can also partition the FIM. Because R is positive definite, (R^T ⊗ R)^{-1} is also positive definite, and its square root (R^T ⊗ R)^{-1/2} exists. Let

M_θ = (R^T ⊗ R)^{-1/2} Ȧ_d P,
M_s = (R^T ⊗ R)^{-1/2} [A_d  i].

We can write the partitioned FIM as

FIM = N [ M_θ^H M_θ   M_θ^H M_s
          M_s^H M_θ   M_s^H M_s ].

The CRB matrix for the DOAs is then obtained by block-wise inversion:

CRB_θ = (1/N) (M_θ^H Π⊥_{M_s} M_θ)^{-1},    (19)

where Π⊥_{M_s} = I − M_s (M_s^H M_s)^{-1} M_s^H. It is worth noting that, unlike the classical CRB for the unconditional model introduced in [15, Remark 1], expression (19) is applicable even if the number of sources exceeds the number of sensors.

Remark 1. Similar to (11), CRB_θ depends on the SNRs instead of the absolute values of p_k or σ_n². Let p̄_k = p_k/σ_n², P̄ = diag(p̄_1, p̄_2, ..., p̄_K), and R̄ = A P̄ A^H + I. We have

M_θ = (R̄^T ⊗ R̄)^{-1/2} Ȧ_d P̄,    (20)
M_s = σ_n^{-2} (R̄^T ⊗ R̄)^{-1/2} [A_d  i].    (21)

Substituting (20) and (21) into (19), the term σ_n² cancels, and the resulting CRB_θ depends on the ratios p̄_k instead of the absolute values of p_k or σ_n².

Remark 2. The invertibility of the FIM depends on the coarray structure. In the noisy case, (R^T ⊗ R)^{-1} is always full rank, so the FIM is invertible if and only if ∂r/∂η is full column rank. By (18) we know that the rank of ∂r/∂η is closely related to A_d, the coarray steering matrix. Therefore CRB_θ is not valid for an arbitrary number of sources, because A_d may not be full column rank when too many sources are present.

Proposition 2. Assume all sources have the same power p, and ∂r/∂η is full column rank. Let p̄ = p/σ_n².
1) If K < M and lim_{p̄→∞} CRB_θ exists, it is zero under mild conditions.
2) If K ≥ M and lim_{p̄→∞} CRB_θ exists, it is positive definite.

Proof: See Appendix D.

While infinite SNR is unachievable from a practical standpoint, Proposition 2 has useful theoretical implications. When 2 ≤ K < M, the limiting MSE (13) in Corollary 1 is not necessarily zero. However, Proposition 2 reveals that the CRB may approach zero as the SNR goes to infinity. This observation implies that both DA-MUSIC and SS-MUSIC may have poor statistical efficiency in high SNR regions. When K ≥ M, Proposition 2 implies that the CRB of each DOA converges to a positive constant, which is consistent with Corollary 2.

V. NUMERICAL ANALYSIS

In this section, we numerically analyze DA-MUSIC and SS-MUSIC by utilizing (11) and (19). We first verify the MSE expression (10) introduced in Theorem 2 through Monte Carlo simulations. We then examine the application of (8) in predicting the resolvability of two closely placed sources, and analyze the asymptotic efficiency of both estimators from various aspects. Finally, we investigate how the number of sensors affects the asymptotic MSE.

In all experiments, we define the signal-to-noise ratio (SNR) as

SNR = 10 log₁₀ ( min_{k=1,2,...,K} p_k / σ_n² ),

where K is the number of sources.

Throughout Sections V-A, V-B and V-C, we consider the following three different types of linear arrays with the following sensor configurations:
• Co-prime Array [5]: [0, 3, 5, 6, 9, 10, 12, 15, 20, 25]λ/2
• Nested Array [9]: [1, 2, 3, 4, 5, 10, 15, 20, 25, 30]λ/2
• MRA [27]: [0, 1, 4, 10, 16, 22, 28, 30, 33, 35]λ/2
All three arrays share the same number of sensors, but different apertures.

A. Numerical Verification

We first verify (11) via numerical simulations. We consider 11 sources with equal power, evenly placed between −67.50° and 56.25°, which is more than the number of sensors. We compare the difference between the analytical MSE and the empirical MSE under different combinations of SNR and number of snapshots. The analytical MSE is defined by

MSE_an = (1/K) Σ_{k=1}^K ε(θ_k),

and the empirical MSE is defined by

MSE_em = (1/(KL)) Σ_{l=1}^L Σ_{k=1}^K (θ̂_k^(l) − θ_k^(l))²,

where θ_k^(l) is the k-th DOA in the l-th trial, and θ̂_k^(l) is the corresponding estimate.

Fig. 2. |MSE_an − MSE_em|/MSE_em for different types of arrays under different numbers of snapshots and different SNRs.

Fig. 2 illustrates the relative errors between MSE_an and MSE_em obtained from 10,000 trials under various scenarios. It can be observed that MSE_em and MSE_an agree very well given enough snapshots and a sufficiently high SNR. It should be noted that at 0dB SNR, (8) is quite accurate when 250 snapshots are available. In addition, there is no significant difference between the relative errors obtained from DA-MUSIC and those from SS-MUSIC. These observations are consistent with our assumptions, and verify Theorem 1 and Theorem 2.

We observe that in some of the low SNR regions, |MSE_an − MSE_em|/MSE_em appears to be smaller even if the number of snapshots is limited. In such regions, MSE_em actually "saturates", and MSE_an happens to be close to the saturated value. Therefore, this observation does not imply that (11) is valid in such regions.

B. Prediction of Resolvability

One direct application of Theorem 2 is predicting the resolvability of two closely located sources. We consider two sources with equal power, located at θ_1 = 30° − ∆θ/2 and θ_2 = 30° + ∆θ/2, where ∆θ varies from 0.3° to 3.0°. We say the two sources are correctly resolved if the MUSIC algorithm is able to identify two sources, and the two estimated DOAs satisfy |θ̂_i − θ_i| < ∆θ/2 for i ∈ {1, 2}. The probability of resolution is computed from 500 trials. For all trials, the number of snapshots is fixed at 500, the SNR is set to 0dB, and SS-MUSIC is used.

For illustration purposes, we analytically predict the resolvability of the two sources via the following simple criterion:

ε(θ_1) + ε(θ_2)  ≷  ∆θ   (unresolvable if greater, resolvable if smaller).    (22)

Readers are directed to [28] for a more comprehensive criterion. Fig. 3 illustrates the resolution performance of the three arrays under different ∆θ, as well as the thresholds predicted by (22). The MRA shows the best resolution performance of the three arrays, which can be explained by the fact that the MRA has the largest aperture. The co-prime array, with the smallest aperture, shows the worst resolution performance. Despite the differences in resolution performance, the probability of resolution of each array drops to nearly zero at the predicted thresholds. This confirms that (11) provides a convenient way of predicting the resolvability of two close sources.

Fig. 3. Probability of resolution vs. source separation, obtained from 500 trials. The number of snapshots is fixed at 500, and the SNR is set to 0dB.

C. Asymptotic Efficiency Study

In this section, we utilize (11) and (19) to study the asymptotic statistical efficiency of DA-MUSIC and SS-MUSIC under different array geometries and parameter settings. We define their average efficiency as

κ = tr CRB_θ / Σ_{k=1}^K ε(θ_k).    (23)

For efficient estimators we expect κ = 1, while for inefficient estimators we expect 0 ≤ κ < 1.

We first compare the κ values under different SNRs for the three different arrays. We consider three cases: K = 1, K = 6, and K = 12. The K sources are located at {−60° + [120(k − 1)/(K − 1)]° | k = 1, 2, ..., K}, and all sources have the same power. As shown in Fig.
4(a), when only one source is present, κ increases as the SNR increases for all three arrays. However, none of the arrays leads to efficient DOA estimation. Interestingly, despite being the least efficient geometry in the low SNR region, the co-prime array achieves higher efficiency than the nested array in the high SNR region. When K = 6, we can observe in Fig. 4(b) that κ decreases to zero as the SNR increases. This rather surprising behavior suggests that both DA-MUSIC and SS-MUSIC are not statistically efficient methods for DOA estimation when the number of sources is greater than one and less than the number of sensors. It is consistent with the implication of Proposition 2 when K < M. When K = 12, the number of sources exceeds the number of sensors. We can observe in Fig. 4(c) that κ also decreases as the SNR increases. However, unlike the case when K = 6, κ converges to a positive value instead of zero.

The above observations imply that DA-MUSIC and SS-MUSIC achieve higher degrees of freedom at the cost of decreased statistical efficiency. When statistical efficiency is a concern and the number of sources is less than the number of sensors, one might consider applying MUSIC directly to the original sample covariance R defined in (3) [29].

Fig. 4. Average efficiency vs. SNR: (a) K = 1, (b) K = 6, (c) K = 12.

The empirical results in the angular separation study are obtained from 1000 trials. To satisfy the asymptotic assumption, the number of snapshots is fixed at 1000 for each trial. As shown in Fig. 5(a)–5(c), the overall statistical efficiency decreases as the SNR increases from 0dB to 10dB for all three arrays, which is consistent with our previous observation in Fig. 4(b). We can also observe that the relationship between κ and the normalized angular separation ∆θ/π is rather complex, as opposed to the traditional MUSIC algorithm (cf. [2]). The statistical efficiency of DA-MUSIC and SS-MUSIC is highly dependent on the array geometry and the angular separation.

Fig. 5. Average efficiency vs. angular separation: (a) MRA, (b) nested array, (c) co-prime array. The solid lines and dashed lines are analytical values obtained from (23). The circles and crosses are empirical results averaged from 1000 trials.
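As a small-scale companion to these simulations, the following sketch runs SS-MUSIC on the exact covariance (the infinite-snapshot limit) of the Fig. 1 array, resolving K = 7 sources with only M = 6 physical sensors; all numerical values are illustrative assumptions rather than the paper's experimental settings:

```python
import numpy as np

pos = np.array([0, 2, 3, 4, 6, 9])             # Fig. 1 array, units of d0
M, Mv, d0 = 6, 8, 0.5                          # d0 in wavelengths (assumed)
theta = np.deg2rad([-50.0, -30.0, -10.0, 5.0, 20.0, 40.0, 60.0])
K = len(theta)                                 # K = 7 > M = 6
phi = 2 * np.pi * d0 * np.sin(theta)
A = np.exp(1j * np.outer(pos, phi))
R = A @ A.conj().T + 0.1 * np.eye(M)           # unit powers, sigma_n^2 = 0.1

# Lag-average the exact R into the virtual-ULA vector z, then smooth as in (6b).
diffs = pos[:, None] - pos[None, :]
z = np.array([R[diffs == l].mean() for l in range(-(Mv - 1), Mv)])
Rv2 = sum(np.outer(z[i:i + Mv], z[i:i + Mv].conj()) for i in range(Mv)) / Mv

# MUSIC on Rv2: the noise subspace is spanned by the Mv - K weakest eigenvectors.
eigvals, V = np.linalg.eigh(Rv2)               # eigh returns ascending order
En = V[:, :Mv - K]
grid = np.deg2rad(np.linspace(-80, 80, 2001))
Ag = np.exp(1j * np.outer(np.arange(Mv), 2 * np.pi * d0 * np.sin(grid)))
spec = 1.0 / np.sum(np.abs(En.conj().T @ Ag) ** 2, axis=0)

# Take the K largest local maxima of the pseudospectrum as DOA estimates.
is_peak = (spec[1:-1] > spec[:-2]) & (spec[1:-1] > spec[2:])
peaks = np.where(is_peak)[0] + 1
est = np.sort(grid[peaks[np.argsort(spec[peaks])[-K:]]])
assert np.max(np.abs(est - theta)) < np.deg2rad(0.5)
```

Because the exact covariance is used, the noise subspace is exactly orthogonal to the virtual steering vectors and the peaks land at the true DOAs up to the grid resolution; with finite snapshots the errors instead follow the asymptotic MSE (11) studied above.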

Next, we analyze how κ is affected by angular separation. Two sources located at −∆θ and ∆θ are considered. We compute the κ values under different choices of ∆θ for all three arrays. For reference, we also include the empirical results obtained from 1000 trials.

D. MSE vs. Number of Sensors

In this section, we investigate how the number of sensors affects the asymptotic MSE ϵ(θk). We consider three types of sparse linear arrays: co-prime arrays, nested arrays, and MRAs. In this experiment, the co-prime arrays are generated by co-prime pairs (q, q + 1) for q = 2, 3, ..., 12. The nested arrays are generated by parameter pairs (q + 1, q) for q = 2, 3, ..., 12. The MRAs are constructed according to [27]. We consider two cases: the one-source case where K = 1, and the underdetermined case where K = M. For the former case, we placed the only source at 0°. For the latter case, we placed the sources uniformly between −60° and 60°. We set SNR = 0 dB and N = 1000. The empirical MSEs were obtained from 500 trials, and SS-MUSIC was used in all the trials.

Fig. 6. MSE vs. M: (a) K = 1, (b) K = M. The solid lines are analytical results. The "+", "◦", and "□" markers denote empirical results obtained from 500 trials. The dashed lines are trend lines used for comparison. [Plots omitted; the trend lines indicate O(M^−4.5) in (a) and O(M^−3.5) in (b).]

In Fig. 6(a), we observe that when K = 1, the MSE decreases at a rate of approximately O(M^−4.5) for all three arrays. In Fig. 6(b), we observe that when K = M, the MSE only decreases at a rate of approximately O(M^−3.5). In both cases, the MRAs and the nested arrays achieve lower MSE than the co-prime arrays. Another interesting observation is that for all three arrays the MSE decreases faster than O(M^−3). Recall that for an M-sensor ULA, the asymptotic MSE of traditional MUSIC decreases at a rate of O(M^−3) as M → ∞ [2]. This observation suggests that, given the same number of sensors, these sparse linear arrays can achieve higher estimation accuracy than ULAs when the number of sensors is large.

VI. CONCLUSION

In this paper, we reviewed the coarray signal model and derived the asymptotic MSE expression for two coarray-based MUSIC algorithms, namely DA-MUSIC and SS-MUSIC. We theoretically proved that the two MUSIC algorithms share the same asymptotic MSE expression. Our analytical MSE expression is more revealing and can be applied to various types of sparse linear arrays, such as co-prime arrays, nested arrays, and MRAs. In addition, our MSE expression is also valid when the number of sources exceeds the number of sensors. We also derived the CRB for sparse linear arrays, and analyzed the statistical efficiency of typical sparse linear arrays. Our results will benefit future research on performance analysis and optimal design of sparse linear arrays. Throughout our derivations, we assume the array is perfectly calibrated. In the future, it will be interesting to extend the results in this paper to cases when model errors are present. Additionally, we will further investigate how the number of sensors affects the MSE and the CRB for sparse linear arrays, as well as the possibility of deriving closed-form expressions in the case of a large number of sensors.

APPENDIX A
DEFINITION AND PROPERTIES OF THE COARRAY SELECTION MATRIX

According to (3),

    R_mn = Σ_{k=1}^{K} p_k exp[j(d̄_m − d̄_n)φ_k] + δ_mn σ_n²,

where δ_mn denotes Kronecker's delta. This equation implies that the (m, n)-th element of R is associated with the difference (d̄_m − d̄_n). To capture this property, we introduce the difference matrix ∆ such that ∆_mn = d̄_m − d̄_n. We also define the weight function ω(n): Z → Z as (see [9] for details)

    ω(l) = |{(m, n) | ∆_mn = l}|,

where |A| denotes the cardinality of the set A. Intuitively, ω(l) counts the number of all possible pairs (d̄_m, d̄_n) such that d̄_m − d̄_n = l. Clearly, ω(l) = ω(−l).

Definition 1. The coarray selection matrix F is a (2Mv − 1) × M² matrix satisfying

    F_{m, p+(q−1)M} = 1/ω(m − Mv),  if ∆_pq = m − Mv,
                      0,             otherwise,            (24)

for m = 1, 2, ..., 2Mv − 1, p = 1, 2, ..., M, q = 1, 2, ..., M.

To better illustrate the construction of F, we consider a toy array whose sensor locations are given by {0, d0, 4d0}. The corresponding difference matrix of this array is

    ∆ = [ 0  −1  −4
          1   0  −3
          4   3   0 ].

The ULA part of the difference coarray consists of three sensors located at −d0, 0, and d0. The weight function satisfies ω(−1) = ω(1) = 1 and ω(0) = 3, so Mv = 2. We can write the coarray selection matrix as

    F = [  0   0   0   1    0   0   0   0    0
          1/3  0   0   0   1/3  0   0   0   1/3
           0   1   0   0    0   0   0   0    0 ].

If we pre-multiply the vectorized sample covariance matrix r by F, we obtain the observation vector of the virtual ULA (defined in (5)):

    z = [ z1 ]   [ R12                  ]
        [ z2 ] = [ (R11 + R22 + R33)/3  ]
        [ z3 ]   [ R21                  ].

It can be seen that z_m is obtained by averaging all the elements of R that correspond to the difference m − Mv, for m = 1, 2, ..., 2Mv − 1. Based on Definition 1, we now derive several useful properties of F.
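The toy construction above translates directly into code. The sketch below builds ∆, ω, and F per (24) for the hypothetical three-sensor array {0, d0, 4d0} (0-based indices; the single-source covariance model at the end is illustrative only).

```python
import numpy as np

d = np.array([0, 1, 4])             # toy array {0, d0, 4d0}, in units of d0
M = len(d)
Delta = d[:, None] - d[None, :]     # difference matrix, Delta[p, q] = d_p - d_q

def omega(lag):
    """Weight function: number of sensor pairs whose difference equals lag."""
    return int(np.count_nonzero(Delta == lag))

# Mv is the size of the central contiguous (ULA) part of the difference coarray
Mv = 1
while omega(Mv) > 0:
    Mv += 1                         # here: omega(1) = 1, omega(2) = 0, so Mv = 2

# coarray selection matrix F per (24)
F = np.zeros((2 * Mv - 1, M * M))
for m in range(2 * Mv - 1):
    lag = m - (Mv - 1)
    for p in range(M):
        for q in range(M):
            if Delta[p, q] == lag:
                F[m, p + q * M] = 1.0 / omega(lag)

# z = F vec(R) averages all entries of R sharing the same lag,
# e.g. the middle entry equals (R11 + R22 + R33) / 3
a = np.exp(1j * 0.7 * d)                         # arbitrary single-source model
R = np.outer(a, a.conj()) + 0.1 * np.eye(M)
z = F @ R.reshape(-1, order="F")
print(Mv)                 # 2
print(F[1, [0, 4, 8]])    # diagonal entries weighted by 1/3
```

Note that vec(·) stacks columns, so the (p, q) entry of R maps to index p + qM, matching the 1-based index p + (q − 1)M in (24).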

Lemma 1. F_{m,p+(q−1)M} = F_{2Mv−m,q+(p−1)M} for m = 1, 2, ..., 2Mv − 1, p = 1, 2, ..., M, q = 1, 2, ..., M.

Proof: If F_{m,p+(q−1)M} = 0, then ∆_pq ≠ m − Mv. Because ∆_qp = −∆_pq, we have ∆_qp ≠ −(m − Mv). Hence (2Mv − m) − Mv = −(m − Mv) ≠ ∆_qp, which implies that F_{2Mv−m,q+(p−1)M} is also zero.

If F_{m,p+(q−1)M} ≠ 0, then ∆_pq = m − Mv and F_{m,p+(q−1)M} = 1/ω(m − Mv). Note that (2Mv − m) − Mv = −(m − Mv) = −∆_pq = ∆_qp. We thus have F_{2Mv−m,q+(p−1)M} = 1/ω(−(m − Mv)) = 1/ω(m − Mv) = F_{m,p+(q−1)M}. ∎

Lemma 2. Let R ∈ C^{M×M} be Hermitian. Then z = F vec(R) is conjugate symmetric.

Proof: By Lemma 1 and R = R^H,

    z_m = Σ_{p=1}^{M} Σ_{q=1}^{M} F_{m,p+(q−1)M} R_pq
        = Σ_{q=1}^{M} Σ_{p=1}^{M} F_{2Mv−m,q+(p−1)M} R*_qp
        = z*_{2Mv−m}. ∎

Lemma 3. Let z ∈ C^{2Mv−1} be conjugate symmetric. Then mat_{M,M}(F^T z) is Hermitian.

Proof: Let H = mat_{M,M}(F^T z). Then

    H_pq = Σ_{m=1}^{2Mv−1} z_m F_{m,p+(q−1)M}.    (25)

We know that z is conjugate symmetric, so z_m = z*_{2Mv−m}. Therefore, by Lemma 1,

    H_pq = Σ_{m=1}^{2Mv−1} z*_{2Mv−m} F_{2Mv−m,q+(p−1)M}
         = [ Σ_{m'=1}^{2Mv−1} z_{m'} F_{m',q+(p−1)M} ]*    (26)
         = H*_qp. ∎

APPENDIX B
PROOF OF THEOREM 1

We first derive the first-order error expression of DA-MUSIC. Denote the eigendecomposition of R_v1 by

    R_v1 = E_s Λ_s1 E_s^H + E_n Λ_n1 E_n^H,

where E_s and E_n are the eigenvectors of the signal subspace and noise subspace, respectively, and Λ_s1, Λ_n1 are the corresponding eigenvalues. Specifically, we have Λ_n1 = σ_n² I.

Let R̃_v1 = R_v1 + ∆R_v1, Ẽ_n1 = E_n + ∆E_n1, and Λ̃_n1 = Λ_n1 + ∆Λ_n1 be the perturbed versions of R_v1, E_n, and Λ_n1. The following equality holds:

    (R_v1 + ∆R_v1)(E_n + ∆E_n1) = (E_n + ∆E_n1)(Λ_n1 + ∆Λ_n1).

If the perturbation is small, we can omit high-order terms and obtain [16], [23], [25]

    A_v^H ∆E_n1 ≐ −P^{−1} A_v^† ∆R_v1 E_n.    (27)

Because P is diagonal, for a specific θ_k we have

    a_v^H(θ_k) ∆E_n1 ≐ −p_k^{−1} e_k^T A_v^† ∆R_v1 E_n,    (28)

where e_k is the k-th column of the identity matrix I_{K×K}. Based on the conclusion in Appendix B of [2], under sufficiently small perturbations, the error expression of DA-MUSIC for the k-th DOA is given by

    θ̂_k^(1) − θ_k ≐ − R[a_v^H(θ_k) ∆E_n1 E_n^H ȧ_v(θ_k)] / [ȧ_v^H(θ_k) E_n E_n^H ȧ_v(θ_k)],    (29)

where ȧ_v(θ_k) = ∂a_v(θ_k)/∂θ_k. Substituting (28) into (29) gives

    θ̂_k^(1) − θ_k ≐ − R[e_k^T A_v^† ∆R_v1 E_n E_n^H ȧ_v(θ_k)] / [p_k ȧ_v^H(θ_k) E_n E_n^H ȧ_v(θ_k)].    (30)

Because vec(AXB) = (B^T ⊗ A) vec(X) and E_n E_n^H = Π⊥_Av, we can use the notations introduced in (9b)–(9d) to express (30) as

    θ̂_k^(1) − θ_k ≐ −(γ_k p_k)^{−1} R[(β_k ⊗ α_k)^T ∆r_v1],    (31)

where ∆r_v1 = vec(∆R_v1).

Note that R̃_v1 is constructed from R̃. It follows that ∆R_v1 actually depends on ∆R, which is the perturbation part of the covariance matrix R. By the definition of R_v1,

    ∆r_v1 = vec([Γ_Mv ∆z  ···  Γ_2 ∆z  Γ_1 ∆z]) = Γ F ∆r,

where Γ = [Γ_Mv^T Γ_{Mv−1}^T ··· Γ_1^T]^T and ∆r = vec(∆R). Let ξ_k = F^T Γ^T (β_k ⊗ α_k). We can now express (31) in terms of ∆r as

    θ̂_k^(1) − θ_k ≐ −(γ_k p_k)^{−1} R(ξ_k^T ∆r),    (32)

which completes the first part of the proof.

We next consider the first-order error expression of SS-MUSIC. From (7) we know that R_v2 shares the same eigenvectors as R_v1. Hence the eigendecomposition of R_v2 can be expressed as

    R_v2 = E_s Λ_s2 E_s^H + E_n Λ_n2 E_n^H,

where Λ_s2 and Λ_n2 are the eigenvalues associated with the signal subspace and noise subspace. Specifically, we have Λ_n2 = (σ_n^4/Mv) I. Note that R_v2 = (A_v P A_v^H + σ_n² I)²/Mv. Following a similar approach to the one we used to obtain (27), we get

    A_v^H ∆E_n2 ≐ −Mv P^{−1} (P A_v^H A_v + 2σ_n² I)^{−1} A_v^† ∆R_v2 E_n,

where ∆E_n2 is the perturbation of the noise eigenvectors produced by ∆R_v2. After omitting high-order terms, ∆R_v2 is given by

    ∆R_v2 ≐ (1/Mv) Σ_{k=1}^{Mv} (z_k ∆z_k^H + ∆z_k z_k^H).

According to [9], each subarray observation vector z_k can be expressed as

    z_k = A_v Ψ^{Mv−k} p + σ_n² i_{Mv−k+1},    (33)

for k = 1, 2, ..., Mv, where i_l is a vector of length Mv whose elements are zero except for the l-th element being one, and Ψ = diag(e^{−jφ_1}, e^{−jφ_2}, ..., e^{−jφ_K}). Observe that

    Σ_{k=1}^{Mv} σ_n² i_{Mv−k+1} ∆z_k^H = σ_n² ∆R_v1^H,

and

    Σ_{k=1}^{Mv} A_v Ψ^{Mv−k} p ∆z_k^H
      = A_v P [ e^{−j(Mv−1)φ_1}  e^{−j(Mv−2)φ_1}  ···  1
                e^{−j(Mv−1)φ_2}  e^{−j(Mv−2)φ_2}  ···  1
                   ···              ···           ···  ···
                e^{−j(Mv−1)φ_K}  e^{−j(Mv−2)φ_K}  ···  1 ] [ ∆z_1^H ; ∆z_2^H ; ··· ; ∆z_Mv^H ]
      = A_v P (T_Mv A_v)^H T_Mv ∆R_v1
      = A_v P A_v^H ∆R_v1,

where T_Mv is the Mv × Mv permutation matrix whose anti-diagonal elements are one and whose remaining elements are zero. Because ∆R = ∆R^H, by Lemma 2 we know that ∆z is conjugate symmetric. According to the definition of R_v1, it is straightforward to show that ∆R_v1 = ∆R_v1^H also holds. Hence

    ∆R_v2 ≐ (1/Mv) [(A_v P A_v^H + 2σ_n² I) ∆R_v1 + ∆R_v1 A_v P A_v^H].

Substituting ∆R_v2 into the expression of A_v^H ∆E_n2, and utilizing the property that A_v^H E_n = 0, we can express A_v^H ∆E_n2 as

    −P^{−1} (P A_v^H A_v + 2σ_n² I)^{−1} A_v^† (A_v P A_v^H + 2σ_n² I) ∆R_v1 E_n.

Observe that

    A_v^† (A_v P A_v^H + 2σ_n² I) = (A_v^H A_v)^{−1} A_v^H (A_v P A_v^H + 2σ_n² I)
                                  = P A_v^H + 2σ_n² (A_v^H A_v)^{−1} A_v^H
                                  = (P A_v^H A_v + 2σ_n² I) A_v^†.

Hence the term (P A_v^H A_v + 2σ_n² I) is canceled and we obtain

    A_v^H ∆E_n2 ≐ −P^{−1} A_v^† ∆R_v1 E_n,    (34)

which coincides with the first-order error expression of A_v^H ∆E_n1. ∎

APPENDIX C
PROOF OF THEOREM 2

Before proceeding to the main proof, we introduce the following definition.

Definition 2. Let A = [a_1 a_2 ··· a_N] ∈ R^{N×N} and B = [b_1 b_2 ··· b_N] ∈ R^{N×N}. The structured matrix C_AB ∈ R^{N²×N²} is defined as

    C_AB = [ a_1 b_1^T  a_2 b_1^T  ···  a_N b_1^T
             a_1 b_2^T  a_2 b_2^T  ···  a_N b_2^T
               ···        ···      ···    ···
             a_1 b_N^T  a_2 b_N^T  ···  a_N b_N^T ].

We now start deriving the explicit MSE expression. According to (32),

    E[(θ̂_k1 − θ_k1)(θ̂_k2 − θ_k2)]
      = (γ_k1 p_k1)^{−1} (γ_k2 p_k2)^{−1} E[R(ξ_k1^T ∆r) R(ξ_k2^T ∆r)]
      = (γ_k1 p_k1)^{−1} (γ_k2 p_k2)^{−1} { R(ξ_k1)^T E[R(∆r) R(∆r)^T] R(ξ_k2)
          + I(ξ_k1)^T E[I(∆r) I(∆r)^T] I(ξ_k2)
          − R(ξ_k1)^T E[R(∆r) I(∆r)^T] I(ξ_k2)
          − R(ξ_k2)^T E[R(∆r) I(∆r)^T] I(ξ_k1) },    (35)

where we used the property that R(AB) = R(A) R(B) − I(A) I(B) for two complex matrices A and B with proper dimensions.

To obtain the closed-form expression for (35), we need to compute the four expectations. It should be noted that in the case of finite snapshots, ∆r does not follow a circularly-symmetric complex Gaussian distribution. Therefore we cannot directly use the properties of the circularly-symmetric complex Gaussian distribution to evaluate the expectations. For brevity, we demonstrate the computation of only the first expectation in (35). The computation of the remaining three expectations follows the same idea.

Let r_i denote the i-th column of R in (3). Its estimate, r̂_i, is given by (1/N) Σ_{t=1}^{N} y(t) y_i*(t), where y_i(t) is the i-th element of y(t). Because E[r̂_i] = r_i,

    E[R(∆r_i) R(∆r_l)^T] = E[R(r̂_i) R(r̂_l)^T] − R(r_i) R(r_l)^T.    (36)

The second term in (36) is deterministic, and the first term in (36) can be expanded into

    (1/N²) E[ R( Σ_{s=1}^{N} y(s) y_i*(s) ) R( Σ_{t=1}^{N} y(t) y_l*(t) )^T ]
      = (1/N²) Σ_{s=1}^{N} Σ_{t=1}^{N} E{ [R(y(s)) R(y_i*(s)) − I(y(s)) I(y_i*(s))]
                                          [R(y(t)) R(y_l*(t)) − I(y(t)) I(y_l*(t))]^T }
      = (1/N²) Σ_{s=1}^{N} Σ_{t=1}^{N} { E[R(y(s)) R(y_i(s)) R(y(t))^T R(y_l(t))]
          + E[R(y(s)) R(y_i(s)) I(y(t))^T I(y_l(t))]
          + E[I(y(s)) I(y_i(s)) R(y(t))^T R(y_l(t))]
          + E[I(y(s)) I(y_i(s)) I(y(t))^T I(y_l(t))] }.    (37)

We first consider the partial sum of the cases when s ≠ t. By A4, y(s) and y(t) are uncorrelated Gaussians. Recall that for x ∼ CN(0, Σ),

    E[R(x) R(x)^T] = (1/2) R(Σ),    E[R(x) I(x)^T] = −(1/2) I(Σ),
    E[I(x) R(x)^T] = (1/2) I(Σ),    E[I(x) I(x)^T] = (1/2) R(Σ).

We have

    E[R(y(s)) R(y_i(s)) R(y(t))^T R(y_l(t))]
      = E[R(y(s)) R(y_i(s))] E[R(y(t)) R(y_l(t))]^T
      = (1/4) R(r_i) R(r_l)^T.

Similarly, we can obtain that when s ≠ t,

    E[R(y(s)) R(y_i(s)) I(y(t))^T I(y_l(t))] = (1/4) R(r_i) R(r_l)^T,
    E[I(y(s)) I(y_i(s)) R(y(t))^T R(y_l(t))] = (1/4) R(r_i) R(r_l)^T,
    E[I(y(s)) I(y_i(s)) I(y(t))^T I(y_l(t))] = (1/4) R(r_i) R(r_l)^T.    (38)

Therefore the partial sum over the cases when s ≠ t is given by (1 − 1/N) R(r_i) R(r_l)^T.

We now consider the partial sum of the cases when s = t. We first consider the first expectation inside the double summation in (37). Recall that for x ∼ N(0, Σ), E[x_i x_l x_p x_q] = σ_il σ_pq + σ_ip σ_lq + σ_iq σ_lp. We can express the (m, n)-th element of the matrix E[R(y(t)) R(y_i(t)) R(y(t))^T R(y_l(t))] as

    E[R(y_m(t)) R(y_i(t)) R(y_n(t)) R(y_l(t))]
      = E[R(y_m(t)) R(y_i(t))] E[R(y_l(t)) R(y_n(t))]
        + E[R(y_m(t)) R(y_l(t))] E[R(y_i(t)) R(y_n(t))]
        + E[R(y_m(t)) R(y_n(t))] E[R(y_i(t)) R(y_l(t))]
      = (1/4) [R(R_mi) R(R_ln) + R(R_ml) R(R_in) + R(R_mn) R(R_il)].

Hence

    E[R(y(t)) R(y_i(t)) R(y(t))^T R(y_l(t))]
      = (1/4) [R(r_i) R(r_l)^T + R(r_l) R(r_i)^T + R(R) R(R_il)].

Similarly, we obtain

    E[I(y(t)) I(y_i(t)) I(y(t))^T I(y_l(t))]
      = (1/4) [R(r_i) R(r_l)^T + R(r_l) R(r_i)^T + R(R) R(R_il)],

    E[R(y(t)) R(y_i(t)) I(y(t))^T I(y_l(t))]
      = E[I(y(t)) I(y_i(t)) R(y(t))^T R(y_l(t))]
      = (1/4) [R(r_i) R(r_l)^T − I(r_l) I(r_i)^T + I(R) I(R_il)].

Therefore the partial sum of the cases when s = t is given by (1/N) R(r_i) R(r_l)^T + (1/2N) [R(R) R(R_il) + I(R) I(R_il) + R(r_l) R(r_i)^T − I(r_l) I(r_i)^T]. Combined with the previous partial sum of the cases when s ≠ t, we obtain

    E[R(∆r_i) R(∆r_l)^T]
      = (1/2N) [R(R) R(R_il) + I(R) I(R_il) + R(r_l) R(r_i)^T − I(r_l) I(r_i)^T].    (39)

Therefore

    E[R(∆r) R(∆r)^T]
      = (1/2N) [R(R) ⊗ R(R) + I(R) ⊗ I(R) + C_{R(R) R(R)} − C_{I(R) I(R)}],    (40)

which completes the computation of the first expectation in (35). Utilizing the same technique, we obtain

    E[I(∆r) I(∆r)^T]
      = (1/2N) [R(R) ⊗ R(R) + I(R) ⊗ I(R) + C_{I(R) I(R)} − C_{R(R) R(R)}],    (41)

and

    E[R(∆r) I(∆r)^T]
      = (1/2N) [I(R) ⊗ R(R) − R(R) ⊗ I(R) + C_{R(R) I(R)} + C_{I(R) R(R)}].    (42)

Substituting (40)–(42) into (35) gives a closed-form MSE expression. However, this expression is too complicated for analytical study. In the following steps, we make use of the properties of ξ_k to simplify the MSE expression.

Lemma 4. Let X, Y, A, B ∈ R^{N×N} satisfy X^T = (−1)^{n_x} X, A^T = (−1)^{n_a} A, and B^T = (−1)^{n_b} B, where n_x, n_a, n_b ∈ {0, 1}. Then

    vec(X)^T (A ⊗ B) vec(Y) = (−1)^{n_x + n_b} vec(X)^T C_AB vec(Y),
    vec(X)^T (B ⊗ A) vec(Y) = (−1)^{n_x + n_a} vec(X)^T C_BA vec(Y).

Proof: By Definition 2,

    vec(X)^T C_AB vec(Y)
      = Σ_{m=1}^{N} Σ_{n=1}^{N} x_m^T a_n b_m^T y_n
      = Σ_{m=1}^{N} Σ_{n=1}^{N} ( Σ_{p=1}^{N} A_pn X_pm ) ( Σ_{q=1}^{N} B_qm Y_qn )
      = Σ_{m,n,p,q} A_pn X_pm B_qm Y_qn
      = (−1)^{n_x + n_b} Σ_{p,n} ( Σ_{m,q} X_mp B_mq Y_qn ) A_pn
      = (−1)^{n_x + n_b} Σ_{p,n} x_p^T A_pn B y_n
      = (−1)^{n_x + n_b} vec(X)^T (A ⊗ B) vec(Y).

The proof of the second equality follows the same idea. ∎

Lemma 5. T_Mv Π⊥_Av T_Mv = (Π⊥_Av)*.

Proof: Since Π⊥_Av = I − A_v (A_v^H A_v)^{−1} A_v^H, it suffices to show that T_Mv A_v (A_v^H A_v)^{−1} A_v^H T_Mv = (A_v (A_v^H A_v)^{−1} A_v^H)*. Because A_v is the steering matrix of a ULA with Mv sensors, it is straightforward to show that T_Mv A_v = (A_v Φ)*, where Φ = diag(e^{−j(Mv−1)φ_1}, e^{−j(Mv−1)φ_2}, ..., e^{−j(Mv−1)φ_K}). Because T_Mv T_Mv = I and T_Mv^H = T_Mv,

    T_Mv A_v (A_v^H A_v)^{−1} A_v^H T_Mv
      = T_Mv A_v (A_v^H T_Mv T_Mv A_v)^{−1} A_v^H T_Mv
      = (A_v Φ)* ((A_v Φ)^T (A_v Φ)*)^{−1} (A_v Φ)^T
      = (A_v (A_v^H A_v)^{−1} A_v^H)*. ∎
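Definition 2 and Lemma 4 are easy to verify numerically. The sketch below builds C_AB block-by-block (block (i, j) equals a_j b_i^T) and checks both sign cases with random symmetric/antisymmetric matrices; all dimensions and values are arbitrary.

```python
import numpy as np

def c_matrix(A, B):
    """Structured matrix C_AB from Definition 2: block (i, j) equals a_j b_i^T."""
    N = A.shape[0]
    C = np.zeros((N * N, N * N))
    for i in range(N):
        for j in range(N):
            C[i * N:(i + 1) * N, j * N:(j + 1) * N] = np.outer(A[:, j], B[:, i])
    return C

rng = np.random.default_rng(1)
N = 3
X = rng.standard_normal((N, N)); X = X + X.T    # X^T = +X, so n_x = 0
A = rng.standard_normal((N, N)); A = A + A.T    # A^T = +A, so n_a = 0
B = rng.standard_normal((N, N)); B = B - B.T    # B^T = -B, so n_b = 1
Y = rng.standard_normal((N, N))                 # Y is unconstrained

vx = X.reshape(-1, order="F")                   # column-stacking vec(.)
vy = Y.reshape(-1, order="F")
lhs1 = vx @ np.kron(A, B) @ vy                  # first identity in Lemma 4
rhs1 = vx @ c_matrix(A, B) @ vy                 # sign (-1)^(n_x + n_b) = -1
lhs2 = vx @ np.kron(B, A) @ vy                  # second identity in Lemma 4
rhs2 = vx @ c_matrix(B, A) @ vy                 # sign (-1)^(n_x + n_a) = +1
print(np.isclose(lhs1, -rhs1), np.isclose(lhs2, rhs2))
```

The column-stacking convention matters here: `order="F"` matches the vec(·) used throughout the appendix.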

Lemma 6. Let Ξ_k = mat_{M,M}(ξ_k). Then Ξ_k^H = Ξ_k for k = 1, 2, ..., K.

Proof: Note that ξ_k = F^T Γ^T (β_k ⊗ α_k). We first prove that β_k ⊗ α_k is conjugate symmetric, i.e., that (T_Mv ⊗ T_Mv)(β_k ⊗ α_k) = (β_k ⊗ α_k)*. Similar to the proof of Lemma 5, we utilize the properties that T_Mv A_v = (A_v Φ)* and that T_Mv a_v(θ_k) = (a_v(θ_k) e^{−j(Mv−1)φ_k})* to show that

    T_Mv (A_v^†)^H e_k a_v^H(θ_k) T_Mv = [(A_v^†)^H e_k a_v^H(θ_k)]*.    (43)

Observe that ȧ_v(θ_k) = j φ̇_k D a_v(θ_k), where φ̇_k = (2π d_0 cos θ_k)/λ and D = diag(0, 1, ..., Mv − 1). We have

    (T_Mv ⊗ T_Mv)(β_k ⊗ α_k) = (β_k ⊗ α_k)*
    ⟺ T_Mv α_k β_k^T T_Mv = (α_k β_k^T)*
    ⟺ T_Mv [(A_v^†)^H e_k a_v^H(θ_k) D Π⊥_Av]* T_Mv = −(A_v^†)^H e_k a_v^H(θ_k) D Π⊥_Av.

Since D = T_Mv T_Mv D T_Mv T_Mv, combining with Lemma 5 and (43), it suffices to show that

    (A_v^†)^H e_k a_v^H(θ_k) T_Mv D T_Mv Π⊥_Av = −(A_v^†)^H e_k a_v^H(θ_k) D Π⊥_Av.    (44)

Observe that T_Mv D T_Mv + D = (Mv − 1) I. We have

    Π⊥_Av (T_Mv D T_Mv + D) a_v(θ_k) = 0,

or equivalently

    a_v^H(θ_k) T_Mv D T_Mv Π⊥_Av = −a_v^H(θ_k) D Π⊥_Av.    (45)

Pre-multiplying both sides of (45) with (A_v^†)^H e_k leads to (44), which completes the proof that β_k ⊗ α_k is conjugate symmetric. According to the definition of Γ in (9e), it is straightforward to show that Γ^T (β_k ⊗ α_k) is also conjugate symmetric. Combined with Lemma 3 in Appendix A, we conclude that mat_{M,M}(F^T Γ^T (β_k ⊗ α_k)) is Hermitian, or that Ξ_k = Ξ_k^H. ∎

Given Lemmas 4–6, we are able to continue the simplification. We first consider the term R(ξ_k1)^T E[R(∆r) R(∆r)^T] R(ξ_k2) in (35). Let Ξ_k1 = mat_{M,M}(ξ_k1) and Ξ_k2 = mat_{M,M}(ξ_k2). By Lemma 6, we have Ξ_k1 = Ξ_k1^H and Ξ_k2 = Ξ_k2^H. Observe that R(R)^T = R(R) and that I(R)^T = −I(R). By Lemma 4 we immediately obtain the following equalities:

    R(ξ_k1)^T (R(R) ⊗ R(R)) R(ξ_k2) = R(ξ_k1)^T C_{R(R) R(R)} R(ξ_k2),
    R(ξ_k1)^T (I(R) ⊗ I(R)) R(ξ_k2) = −R(ξ_k1)^T C_{I(R) I(R)} R(ξ_k2).

Therefore R(ξ_k1)^T E[R(∆r) R(∆r)^T] R(ξ_k2) can be compactly expressed as

    R(ξ_k1)^T E[R(∆r) R(∆r)^T] R(ξ_k2)
      = (1/N) R(ξ_k1)^T [R(R) ⊗ R(R) + I(R) ⊗ I(R)] R(ξ_k2)
      = (1/N) R(ξ_k1)^T R(R* ⊗ R) R(ξ_k2),    (46)

where we make use of the properties that R^T = R* and R(R* ⊗ R) = R(R) ⊗ R(R) + I(R) ⊗ I(R). Similarly, we can obtain

    I(ξ_k1)^T E[I(∆r) I(∆r)^T] I(ξ_k2) = (1/N) I(ξ_k1)^T R(R* ⊗ R) I(ξ_k2),    (47)
    R(ξ_k1)^T E[R(∆r) I(∆r)^T] I(ξ_k2) = −(1/N) R(ξ_k1)^T I(R* ⊗ R) I(ξ_k2),    (48)
    R(ξ_k2)^T E[R(∆r) I(∆r)^T] I(ξ_k1) = −(1/N) R(ξ_k2)^T I(R* ⊗ R) I(ξ_k1).    (49)

Substituting (46)–(49) into (35) completes the proof. ∎

APPENDIX D
PROOF OF PROPOSITION 2

Without loss of generality, let p = 1 and σ_n² → 0. For brevity, we denote R^T ⊗ R by W. We first consider the case when K < M. Denote the eigendecomposition of R^{−1} by E_s Λ_s^{−1} E_s^H + σ_n^{−2} E_n E_n^H. We have

    W^{−1} = σ_n^{−4} K_1 + σ_n^{−2} K_2 + K_3,

where

    K_1 = E_n* E_n^T ⊗ E_n E_n^H,
    K_2 = E_s* Λ_s^{−1} E_s^T ⊗ E_n E_n^H + E_n* E_n^T ⊗ E_s Λ_s^{−1} E_s^H,
    K_3 = E_s* Λ_s^{−1} E_s^T ⊗ E_s Λ_s^{−1} E_s^H.

Recall that A^H E_n = 0. We have

    K_1 Ȧ_d = (E_n* E_n^T ⊗ E_n E_n^H)(Ȧ* ⊙ A + A* ⊙ Ȧ)
             = (E_n* E_n^T Ȧ*) ⊙ (E_n E_n^H A) + (E_n* E_n^T A*) ⊙ (E_n E_n^H Ȧ)
             = 0.    (50)

Therefore

    M_θ^H M_θ = Ȧ_d^H W^{−1} Ȧ_d = σ_n^{−2} Ȧ_d^H (K_2 + σ_n² K_3) Ȧ_d.    (51)

Similar to W^{−1}, we denote W^{−1/2} = σ_n^{−2} K_1 + σ_n^{−1} K_4 + K_5, where

    K_4 = E_s* Λ_s^{−1/2} E_s^T ⊗ E_n E_n^H + E_n* E_n^T ⊗ E_s Λ_s^{−1/2} E_s^H,
    K_5 = E_s* Λ_s^{−1/2} E_s^T ⊗ E_s Λ_s^{−1/2} E_s^H.

Therefore

    M_θ^H Π_Ms M_θ = Ȧ_d^H W^{−1/2} Π_Ms W^{−1/2} Ȧ_d
                   = σ_n^{−2} Ȧ_d^H (σ_n K_5 + K_4) Π_Ms (σ_n K_5 + K_4) Ȧ_d,

where Π_Ms = M_s M_s^†. We can then express the CRB as

    CRB_θ = σ_n² (Q_1 + σ_n Q_2 + σ_n² Q_3)^{−1},    (52)

where

    Q_1 = Ȧ_d^H (K_2 − K_4 Π_Ms K_4) Ȧ_d,
    Q_2 = −Ȧ_d^H (K_4 Π_Ms K_5 + K_5 Π_Ms K_4) Ȧ_d,
    Q_3 = Ȧ_d^H (K_3 − K_5 Π_Ms K_5) Ȧ_d.

When σ_n² = 0, R reduces to A A^H. Observe that the eigendecomposition of R always exists for σ_n² ≥ 0. We use K_1°–K_5° to denote the corresponding K_1–K_5 when σ_n² → 0.

Lemma 7. Let K < M. Assume ∂r/∂η is full column rank. Then lim_{σ_n²→0} Π_Ms exists.

Proof: Because A^H E_n = 0,

    K_2 A_d = (E_s* Λ_s^{−1} E_s^T ⊗ E_n E_n^H)(A* ⊙ A) + (E_n* E_n^T ⊗ E_s Λ_s^{−1} E_s^H)(A* ⊙ A)
            = (E_s* Λ_s^{−1} E_s^T A*) ⊙ (E_n E_n^H A) + (E_n* E_n^T A*) ⊙ (E_s Λ_s^{−1} E_s^H A)
            = 0.

Similarly, we can show that K_4 A_d = 0, i^H K_2 i = i^H K_4 i = 0, and i^H K_1 i = rank(E_n) = M − K. Hence

    M_s^H M_s = [ A_d^H K_3 A_d    A_d^H K_3 i
                  i^H K_3 A_d      i^H W^{−1} i ].

Because ∂r/∂η is full column rank, M_s^H M_s is full rank and positive definite. Therefore the Schur complements exist, and we can invert M_s^H M_s block-wise. Let V = A_d^H K_3 A_d and v = i^H W^{−1} i. After tedious but straightforward computation, we obtain

    Π_Ms = K_5 A_d S^{−1} A_d^H K_5
           − s^{−1} K_5 A_d V^{−1} A_d^H K_3 i i^H (K_5 + σ_n^{−2} K_1)
           − v^{−1} (K_5 + σ_n^{−2} K_1) i i^H K_3 A_d S^{−1} A_d^H K_5
           + s^{−1} (K_5 + σ_n^{−2} K_1) i i^H (K_5 + σ_n^{−2} K_1),

where S and s are Schur complements given by

    S = V − v^{−1} A_d^H K_3 i i^H K_3 A_d,
    s = v − i^H K_3 A_d V^{−1} A_d^H K_3 i.

Observe that

    v = i^H W^{−1} i = σ_n^{−4} (M − K) + i^H K_3 i.

We know that both v^{−1} and s^{−1} decrease at the rate of σ_n^4. As σ_n² → 0, we have

    S → A_d^H K_3° A_d,
    s^{−1} (K_5 + σ_n^{−2} K_1) → 0,
    v^{−1} (K_5 + σ_n^{−2} K_1) → 0,
    s^{−1} (K_5 + σ_n^{−2} K_1) i i^H (K_5 + σ_n^{−2} K_1) → K_1° i i^H K_1° / (M − K).

We now show that A_d^H K_3° A_d is nonsingular. Denote the eigendecomposition of A A^H by E_s° Λ_s° (E_s°)^H. Recall that for matrices with proper dimensions, (A ⊙ B)^H (C ⊙ D) = (A^H C) ∘ (B^H D), where ∘ denotes the Hadamard product. We can expand A_d^H K_3° A_d into

    [A^H E_s° (Λ_s°)^{−1} (E_s°)^H A]* ∘ [A^H E_s° (Λ_s°)^{−1} (E_s°)^H A].

Note that A A^H E_s° (Λ_s°)^{−1} (E_s°)^H A = E_s° (E_s°)^H A = A, and that A is full column rank when K < M. We thus have A^H E_s° (Λ_s°)^{−1} (E_s°)^H A = I. Therefore A_d^H K_3° A_d = I, which is nonsingular. Combining the above results, we obtain that when σ_n² → 0,

    Π_Ms → K_5° A_d A_d^H K_5° + K_1° i i^H K_1° / (M − K). ∎

For sufficiently small σ_n² > 0, it is easy to show that K_1–K_5 are bounded in the sense of the Frobenius norm (i.e., ‖K_i‖_F ≤ C for some C > 0, for i ∈ {1, 2, 3, 4, 5}). Because ∂r/∂η is full rank, M_s is also full rank for any σ_n² > 0, which implies that Π_Ms is well-defined for any σ_n² > 0. Observe that Π_Ms is positive semidefinite and that tr(Π_Ms) = rank(M_s). We know that Π_Ms is bounded for any σ_n² > 0. Therefore Q_2 and Q_3 are also bounded for sufficiently small σ_n², which implies that σ_n Q_2 + σ_n² Q_3 → 0 as σ_n² → 0.

By Lemma 7, we know that Q_1 → Q_1° as σ_n² → 0, where

    Q_1° = Ȧ_d^H (K_2° − K_4° Π_Ms° K_4°) Ȧ_d,

and Π_Ms° = lim_{σ_n²→0} Π_Ms as derived in Lemma 7. Assume Q_1° is nonsingular². By (52) we immediately obtain that CRB_θ → 0 as σ_n² → 0.

When K ≥ M, R is full rank regardless of the choice of σ_n². Hence (R^T ⊗ R) is always full rank. Because ∂r/∂η is full column rank, the FIM is positive definite, which implies its Schur complements are also positive definite. Therefore CRB_θ is positive definite. ∎

² The condition when Q_1° is singular is difficult to obtain analytically. In numerical simulations, we have verified that it remains nonsingular for various parameter settings.

REFERENCES

[1] R. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Trans. Antennas Propag., vol. 34, no. 3, pp. 276–280, Mar. 1986.
[2] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood, and Cramér-Rao bound," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 5, pp. 720–741, May 1989.
[3] B. Ottersten, M. Viberg, P. Stoica, and A. Nehorai, "Exact and large sample maximum likelihood techniques for parameter estimation and detection in array processing," in Array Processing, T. S. Huang, T. Kohonen, M. R. Schroeder, H. K. V. Lotsch, S. Haykin, J. Litva, and T. J. Shepherd, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 1993, vol. 25, pp. 99–151.
[4] A. Moffet, "Minimum-redundancy linear arrays," IEEE Trans. Antennas Propag., vol. 16, no. 2, pp. 172–175, Mar. 1968.
[5] P. Pal and P. P. Vaidyanathan, "Coprime sampling and the MUSIC algorithm," in 2011 IEEE Digital Signal Processing Workshop and IEEE Signal Processing Education Workshop (DSP/SPE), Jan. 2011, pp. 289–294.
[6] Z. Tan and A. Nehorai, "Sparse direction of arrival estimation using co-prime arrays with off-grid targets," IEEE Signal Process. Lett., vol. 21, no. 1, pp. 26–29, Jan. 2014.
[7] Z. Tan, Y. C. Eldar, and A. Nehorai, "Direction of arrival estimation using co-prime arrays: A super resolution viewpoint," IEEE Trans. Signal Process., vol. 62, no. 21, pp. 5565–5576, Nov. 2014.
[8] S. Qin, Y. Zhang, and M. Amin, "Generalized coprime array configurations for direction-of-arrival estimation," IEEE Trans. Signal Process., vol. 63, no. 6, pp. 1377–1390, Mar. 2015.
[9] P. Pal and P. Vaidyanathan, "Nested arrays: A novel approach to array processing with enhanced degrees of freedom," IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4167–4181, Aug. 2010.
[10] K. Han and A. Nehorai, "Wideband Gaussian source processing using a linear nested array," IEEE Signal Process. Lett., vol. 20, no. 11, pp. 1110–1113, Nov. 2013.
[11] ——, "Nested vector-sensor array processing via tensor modeling," IEEE Trans. Signal Process., vol. 62, no. 10, pp. 2542–2553, May 2014.
[12] B. Friedlander, "The root-MUSIC algorithm for direction finding with interpolated arrays," Signal Processing, vol. 30, no. 1, pp. 15–29, Jan. 1993.

[13] M. Pesavento, A. Gershman, and M. Haardt, "Unitary root-MUSIC with a real-valued eigendecomposition: A theoretical and experimental performance study," IEEE Trans. Signal Process., vol. 48, no. 5, pp. 1306–1314, May 2000.
[14] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood, and Cramér-Rao bound: Further results and comparisons," IEEE Trans. Acoust., Speech, Signal Process., vol. 38, no. 12, pp. 2140–2150, Dec. 1990.
[15] ——, "Performance study of conditional and unconditional direction-of-arrival estimation," IEEE Trans. Acoust., Speech, Signal Process., vol. 38, no. 10, pp. 1783–1795, Oct. 1990.
[16] F. Li, H. Liu, and R. Vaccaro, "Performance analysis for DOA estimation algorithms: Unification, simplification, and observations," IEEE Trans. Aerosp. Electron. Syst., vol. 29, no. 4, pp. 1170–1184, Oct. 1993.
[17] R. Roy and T. Kailath, "ESPRIT: Estimation of signal parameters via rotational invariance techniques," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 7, pp. 984–995, Jul. 1989.
[18] A. Gorokhov, Y. Abramovich, and J. F. Böhme, "Unified analysis of DOA estimation algorithms for covariance matrix transforms," Signal Processing, vol. 55, no. 1, pp. 107–115, Nov. 1996.
[19] Y. Abramovich, N. Spencer, and A. Gorokhov, "Detection-estimation of more uncorrelated Gaussian sources than sensors in nonuniform linear antenna arrays. I. Fully augmentable arrays," IEEE Trans. Signal Process., vol. 49, no. 5, pp. 959–971, May 2001.
[20] C.-L. Liu and P. Vaidyanathan, "Remarks on the spatial smoothing step in coarray MUSIC," IEEE Signal Process. Lett., vol. 22, no. 9, Sep. 2015.
[21] A. Koochakzadeh and P. Pal, "Cramér-Rao bounds for underdetermined source localization," IEEE Signal Process. Lett., vol. 23, no. 7, pp. 919–923, Jul. 2016.
[22] C.-L. Liu and P. Vaidyanathan, "Cramér-Rao bounds for coprime and other sparse arrays, which find more sources than sensors," Digital Signal Processing, 2016. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1051200416300264
[23] G. W. Stewart, "Error and perturbation bounds for subspaces associated with certain eigenvalue problems," SIAM Review, vol. 15, no. 4, pp. 727–764, Oct. 1973.
[24] M. Hawkes, A. Nehorai, and P. Stoica, "Performance breakdown of subspace-based methods: Prediction and cure," in Proc. 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), vol. 6, 2001, pp. 4005–4008.
[25] A. Swindlehurst and T. Kailath, "A performance analysis of subspace-based methods in the presence of model errors. I. The MUSIC algorithm," IEEE Trans. Signal Process., vol. 40, no. 7, pp. 1758–1774, Jul. 1992.
[26] S. M. Kay, Fundamentals of Statistical Signal Processing, ser. Prentice Hall Signal Processing Series. Englewood Cliffs, NJ: Prentice-Hall PTR, 1993.
[27] M. Ishiguro, "Minimum redundancy linear arrays for a large number of antennas," Radio Science, vol. 15, no. 6, pp. 1163–1170, Nov. 1980.
[28] Z. Liu and A. Nehorai, "Statistical angular resolution limit for point sources," IEEE Trans. Signal Process., vol. 55, no. 11, pp. 5521–5527, Nov. 2007.
[29] P. P. Vaidyanathan and P. Pal, "Direct-MUSIC on sparse arrays," in 2012 International Conference on Signal Processing and Communications (SPCOM), Jul. 2012, pp. 1–5.