Waveform Design for MIMO Radar with Matrix Completion

Shunqiao Sun, Student Member, IEEE, and Athina P. Petropulu, Fellow, IEEE

Abstract

It was recently shown that MIMO radars with sparse sensing and matrix completion (MC) can significantly reduce the volume of data required for accurate target detection and estimation. Based on sparsely sampled target returns, forwarded by the receive antennas to a fusion center, a matrix, referred to as the data matrix, can be partially filled, and subsequently completed via MC techniques. The completed data matrix can then be used in standard array processing methods to estimate the target parameters. This paper studies the applicability of MC theory to the data matrix arising in colocated MIMO radars using uniform linear arrays. It is shown that the data matrix coherence, and consequently the performance of MC, is directly related to the transmit waveforms. Among orthogonal waveforms, the optimum choices are those for which any snapshot across the transmit array has a flat spectrum. The problem of waveform design is formulated as an optimization problem on the complex Stiefel manifold, and is solved via the modified steepest descent method or the modified Newton algorithm with nonmonotone line search. It is shown that, under the optimal waveforms, the coherence of the data matrix is asymptotically optimal with respect to the number of transmit and receive antennas.

Index Terms

MIMO radar, matrix completion, waveform design, complex Stiefel manifold

S. Sun and A. P. Petropulu are with the Department of Electrical and Computer Engineering, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA (E-mail: {shunq.sun, athinap}@rutgers.edu). This work was supported by the Office of Naval Research under grant ONR-N-00014-12-1-0036 and NSF under grant ECCS-1408437. Partial results of this paper have been published in the 48th Annual Asilomar Conference on Signals, Systems, and Computers (Asilomar'14) [1] and the IEEE Global Conference on Signal and Information Processing (GlobalSIP'14) [2].

January 29, 2015 DRAFT

I. INTRODUCTION

Unlike traditional phased-array radars, which transmit fully correlated signals through their transmit antennas, multiple-input multiple-output (MIMO) radars [3] transmit mutually orthogonal signals. The orthogonality allows the receive antennas to separate the transmitted signals via matched filtering. The target parameters are obtained by processing the phase shifts of the received signals. MIMO radars offer a high number of degrees of freedom [4] and, consequently, enable improved resolution. Since the transmitted signals are orthogonal, the transmit beam is not focused on a particular direction [5], resulting in decreased illumination power. Pulse compression techniques are typically used to strengthen the received signal power. The transmit pulse can be coded or modulated, e.g., as a phase-coded pulse or a linear frequency modulated (LFM) pulse [6]. The idea of using low-rank matrix completion (MC) techniques in MIMO radars (termed MIMO-MC radars) was first proposed in [7], [8] as a means of reducing the volume of data required for accurate target detection and estimation. In particular, [8] considers a colocated pulse MIMO radar scenario with uniform linear arrays (ULAs) at the transmitter and the receiver. Each receive antenna samples the target returns and forwards the obtained samples to a fusion center. Based on the data received, the fusion center formulates a matrix, referred to as the "data matrix", which can then be used in standard array processing methods for target detection and estimation. If the data samples are obtained in a Nyquist fashion, and the number of targets is small relative to the number of transmit and receive antennas, the data matrix is low-rank [7]. Thus, it can be recovered from a small subset of its entries via matrix completion (MC) techniques. By exploiting the latter fact, the MIMO-MC radar receive antennas obtain a small number of samples at uniformly random sampling times.
Based on knowledge of the sampling instances, the fusion center populates the data matrix in a uniformly sparse fashion, and subsequently recovers the full matrix via MC techniques. This is referred to as Scheme II in [8]. Alternatively, the receive antennas can perform matched filtering with a randomly selected set of transmit waveforms, and forward the results to the fusion center, partially populating the data matrix. The data matrix is subsequently completed via MC. This is referred to as Scheme I in [8]. Both Schemes I and II reduce the number of samples that need to be forwarded to the fusion center. If the transmission occurs in a wireless fashion, this translates to savings in power and bandwidth. As compared to MIMO radars based on sparse signal recovery [9], [10], [11], MIMO-MC radars achieve similar performance but without requiring a target space grid. Details on the general topic of matrix completion and the conditions for matrix recovery can be

found in [12], [13], [14]. The conditions for the applicability of MC on Scheme I of [8] with ULAs can be found in [15]. In that case, and under ideal conditions, the transmit waveforms do not affect the MC performance. On the other hand, for Scheme II of [8], it was shown in [8], [16] that the transmit waveforms affect the matrix completion performance, as they directly affect the data matrix coherence [13]; a larger coherence implies that more samples need to be collected for reliable target estimation. That observation motivates the work in this paper, where we explore the relationship between matrix coherence and transmit waveforms for Scheme II of [8], and design waveforms that result in the lowest possible coherence. Waveform design in the context of MIMO radars has been extensively studied. For example, in [17], [18], the waveforms are designed to maximize the mutual information between the target impulse responses (which are assumed known) and the reflected signals. In [19], waveforms with good correlation properties are designed under the unimodular constraint. In [20], clutter mitigation is considered in waveform design for ground moving-target indication (GMTI). Orthogonal phase-coded waveforms are designed in [21] via simulated annealing. Frequency hopping waveforms are designed in [22] to reduce the sidelobes of the radar ambiguity function [23]. For MIMO radars with sparse signal recovery, e.g., in [24], [25], the goal of waveform design is to reduce the coherence of the sensing matrix. In this paper, we aim to design waveforms so that the coherence of the receive data matrix attains its lowest possible value. Since the waveforms are constrained to be orthogonal, we formulate the design problem as an optimization problem on the complex Stiefel manifold [26], and for its solution we employ the modified steepest descent algorithm [27] and the modified Newton algorithm. The local gradient and Hessian of the cost functions are derived in closed form.
Under the obtained optimized waveforms, numerical results show that, as the number of transmit antennas increases, the matrix coherence approaches its smallest possible value, i.e., 1. The rest of this paper is organized as follows. Some background on noisy matrix completion and MIMO-MC radars is provided in Section II. The matrix coherence analysis results are presented in Section III, while the waveform design is addressed in Section IV. Simulation results are given in Section V. Finally, Section VI provides some concluding remarks. Notation: We use lower-case and upper-case letters in bold to denote vectors and matrices, respectively. See Table I for the other notation used in the paper.


Table I

NOTATIONS

ℜ{·}: the real part of {·}
ℑ{·}: the imaginary part of {·}
1_L: the vector of length L with each element equal to 1
∥a∥_2: the Euclidean norm of a vector a
A^*: the complex conjugate of a matrix A
A^T: the transpose of a matrix A
A^H: the conjugate transpose of a matrix A
tr(A): the trace of a matrix A
λ_min(A): the minimal eigenvalue of a matrix A
∥A∥_F: the Frobenius norm of a matrix A
∥A∥_*: the nuclear norm of a matrix A
vec(A): the vectorization of a matrix A
A ⊗ B: the Kronecker product of two matrices A and B
A ⊙ B: the Hadamard product of two matrices A and B
I_M: the identity matrix of dimension M × M

II. BACKGROUND

A. Matrix Completion

In this section we provide a brief overview of the problem of recovering a rank-r matrix M ∈ C^{n1×n2} based on partial knowledge of its entries [12], [13], [14]. Let us define the observation operation Y = P_Ω(M) as

[Y]_ij = [M]_ij, (i, j) ∈ Ω;  [Y]_ij = 0, otherwise,  (1)

where Ω is the set of indices of observed entries, with cardinality m. According to [13], when M is low-rank and meets certain conditions (see (A0) and (A1), later in this section), M can be estimated by solving a nuclear norm optimization problem

min ∥X∥∗

s.t. PΩ (X) = PΩ (M) (2)

The conditions for successful matrix completion involve the notion of coherence, which is defined as follows.


Definition 1. (Matrix Coherence [12]): Let U be a subspace of C^{n1} of dimension r spanned by the set of orthogonal vectors {u_i ∈ C^{n1}}_{i=1,...,r}, let P_U be the orthogonal projection onto U, i.e., P_U = Σ_{1≤i≤r} u_i u_i^H, and let e_i be the standard basis vector whose i-th element is 1. The coherence of U is defined as

µ(U) = (n1/r) max_{1≤i≤n1} ∥P_U e_i∥² ≡ (n1/r) max_{1≤i≤n1} ∥U^{(i)}∥² ∈ [1, n1/r],  (3)

where U^{(i)} denotes the i-th row of the matrix U = [u1, ..., ur].

It follows from Definition 1 that the minimum possible coherence that any given matrix can achieve is one. Let the compact singular value decomposition (SVD) of M be M = Σ_{k=1}^{r} ρ_k u_k v_k^H, where ρ_k, k = 1, ..., r, are the singular values, and u_k and v_k the corresponding left and right singular vectors, respectively. Let U, V be the subspaces spanned by the u_k and v_k, respectively. Matrix M has coherence with parameters µ0 and µ1 if

(A0) max(µ(U), µ(V)) ≤ µ0 for some positive µ0.
(A1) The maximum element of the n1 × n2 matrix Σ_{1≤i≤r} u_i v_i^H is bounded in absolute value by µ1 √(r/(n1 n2)), for some positive µ1.

In fact, it was shown in [12] that if (A0) holds, then (A1) also holds with µ1 ≤ µ0 √r. Now, suppose that matrix M ∈ C^{n1×n2} satisfies (A0) and (A1). The following theorem gives a probabilistic bound for the number of entries, m, needed to estimate M.

Theorem 1. [12] Suppose that we observe m entries of the rank-r matrix M ∈ C^{n1×n2}, with matrix coordinates sampled uniformly at random. Let n = max{n1, n2}. There exist constants C and c such that if

m ≥ C max{µ1², µ0^{1/2} µ1, µ0 n^{1/4}} n r β log n

for some β > 2, the minimizer to the program of (2) is unique and equal to M with probability at least 1 − c n^{−β}. For r ≤ µ0^{−1} n^{1/5} the bound can be improved to

m ≥ C µ0 n^{6/5} r β log n,

without affecting the probability of success.


Theorem 1 implies that the lower the coherence parameter µ0, the fewer entries of M are required to estimate M. Consequently, the intuition behind the coherence definition is that, in order to fully recover a given matrix with MC from as few of its entries as possible, the singular vectors of the matrix should be uncorrelated with the standard basis; in other words, the singular vectors need to be "sufficiently spread" [12]. To apply matrix completion techniques to recover a given matrix, the coherence should be as close to unity as possible. In this paper, we say that the coherence of a given matrix is optimal as long as it achieves its lowest bound. In practice, the observations are typically corrupted by noise, i.e., [Y]_ij = [M]_ij + [E]_ij, (i, j) ∈ Ω, where [E]_ij represents noise. In that case, it holds that P_Ω(Y) = P_Ω(M) + P_Ω(E), and the completion of M is done by solving the following optimization problem [14]

min ∥X∥_*
s.t. ∥P_Ω(X − Y)∥_F ≤ δ.  (4)

Assuming that the noise is zero-mean with variance σ², δ > 0 is a parameter defined as δ² = (m + √(8m))σ² [12]. Let M̂ be the solution of (4). It can be shown that the error norm ∥M − M̂∥_F is bounded as [14]

∥M − M̂∥_F ≤ 4 √((2 n1 n2 + m) min(n1, n2) / m) δ + 2δ  (5)

with high probability. If M has favorable coherence properties, the matrix completion in both the noiseless and noisy cases will be stable. Therefore, to achieve satisfactory matrix completion performance, it is very important to investigate the conditions under which the coherence of M will be as low as possible.
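The completion step can be sketched numerically. The following is a minimal singular-value-thresholding (SVT)-style iteration in the spirit of [29]; it is our simplified stand-in for the exact solvers of (2)/(4), and the problem size, threshold tau and step size are illustrative choices, not the paper's.

```python
import numpy as np

def svt_complete(Y, mask, tau, step, iters=300):
    """Minimal SVT-style sketch: soft-threshold the singular values of an
    accumulator Z, then take a gradient step enforcing consistency with the
    observed entries indicated by mask (the set Omega)."""
    Z = np.zeros_like(Y)
    X = Z
    for _ in range(iters):
        U, s, Vh = np.linalg.svd(Z, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vh   # singular value shrinkage
        Z = Z + step * mask * (Y - X)             # update on observed set only
    return X

rng = np.random.default_rng(1)
n, r = 60, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r ground truth
mask = rng.random((n, n)) < 0.5                  # observe about half the entries
Y = mask * M                                     # P_Omega(M), as in eq. (1)
X = svt_complete(Y, mask, tau=5 * n, step=1.2 / mask.mean())
rel_err = np.linalg.norm(X - M) / np.linalg.norm(M)
```

With a low-rank, low-coherence M and about half the entries observed, the relative recovery error is small, consistent with Theorem 1's sampling bound.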

B. MIMO-MC Radar

We consider the problem formulation proposed in [8] for Scheme II. The scenario involves narrowband orthogonal transmit waveforms, transmitted in pulses with pulse repetition interval T_PRI and carrier wavelength λ; K far-field targets at angles θ_k; and ULAs for transmission and reception, equipped with Mt transmit and Mr receive antennas, respectively, with inter-element spacings dt and dr, respectively (see Subsection B of Section II in [8]). During each pulse, the m-th antenna, m ∈ N+_Mt, transmits a coded waveform containing N symbols {s_m(n)}, n = 1, ..., N, of duration T_b each, which can be written in the baseband as

ϕ_m(t) = Σ_{n=1}^{N} s_m(n) Λ[(t − (n − 1)T_b)/T_b], t ∈ [0, T_ϕ],  (6)


where s_m(n) = a_m(n) e^{jφ_m(n)}, with {φ_m(n)} uniformly distributed in [−π, π] and {a_m(n)} taking arbitrary positive values; Λ(t) is a rectangular pulse and T_ϕ = N T_b is the duration of the entire pulse. We will assume that the waveforms are sufficiently narrowband, i.e., the roundtrip delay from the transmit antennas to the target and back to the receive antennas is smaller than T_b and thus can be ignored.

Suppose that the receive antennas sample the target echoes with sampling interval T_b and forward their samples to a fusion center. Let X be the matrix formulated at the fusion center based on the received samples, with each antenna contributing a row to X. It holds that

X = W + J, (7) where J is an interference/noise matrix and

W = B D A^T S^T,  (8)

where A ∈ C^{Mt×K} is the transmit steering matrix (B ∈ C^{Mr×K}, the receive steering matrix, is defined correspondingly), given by

A ≜ [γ_k^m], m = 1, ..., Mt, k = 1, ..., K,  (9)

whose first row is all ones since γ_k^1 = 1, with

γ_k^m ≜ e^{j2π(m−1)α_k^t},  (10)

α_k^t ≜ dt sin(θ_k)/λ, (m, k) ∈ N+_Mt × N+_K,  (11)

where θ_k is the angle of the k-th target, or equivalently, α_k^t is the spatial frequency corresponding to the k-th target, and

D ≜ diag([β_1 ζ_1, β_2 ζ_2, ..., β_K ζ_K]),  (12)

where ζ_k = e^{j2πν_k(q−1)T_PRI}, with q denoting the pulse index, ν_k = 2ϑ_k/λ denoting the Doppler shift of the k-th target, and {β_k}_{k∈N+_K}, {ϑ_k}_{k∈N+_K} denoting the target reflection coefficients and speeds, respectively. S = [s(1), ..., s(N)]^T ∈ C^{N×Mt}, with s(i) = [s_1(i), ..., s_Mt(i)]^T, is the sampled waveform matrix, with its vertical dimension corresponding to sampling along time and its horizontal dimension corresponding to sampling across the array (sampling in space). The i-th row of S can be thought of as the snapshot of the waveforms across the transmit antennas at sampling time i. Due to the assumed orthogonality of the waveforms, it holds that S^H S = I_Mt when N ≥ Mt [28].


When both Mr and N are larger than K, the noise-free data matrix W is rank-K and can be recovered from a small number of its entries via matrix completion. This fact motivated the approach of [7], [8], which calls for subsampling the target echoes at the receive antennas in a uniformly pseudo-random fashion, partially filling the data matrix at the fusion center, and then completing the data matrix via MC techniques. This approach reduces the number of samples that need to be forwarded to the fusion center. It turns out that the applicability of MC depends on the transmit waveforms. In the next section, we derive the necessary and sufficient conditions that the transmit waveforms must satisfy so that the coherence of W is asymptotically optimal, i.e., it approaches 1 as Mt increases.

III. COHERENCE OF W AND OPTIMAL WAVEFORM CONDITIONS

In this section, we analyze the coherence of W, for the MIMO radar system defined in the previous section. In particular,

• We provide sufficient and necessary conditions for the optimal transmit waveforms under which the coherence of W attains its lowest possible value; • Under those conditions, we show asymptotic optimality of the coherence of W w.r.t. the number of transmit/receive antennas; • We show that the coherence of W does not depend on the Doppler shift.

A. The Coherence of W

Let S_i(α_k^t) denote the discrete-time Fourier transform (DTFT) of the i-th snapshot of the transmit waveforms, evaluated at spatial frequency α_k^t, i.e.,

S_i(α_k^t) = Σ_{m=1}^{Mt} s_m(i) e^{−j2πmα_k^t},  (13)

where {s_m(i)}_{m∈N+_Mt} are the elements of the i-th row of S. Before we proceed, we provide a lemma that will be useful in the subsequent theorems.

Lemma 1. For the MIMO radar system described in Section II-B with dt = λ/2, and K targets randomly located at angles {θ_k}_{k∈N+_K} in [−π/2, π/2], or equivalently, at spatial frequencies {α_k^t}_{k∈N+_K} in [−1/2, 1/2], it holds that

Σ_{i=1}^{N} Σ_{k=1}^{K} |S_i(α_k^t)|² = K Mt.  (14)

Proof of Lemma 1. See Appendix A for the proof.
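Lemma 1 can be verified numerically for any orthogonal waveform matrix and any random set of spatial frequencies; the sketch below (our illustration, with arbitrary sizes) uses a QR-orthogonalized Gaussian S.

```python
import numpy as np

rng = np.random.default_rng(3)
Mt, N, K = 8, 32, 3
G = rng.standard_normal((N, Mt)) + 1j * rng.standard_normal((N, Mt))
S, _ = np.linalg.qr(G)                    # S^H S = I_Mt (orthogonal waveforms)

alpha = rng.uniform(-0.5, 0.5, K)         # random target spatial frequencies
m = np.arange(1, Mt + 1)
E = np.exp(-1j * 2 * np.pi * np.outer(m, alpha))  # DTFT vectors of eq. (13)
Spec = S @ E                              # entry (i, k) equals S_i(alpha_k)
total = np.sum(np.abs(Spec) ** 2)
# Lemma 1: the double sum equals K*Mt, independent of the target locations
assert np.isclose(total, K * Mt)
```

The identity holds exactly because each DTFT vector has squared norm Mt and S^H S = I_Mt, so the snapshot energies at any fixed frequency sum to Mt over time.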


The waveform conditions under which the coherence of W attains its lowest value are summarized in the following theorem.

Theorem 2. (Optimal Waveform Conditions): Consider the MIMO radar system as defined in Section II-B, with dt = λ/2. The necessary condition under which the coherence of W attains its lowest possible value is

Σ_{k=1}^{K} |S_i(α_k^t)|² = K Mt / N, for all i ∈ N+_N.  (15)

A sufficient condition for the coherence of W to attain its lowest possible value, independent of the target angles, is

|S_i(α_l^t)|² = Mt / N, for all i ∈ N+_N and ∀ α_l^t ∈ [−1/2, 1/2].  (16)

Proof of Theorem 2. To make the proof more tractable, we break it into two parts. In the first part, we characterize the SVD of the matrix W in order to identify what is needed in order to bound its coherence. In the second part, we derive the optimality conditions for the coded orthogonal waveforms. 1) Characterization of the SVD of W: The compact SVD of W can be expressed as

W = UΛVH , (17)

where U ∈ C^{Mr×K} and V ∈ C^{N×K} are such that U^H U = I_K, V^H V = I_K, and Λ ∈ R^{K×K} is a diagonal matrix containing the singular values of W.

Consider the QR decomposition of B, i.e., B = Q_r R_r, with Q_r ∈ C^{Mr×K} such that Q_r^H Q_r = I_K, and R_r ∈ C^{K×K} an upper triangular matrix. Similarly, consider the QR decomposition of SA, i.e., SA = Q_s R_s, with Q_s ∈ C^{N×K} such that Q_s^H Q_s = I_K, and R_s ∈ C^{K×K} an upper triangular matrix. The matrix R_r D R_s^T ∈ C^{K×K} is rank-K and its SVD can be expressed as R_r D R_s^T = Q_1 ∆ Q_2^H. Here, Q_1 ∈ C^{K×K} is such that Q_1 Q_1^H = Q_1^H Q_1 = I_K (the same holds for Q_2), and ∆ ∈ R^{K×K} is a non-zero diagonal matrix containing the singular values of R_r D R_s^T. Therefore, it holds that

W = Q_r Q_1 ∆ Q_2^H Q_s^T = Q_r Q_1 ∆ (Q_s^* Q_2)^H,  (18)

which is a valid SVD of W, since (Q_r Q_1)^H Q_r Q_1 = I_K and (Q_s^* Q_2)^H Q_s^* Q_2 = I_K. Via the uniqueness of the singular values of a matrix, it holds that Λ = ∆; thus U = Q_r Q_1 and V = Q_s^* Q_2.

Let Q_r^{(i)} denote the i-th row of Q_r. The coherence of the column space of W is

µ(U) = (Mr/K) sup_{i∈N+_Mr} ∥Q_r^{(i)} Q_1∥²_2 = (Mr/K) sup_{i∈N+_Mr} ∥Q_r^{(i)}∥²_2.  (19)

It can be seen from (19) that µ(U) is determined by Q_r, which is related only to the receive steering matrix B and is independent of the transmit waveform matrix S. For the MIMO radar system under the ULA configuration with dr = λ/2, under the assumption that the target angles {θ_k}_{k∈N+_K} are distinct with minimal spatial frequency separation ξ_r, it was shown in [15] that

µ(U) ≤ √Mr / (√Mr − (K − 1)√(β_Mr(ξ_r))),  (20)

where β_Mr(ξ_r) is the Fejér kernel (see (33)). Let Q_s^{*(i)} and S^{*(i)}, i ∈ N+_N, denote the i-th rows of Q_s^* and S^*, respectively. For the coherence of the row space of W we have

µ(V) = (N/K) sup_{i∈N+_N} ∥Q_s^{*(i)} Q_2∥²_2
     = (N/K) sup_{i∈N+_N} ∥Q_s^{*(i)}∥²_2
     = (N/K) sup_{i∈N+_N} ∥S^{*(i)} A^* (R_s^*)^{−1}∥²_2
     ≤ (N/K) sup_{i∈N+_N} ∥S^{*(i)} A^*∥²_2 / σ²_min(R_s^*),  (21)

where

σ²_min(R_s^*) = λ_min((R_s^*)^H R_s^*) = λ_min(R_s^H R_s) = λ_min(R_s^H Q_s^H Q_s R_s) = λ_min((SA)^H SA) = λ_min(A^H S^H S A) = λ_min(A^H A).  (22)

Here, we use the symbol λ_min(·) to denote the minimal eigenvalue of a matrix. In addition, we apply the fact that the eigenvalues of a Hermitian matrix are real, and that the eigenvalues of X^* are the complex conjugates of the eigenvalues of X. Thus, if X is Hermitian, its eigenvalues are equal to the eigenvalues of X^*.


For the MIMO radar system under the ULA configuration with dt = λ/2, under the assumption that the target angles are distinct with minimal spatial frequency separation ξ_t, it was shown in [15] that

λ_min(A^H A) ≥ Mt − (K − 1)√(Mt β_Mt(ξ_t)),  (23)

where β_Mt(ξ_t) is the kernel defined in (33). Therefore, regarding the coherence of the row space of W, we have

µ(V) ≤ (N/K) sup_{i∈N+_N} ∥S^{*(i)} A^*∥²_2 / (Mt − (K − 1)√(Mt β_Mt(ξ_t))).  (24)

Next, we focus on finding the minimum of the supremum of ∥S^{*(i)} A^*∥²_2 over i ∈ N+_N, which yields the optimal waveform conditions.

2) Optimal Waveform Conditions: The transmit steering matrix has the Vandermonde form given by equation (9). Then we have

AA^H = [[K, δ_{1,2}, ..., δ_{1,Mt}]; [δ_{2,1}, K, ..., δ_{2,Mt}]; ...; [δ_{Mt,1}, δ_{Mt,2}, ..., K]],  (25)

where δ_{m′,m} = Σ_{k=1}^{K} e^{j2π[(m′−1)−(m−1)]α_k^t} = Σ_{k=1}^{K} e^{j2π(m′−m)α_k^t}. Consequently, it holds that

∥S^{*(i)} A^*∥²_2 = S^{*(i)} A^* (S^{*(i)} A^*)^H = S^{*(i)} A^* A^T (S^{*(i)})^H = S^{(i)} A A^H (S^{(i)})^H
 = Σ_{k=1}^{K} Σ_{m=1}^{Mt} Σ_{m′=1}^{Mt} s_m(i) e^{−j2πmα_k^t} s_{m′}^*(i) e^{j2πm′α_k^t}
 = Σ_{k=1}^{K} |Σ_{m=1}^{Mt} s_m(i) e^{−j2πmα_k^t}|²
 = Σ_{k=1}^{K} |S_i(α_k^t)|²,  (26)

where S_i(α_k^t) was defined in (13).


Therefore, the coherence bound of the row space of W is

µ(V) ≤ (N/K) · sup_{i∈N+_N} Σ_{k=1}^{K} |S_i(α_k^t)|² / (Mt − (K − 1)√(Mt β_Mt(ξ_t))).  (27)

The lowest possible coherence bound of µ(V) can be achieved by finding waveforms that minimize sup_{i∈N+_N} Σ_{k=1}^{K} |S_i(α_k^t)|². This can be formulated as a min-max optimization problem subject to the constraint given in Lemma 1, i.e.,

min_S max_{i∈N+_N} Σ_{k=1}^{K} |S_i(α_k^t)|²
s.t. Σ_{i=1}^{N} Σ_{k=1}^{K} |S_i(α_k^t)|² = K Mt.  (28)

Since Σ_{k=1}^{K} |S_i(α_k^t)|² ≥ 0 for i ∈ N+_N, the optimal solution of the min-max optimization problem is

Σ_{k=1}^{K} |S_i(α_k^t)|² = K Mt / N, for all i ∈ N+_N.  (29)

The solution, as shown in (29), depends on the specific target spatial frequencies {α_k^t}_{k∈N+_K}. Since these angles are not known, we need to consider every possible angle θ_l in the angle space [−π/2, π/2], or every α_l^t ∈ [−1/2, 1/2]. Thus, the optimal waveforms should satisfy the sufficient condition:

|S_i(α_l^t)|² = Mt / N, for all i ∈ N+_N and ∀ α_l^t ∈ [−1/2, 1/2].  (30)

The condition of (30) indicates that the power spectrum of each snapshot should be flat in the spatial frequency range α_l^t ∈ [−1/2, 1/2]; thus each waveform snapshot must be a white-noise-type sequence with variance Mt/N. This completes the proof of Theorem 2.

Under the optimal waveform conditions stated in Theorem 2, the coherence of the matrix W is asymptotically optimal, as stated in the following theorem.

Theorem 3. (Coherence of W): Consider the MIMO radar system as presented in Section II-B and K distinct targets. Let the minimum spatial frequency separation of the targets be

x = min_{(i,j)∈N+_K×N+_K, i≠j} (d_h/λ) |sin(θ_i) − sin(θ_j)|, h ∈ {t, r},  (31)

and assume that

|x| ≥ ξ_h ≠ 0, h ∈ {t, r}.  (32)


Let us also define

β_Mh(x) = sin²(πMh x) / (Mh sin²(πx)).  (33)

For dt = dr = λ/2 and under the optimal waveform conditions stated in Theorem 2, as long as

K ≤ min_{h∈{t,r}} √(Mh / β_Mh(ξ_h)),  (34)

the matrix W obeys the conditions (A0) and (A1) with

µ0 ≜ max_{h∈{t,r}} √Mh / (√Mh − (K − 1)√(β_Mh(ξ_h))),  (35)

µ1 ≜ max_{h∈{t,r}} √(Mh K) / (√Mh − (K − 1)√(β_Mh(ξ_h))),  (36)

with probability 1.

Proof of Theorem 3. Following Theorem 2, for waveforms that satisfy the necessary condition (29), it holds that

µ(V) ≤ inf_S { (N/K) sup_{i∈N+_N} Σ_{k=1}^{K} |S_i(α_k^t)|² / (Mt − (K − 1)√(Mt β_Mt(ξ_t))) }
 = (N/K) (K Mt / N) / (Mt − (K − 1)√(Mt β_Mt(ξ_t)))
 = √Mt / (√Mt − (K − 1)√(β_Mt(ξ_t))).  (37)

Consequently, under the optimal waveform conditions, and via inequality (20), we have

µ0 ≜ max(µ(U), µ(V)) = max_{h∈{t,r}} √Mh / (√Mh − (K − 1)√(β_Mh(ξ_h))).  (38)

It was shown in [12] that in the general case µ1 = µ0 √K always holds true. Then, one can choose

µ1 = µ0 √K = max_{h∈{t,r}} √(Mh K) / (√Mh − (K − 1)√(β_Mh(ξ_h))).  (39)

Consequently, the conditions (A0) and (A1) hold.


B. Remarks

1) Asymptotic Optimal Coherence of W: It should be noted that the kernel β_Mh(x), h ∈ {t, r}, is a periodic function of x. In Fig. 1, we plot the value of β_Mh(x) for x ≥ 0 and Mh = 10.

For dt = dr = λ/2, the spatial frequency separation corresponding to both the transmit and receive arrays, defined in (31), satisfies |x| ∈ (0, 1/2]. If ξ ≜ max{ξ_r, ξ_t} ≠ 0, we can find a small constant ξ with 0 < ξ < 1/min{Mt, Mr} such that the Dirichlet kernel satisfies sin(πMh ξ)/sin(πξ) = O(1) and the kernel β_Mh(ξ) satisfies √(β_Mh(ξ)) = sin(πMh ξ)/(√Mh sin(πξ)) = O(1/√Mh). Consequently, the values of β_Mh(ξ) decrease as Mh, h ∈ {t, r}, increase. Then, for any fixed K, if √Mh ≥ K√(β_Mh(ξ)), or equivalently

Mh ≥ K sin(πMh ξ)/sin(πξ) = O(K),  (40)

both (20) and (37) hold. Consequently, under the optimal waveform conditions, it holds that

lim_{Mt→∞} µ(V) ≤ lim_{Mt→∞} √Mt / (√Mt − (K − 1)√(β_Mt(ξ))) = 1.  (41)

Since µ(V) ≥ 1 via Definition 1, it must hold that, under the optimal waveform conditions, µ(V) = 1 in the limit w.r.t. Mt. Similarly, it must hold that µ(U) = 1 in the limit w.r.t. Mr. As a result, the coherence of W is asymptotically optimal.

It should be noted that the spatial frequency separation requirement defined in (32) is not restrictive. For example, in a ULA with M antennas, the spatial frequency separation of the targets should be larger than the resolution of the array, i.e., 1/M. As can be seen in the proof of Theorem 3, Theorem 3 holds even when the spatial frequency separation of the targets is less than the resolution of the array.
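The convergence of the bound in (37)/(41) toward 1 can be illustrated numerically; the sketch below (our illustration) evaluates √Mt / (√Mt − (K − 1)√(β_Mt(ξ))) for a fixed, arbitrarily chosen separation ξ and increasing Mt.

```python
import numpy as np

def fejer(x, M):
    """beta_M(x) = sin^2(pi M x) / (M sin^2(pi x)), the kernel of eq. (33)."""
    return np.sin(np.pi * M * x) ** 2 / (M * np.sin(np.pi * x) ** 2)

K, xi = 3, 0.05                            # illustrative K and separation xi
bounds = []
for Mt in [16, 64, 256, 1024]:
    b = fejer(xi, Mt)
    # Coherence bound of eq. (37): sqrt(Mt) / (sqrt(Mt) - (K-1) sqrt(beta))
    bounds.append(np.sqrt(Mt) / (np.sqrt(Mt) - (K - 1) * np.sqrt(b)))
print(bounds)   # decreases toward the optimal value 1 as Mt grows
```

Since β_Mt(ξ) = O(1/Mt) at a fixed ξ here, the subtracted term vanishes relative to √Mt and the bound tends to 1, matching (41).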

The asymptotic optimality of the coherence of W also holds for a ULA configuration with dt = λ/2 and dr = Mt λ/2. Such a configuration is of interest because it achieves MrMt degrees of freedom [4]. In the following, we show the asymptotic optimality of the coherence for this case. Under the optimal waveforms, if the minimum spatial frequency separation corresponding to the transmit array satisfies |x| ≥ ξ_t for 0 < ξ_t < 1/√Mt such that β_Mt(ξ_t) = O(1/√Mt), it still holds that

lim_{Mt→∞} µ(V) ≤ lim_{Mt→∞} √Mt / (√Mt − (K − 1)√(β_Mt(ξ_t))) = 1.  (42)

For dr = Mt λ/2, the spatial frequency separation corresponding to the receive array, defined in (31), satisfies |x| ∈ (0, Mt/2]. The period of the kernel β_Mh(x), h ∈ {t, r}, is 1 and, within one period, e.g., x ∈ [k, k + 1], its value is symmetric about x = k + 1/2 for k = 0, ..., Mt/2 − 1 (see Fig. 1). In addition, the value of the kernel β_Mh(x), h ∈ {t, r}, decreases nonmonotonically when x ∈ [k, k + 1/2] and increases nonmonotonically when x ∈ [k + 1/2, k + 1]. We can find a minimal separation ξ_r ∈ (0, 1/Mr) such that sin(πMr ξ_r)/sin(πξ_r) = O(1) and thus √(β_Mr(ξ_r)) = O(1/√Mr). If the minimum spatial frequency separation satisfies

|x| ∈ ∪_k [k + ξ_r, k + 1 − ξ_r], for k = 0, 1, ..., Mt/2 − 1,  (43)

it holds that

lim_{Mr→∞} µ(U) ≤ lim_{Mr→∞} √Mr / (√Mr − (K − 1)√(β_Mr(ξ_r))) = 1.  (44)

Therefore, the asymptotic optimality of the coherence of W holds.

The minimal spatial frequency separation condition defined in (43) can be viewed in a probabilistic fashion. If x is uniformly distributed in [0, Mt/2], the probability that the condition of (43) holds is 1 − 2ξ_r > 1 − 2/Mr; this probability tends to 1 as Mr increases.

2) Coherence and Doppler Shift: It can be easily seen from Theorem 3 that the coherence of W does not depend on the Doppler shifts {ν_k}_{k∈N+_K}.

C. Comparative Study of Hadamard and Gaussian Orthogonal Waveforms

This section provides a comparative study of Hadamard and randomly generated Gaussian orthogonal (G-Orth) waveforms in terms of the resulting coherence of W and the corresponding matrix completion performance. Let us consider a ULA transmit array with carrier frequency fc = 1 × 10⁹ Hz, dt = λ/2, Mt = 40 and N = 64. The corresponding waveform matrix S ∈ C^{N×Mt} contains Hadamard or G-Orth waveforms in its columns. Two targets are considered at θ1 = 0°, θ2 = 20°, or equivalently, α1^t = 0, α2^t = (1/2) sin(π/9). The target angle search space is [−90°, 90°] and the corresponding spatial frequency range is α_k^t ∈ [−1/2, 1/2]. The power spectra of the i-th waveform snapshot, i ∈ N+_N, corresponding to Hadamard and G-Orth waveforms are shown in Fig. 2. It can be seen in Fig. 2(a) that some power spectra of the snapshots corresponding to Hadamard waveforms attain high values at certain angles, while for the G-Orth case all values are small and spread evenly across all snapshots and angles (see Fig. 2(c)). Thus, according to (27), the coherence bound µ(V) corresponding to Hadamard waveforms should be larger than that corresponding to G-Orth waveforms. The lower coherence bound should result in better MC recovery performance for the G-Orth waveforms. Indeed, this can be seen in Fig. 3, where the corresponding matrix recovery error is plotted versus the fraction p of the entries of W that were used. It can be seen in Fig. 3 that, for p > 0.25, the recovery error corresponding to G-Orth waveforms drops below the inverse SNR level, denoting very good matrix

recovery performance and even some noise smoothing. In the simulations, the data matrix corresponding to the two targets is recovered via the SVT algorithm [29] for Mr = 128, Mt = 40, N = 64 and SNR = 25 dB. The simulation results are averaged over 50 independent runs. In each run, the G-Orth waveforms are randomly generated.
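The qualitative contrast between the two waveform families can be reproduced in a few lines. The sketch below (our illustration) builds a Sylvester-construction Hadamard matrix and a QR-orthogonalized Gaussian matrix, and compares the peaks of the snapshot power spectra |S_i(α_l)|² against the flat level Mt/N of (30).

```python
import numpy as np

N, Mt, L = 64, 40, 181
# Sylvester-construction Hadamard matrix (64 = 2^6); scaled columns are orthonormal
H = np.array([[1.0]])
for _ in range(6):
    H = np.block([[H, H], [H, -H]])
S_had = H[:, :Mt] / np.sqrt(N)                       # S^H S = I_Mt

rng = np.random.default_rng(5)
G = rng.standard_normal((N, Mt)) + 1j * rng.standard_normal((N, Mt))
S_go, _ = np.linalg.qr(G)                            # Gaussian orthogonal (G-Orth)

alpha = np.linspace(-0.5, 0.5, L)                    # spatial frequency grid
E = np.exp(-1j * 2 * np.pi * np.outer(np.arange(1, Mt + 1), alpha))
P_had = np.abs(S_had @ E) ** 2                       # snapshot spectra |S_i(alpha_l)|^2
P_go = np.abs(S_go @ E) ** 2

# For any orthogonal S, the spectra average to the flat level Mt/N of eq. (30);
# Hadamard snapshots, however, peak far above it (e.g., the all-ones first row)
assert np.isclose(P_had.mean(), Mt / N) and np.isclose(P_go.mean(), Mt / N)
assert P_had.max() > 2 * P_go.max()
```

The mean check reflects Lemma 1, while the peak comparison is the spectral non-flatness that, via (27), penalizes the Hadamard coherence bound.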

IV. WAVEFORM DESIGN UNDER SPATIAL POWER SPECTRA CONSTRAINTS

Theorem 2 states that, among the class of orthogonal waveforms, and for MIMO radars using ULAs, the optimal waveform matrix should have rows that are white-noise type functions, i.e., the waveform snapshots across the transmit antennas should be white. In this section, we propose a scheme to optimally design the transmit waveform matrix for MIMO-MC radars. In particular, the design problem is formulated as optimization on matrix manifolds [30]. Due to the orthogonality constraint on the transmit waveforms, i.e., the columns of matrix S, the matrix manifold is the complex Stiefel manifold, which is non-convex. The solution can be obtained via the modified steepest descent algorithm [27], or the modified Newton algorithm with a nonmonotone line search method [31].

A. Problem Formulation

Let us discretize the angle space [−π/2, π/2] into L angles {θ_l}_{l∈N+_L}, corresponding to the spatial frequencies {α_l^t}_{l∈N+_L}. Let c_il = S^{*(i)} A^*(θ_l) for i ∈ N+_N, l ∈ N+_L. According to the optimal condition (30), it holds that

|c_il|² = |S_i(α_l^t)|² = Mt/N, i ∈ N+_N, l ∈ N+_L.  (45)

Define A = [A(θ_1), ..., A(θ_L)] and F = S^* A^*. It holds that [F ⊙ F^*]_il = |c_il|². Based on (30), let us define the objective function

f(S) = ∥F ⊙ F^* − (Mt/N) 1_N 1_L^T∥²_F.  (46)

min f(S)

s.t. S^H S = I_Mt.  (47)

Due to the orthogonality constraint, S belongs to the complex Stiefel manifold S(N, Mt), defined as

S(N, Mt) = {S ∈ C^{N×Mt} : S^H S = I_Mt}.  (48)
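The cost (46) and the Stiefel constraint (48) are easy to evaluate directly. The sketch below is our illustration: it computes f(S) for a QR-orthogonalized Gaussian S on an arbitrary angle grid (sizes and seed are assumptions, not the paper's).

```python
import numpy as np

def cost_f(S, A):
    """f(S) = || F (.) F* - (Mt/N) 1_N 1_L^T ||_F^2 with F = S* A*, eq. (46),
    where (.) is the elementwise (Hadamard) product."""
    N, Mt = S.shape
    F = S.conj() @ A.conj()                  # N x L matrix of the c_il values
    Gdev = (F * F.conj()).real - Mt / N      # deviation from the flat level (45)
    return float(np.sum(Gdev ** 2))

rng = np.random.default_rng(6)
N, Mt, L = 32, 8, 91
alpha = np.linspace(-0.5, 0.5, L)            # discretized frequency grid
A = np.exp(1j * 2 * np.pi * np.arange(Mt)[:, None] * alpha[None, :])  # Mt x L
S, _ = np.linalg.qr(rng.standard_normal((N, Mt)) + 1j * rng.standard_normal((N, Mt)))
assert np.allclose(S.conj().T @ S, np.eye(Mt), atol=1e-10)  # S in S(N, Mt), eq. (48)
val = cost_f(S, A)
assert val >= 0.0
```

A manifold optimizer would decrease val while retractions keep S on S(N, Mt); a perfectly flat-spectrum S would attain f(S) = 0.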


The nonconvexity of the orthogonality constraint, i.e., of the complex Stiefel manifold, makes the waveform design problem challenging. In the following, we adopt the modified steepest descent algorithm [27] or the modified Newton algorithm on the Stiefel manifold to solve the problem of (47).

B. Derivative and Hessian of the Cost Function $f(\mathbf{S})$

In this subsection, we derive the derivative and Hessian of the cost function $f(\mathbf{S})$ defined in (46) w.r.t. the variable $\mathbf{S}$. First, based on the second-order Taylor series approximation (see [27]), the cost function $f : \mathbb{C}^{N \times M_t} \rightarrow \mathbb{R}$ can be written as
\[
f(\mathbf{S} + \delta\mathbf{Z}) = f(\mathbf{S}) + \delta\,\Re\left\{\mathrm{tr}\left(\mathbf{Z}^H\mathbf{D}_\mathbf{S}\right)\right\} + \frac{\delta^2}{2}\,\mathrm{vec}(\mathbf{Z})^H\mathbf{H}_\mathbf{S}\,\mathrm{vec}(\mathbf{Z}) + \frac{\delta^2}{2}\,\Re\left\{\mathrm{vec}(\mathbf{Z})^T\mathbf{C}_\mathbf{S}\,\mathrm{vec}(\mathbf{Z})\right\} + O\left(\delta^3\right), \tag{49}
\]

where $\mathbf{D}_\mathbf{S} \in \mathbb{C}^{N \times M_t}$ is the derivative of $f$ evaluated at $\mathbf{S}$, and the matrix pair $\mathbf{H}_\mathbf{S}, \mathbf{C}_\mathbf{S} \in \mathbb{C}^{NM_t \times NM_t}$ constitutes the Hessian of $f$ evaluated at $\mathbf{S}$. To ensure uniqueness, we require $\mathbf{H}_\mathbf{S} = \mathbf{H}_\mathbf{S}^H$ and $\mathbf{C}_\mathbf{S} = \mathbf{C}_\mathbf{S}^T$.

The complex-valued derivative $\mathbf{D}_\mathbf{S}$ is used in the modified steepest descent method. To calculate the Newton direction with the standard Newton method [32], we will use the second-order Taylor series expansion of the function $f : \mathbb{R}^{2NM_t} \rightarrow \mathbb{R}$ in the commonly known vector form with real-valued elements,
\[
f(\mathbf{s} + \delta\mathbf{z}) = f(\mathbf{s}) + \delta\,\mathbf{z}^T\mathbf{d} + \frac{\delta^2}{2}\,\mathbf{z}^T\mathbf{H}\mathbf{z} + O\left(\delta^3\right), \tag{50}
\]
where the vector $\mathbf{s} \in \mathbb{R}^{2NM_t}$ is defined as
\[
\mathbf{s} \triangleq \begin{bmatrix} \mathbf{s}_{re} \\ \mathbf{s}_{im} \end{bmatrix} \triangleq \begin{bmatrix} \Re\{\mathrm{vec}(\mathbf{S})\} \\ \Im\{\mathrm{vec}(\mathbf{S})\} \end{bmatrix}. \tag{51}
\]

In the above, $\mathbf{d} \in \mathbb{R}^{2NM_t}$ is the derivative of $f(\mathbf{s})$ evaluated at $\mathbf{s}$, and $\mathbf{H} \in \mathbb{R}^{2NM_t \times 2NM_t}$ is the Hessian of $f(\mathbf{s})$ evaluated at $\mathbf{s}$ (for definitions, see Section IV-D). The derivatives and Hessians appearing in the two Taylor series expansions (49) and (50) can be transformed into each other [33]. In the following, we list the derivative and Hessian of the cost function $f(\mathbf{S})$. The derivation details are given in Appendix B. It holds that
\[
\mathbf{D}_\mathbf{S} = 2\left\{\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right) - \mathbf{N}\right] \odot \mathbf{Y}^T\right\}\mathbf{A}^H, \tag{52}
\]
\[
\mathbf{H}_\mathbf{S} = 4\,\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}^*\right)\mathrm{diag}\left(\mathrm{vec}\left(2\tilde{\mathbf{H}}^T - \mathbf{N}^T\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}, \tag{53}
\]
\[
\mathbf{C}_\mathbf{S} = 4\,\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}\right)\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^* \odot \mathbf{Y}^*\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}, \tag{54}
\]

where $\mathbf{N} = \frac{M_t}{N}\mathbf{1}_N\mathbf{1}_L^T \in \mathbb{R}^{N \times L}$, $\tilde{\mathbf{H}} = \left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right) \in \mathbb{R}^{N \times L}$ and $\mathbf{Y} = \mathbf{A}^T\mathbf{S}^T \in \mathbb{C}^{L \times N}$. In the above, $\mathbf{P}_{N \times M_t}$ is a commutation matrix, such that
\[
\mathrm{vec}\left(\mathbf{Z}^T\right) = \mathbf{P}_{N \times M_t}\,\mathrm{vec}(\mathbf{Z}), \tag{55}
\]
which can be expressed as [34]

\[
\mathbf{P}_{N \times M_t} = \sum_{m=1}^{N}\sum_{n=1}^{M_t} \mathbf{E}_{mn} \otimes \mathbf{E}_{mn}^T, \tag{56}
\]
where $\mathbf{E}_{mn}$ is a matrix of dimension $N \times M_t$ with $1$ at its $mn$-th position and zeros elsewhere. It holds that $\mathbf{P}_{N \times M_t} = \mathbf{P}_{M_t \times N}^T$. It is easy to verify that $\mathbf{H}_\mathbf{S} = \mathbf{H}_\mathbf{S}^H$ and $\mathbf{C}_\mathbf{S} = \mathbf{C}_\mathbf{S}^T$.
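The construction (56) and the vec-permutation property (55) are straightforward to verify numerically; the sketch below (Python/NumPy, small illustrative dimensions) builds the commutation matrix element-wise and checks both properties, using column-stacking for $\mathrm{vec}(\cdot)$.

```python
import numpy as np

N, Mt = 3, 2

# Commutation matrix P_{N x Mt} from Eq. (56): sum of E_mn kron E_mn^T,
# where E_mn is N x Mt with a single 1 at position (m, n).
P = np.zeros((N * Mt, N * Mt))
for m in range(N):
    for n in range(Mt):
        E = np.zeros((N, Mt)); E[m, n] = 1.0
        P += np.kron(E, E.T)

rng = np.random.default_rng(2)
Z = rng.standard_normal((N, Mt)) + 1j * rng.standard_normal((N, Mt))

vec = lambda M: M.reshape(-1, order="F")   # column-stacking vec(.)
# Property (55): vec(Z^T) = P_{N x Mt} vec(Z)
assert np.allclose(vec(Z.T), P @ vec(Z))
# P is a permutation matrix, so P P^T = I (consistent with P_{N x Mt}^T = P_{Mt x N})
assert np.allclose(P @ P.T, np.eye(N * Mt))
```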

C. Modified Steepest Descent on the Complex Stiefel Manifold

Here, we apply the modified steepest descent method of [27] to solve problem (47).

Let $\mathcal{T}_\mathbf{S}(N, M_t)$ denote the tangent space, i.e., the plane that is tangent to the complex Stiefel manifold at point $\mathbf{S} \in \mathcal{S}(N, M_t)$ [26]. The inner product in the tangent space is defined using the canonical metric [26] in the complex-valued case, i.e.,
\[
\langle \mathbf{Z}_1, \mathbf{Z}_2 \rangle = \Re\left\{\mathrm{tr}\left[\mathbf{Z}_2^H\left(\mathbf{I} - \frac{1}{2}\mathbf{S}\mathbf{S}^H\right)\mathbf{Z}_1\right]\right\}, \tag{57}
\]
for $\mathbf{Z}_1, \mathbf{Z}_2 \in \mathcal{T}_\mathbf{S}(N, M_t)$. Let $\mathbf{Z}^k \in \mathcal{T}_{\mathbf{S}^k}(N, M_t)$ be the steepest descent direction at point $\mathbf{S}^k \in \mathcal{S}(N, M_t)$ in the $k$-th iteration. The steepest descent algorithm starts from point $\mathbf{S}^k$ and moves along $\mathbf{Z}^k$ with a step size $\delta$, i.e.,

\[
\mathbf{S}^{k+1} = \mathbf{S}^k + \delta\mathbf{Z}^k. \tag{58}
\]

To preserve orthogonality during the update steps, the new point $\mathbf{S}^{k+1}$ is projected back onto the complex Stiefel manifold, i.e., $\mathbf{S}^{k+1} = \Pi\left(\mathbf{S}^k + \delta\mathbf{Z}^k\right)$, where $\Pi$ is the projection operator. For a matrix $\mathbf{S} \in \mathbb{C}^{N \times M_t}$ with $N \geq M_t$ and with SVD $\mathbf{S} = \tilde{\mathbf{U}}\boldsymbol{\Sigma}\tilde{\mathbf{V}}^H$, the point on the Stiefel manifold that is nearest to $\mathbf{S}$ in the Frobenius norm sense is given by $\Pi(\mathbf{S}) = \tilde{\mathbf{U}}\mathbf{I}_{N,M_t}\tilde{\mathbf{V}}^H$ [27]. The modified steepest descent is defined as follows [27]. Let $g\left(\mathbf{Z}^k\right) = f\left(\Pi\left(\mathbf{S}^k + \mathbf{Z}^k\right)\right)$ be the local cost function at $\mathbf{S}^k \in \mathcal{S}(N, M_t)$. The gradient of $g\left(\mathbf{Z}^k\right)$ at $\mathbf{Z}^k = \mathbf{0}$ under the canonical inner product (57) is
\[
\tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^k\right) = \nabla_\mathbf{S} f\left(\mathbf{S}^k\right) - \mathbf{S}^k\left(\nabla_\mathbf{S} f\left(\mathbf{S}^k\right)\right)^H\mathbf{S}^k, \tag{59}
\]

where $\nabla_\mathbf{S} f(\mathbf{S}) = \mathbf{D}_\mathbf{S}$ denotes the derivative of $f(\mathbf{S})$ (see (52)). Then, the modified steepest descent direction is $\mathbf{Z}^k = -\tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^k\right)$.
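As a sanity check on the derivative expression (52), one can compare $\mathbf{D}_\mathbf{S}$ against a finite-difference gradient of $f$. The sketch below (Python/NumPy, with a random complex dictionary $\mathbf{A}$ in place of a steering matrix, and illustrative dimensions) verifies that $\mathbf{D}_\mathbf{S}$ is parallel to the numerical gradient, which is all a line-search descent method needs, since any constant scale factor is absorbed by the step size.

```python
import numpy as np

rng = np.random.default_rng(1)
N, Mt, L = 12, 4, 7

# Random complex dictionary A (Mt x L) and orthonormal S; illustrative sizes only.
A = rng.standard_normal((Mt, L)) + 1j * rng.standard_normal((Mt, L))
S, _ = np.linalg.qr(rng.standard_normal((N, Mt)) + 1j * rng.standard_normal((N, Mt)))
Nmat = (Mt / N) * np.ones((N, L))

def f(S):
    G = np.abs(S @ A) ** 2          # (S*A*) o (SA), elementwise |[SA]_il|^2
    return np.sum((G - Nmat) ** 2)  # Eq. (46)

def D(S):
    # Eq. (52): D_S = 2{[(S*A*) o (SA) - N] o Y^T} A^H, with Y = A^T S^T
    G = np.abs(S @ A) ** 2
    Y = A.T @ S.T
    return 2 * ((G - Nmat) * Y.T) @ A.conj().T

# Central-difference gradient w.r.t. the stacked real/imag parts of S.
eps, DS, n = 1e-6, D(S), N * Mt
g_fd = np.zeros(2 * n)
for k in range(2 * n):
    e = np.zeros(2 * n); e[k] = eps
    dS = (e[:n] + 1j * e[n:]).reshape(N, Mt)
    g_fd[k] = (f(S + dS) - f(S - dS)) / (2 * eps)

g_an = np.concatenate([DS.real.ravel(), DS.imag.ravel()])
cos = g_fd @ g_an / (np.linalg.norm(g_fd) * np.linalg.norm(g_an))
print(cos)  # close to 1: -D_S points along the steepest descent direction
```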


The step size $\delta$ is chosen using a nonmonotone line search method based on [31], i.e., so that
\[
f\left(\Pi\left(\mathbf{S}^k + \delta\mathbf{Z}^k\right)\right) \leq C_k + \beta\delta\left\langle \tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^k\right), \mathbf{Z}^k \right\rangle, \tag{60}
\]
\[
\left\langle \tilde{\nabla}_\mathbf{S} f\left(\Pi\left(\mathbf{S}^k + \delta\mathbf{Z}^k\right)\right), \mathbf{Z}^k \right\rangle \geq \sigma\left\langle \tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^k\right), \mathbf{Z}^k \right\rangle. \tag{61}
\]
Here, $C_k$ is taken to be a convex combination of the function values $f\left(\mathbf{S}^0\right), f\left(\mathbf{S}^1\right), \ldots, f\left(\mathbf{S}^k\right)$, i.e.,
\[
C_{k+1} = \left[\eta Q_k C_k + f\left(\mathbf{S}^{k+1}\right)\right] \big/ Q_{k+1}, \tag{62}
\]
where $Q_{k+1} = \eta Q_k + 1$, $C_0 = f\left(\mathbf{S}^0\right)$ and $Q_0 = 1$. In the above, the parameter $\eta$ controls the degree of nonmonotonicity. When $\eta = 0$, the line search is the usual monotone Wolfe or Armijo line search [35]. When $\eta = 1$, then
\[
C_k = \frac{1}{k+1}\sum_{i=0}^{k} f\left(\mathbf{S}^i\right). \tag{63}
\]
The modified steepest descent algorithm is summarized in Algorithm 1.

Algorithm 1 Modified steepest descent algorithm
1: Initialize: Choose $\mathbf{S}^0 \in \mathcal{S}(N, M_t)$ and parameters $\alpha, \eta, \epsilon \in (0, 1)$, $0 < \beta < \sigma < 1$. Set $\delta = 1$, $C_0 = f\left(\mathbf{S}^0\right)$, $Q_0 = 1$, $k = 0$.
2: Descent direction update: Compute the descent direction as $\mathbf{Z}^k = -\tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^k\right)$ via equation (59).
3: Convergence test: If $\left\langle \mathbf{Z}^k, \mathbf{Z}^k \right\rangle \leq \epsilon$, then stop.
4: Line search update: Compute $\mathbf{S}^{k+1} = \Pi\left(\mathbf{S}^k + \delta\mathbf{Z}^k\right)$ and $\tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^{k+1}\right)$. If $f\left(\mathbf{S}^{k+1}\right) \geq \beta\delta\left\langle \tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^k\right), \mathbf{Z}^k \right\rangle + C_k$ and $\left\langle \tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^{k+1}\right), \mathbf{Z}^k \right\rangle \leq \sigma\left\langle \tilde{\nabla}_\mathbf{S} f\left(\mathbf{S}^k\right), \mathbf{Z}^k \right\rangle$, then set $\delta = \alpha\delta$ and repeat Step 4.
5: Cost update: $Q_{k+1} = \eta Q_k + 1$, $C_{k+1} = \left[\eta Q_k C_k + f\left(\mathbf{S}^{k+1}\right)\right] \big/ Q_{k+1}$.
6: Perform the update $\mathbf{S}^{k+1} = \Pi\left(\mathbf{S}^k + \delta\mathbf{Z}^k\right)$, $k = k + 1$. Go to Step 2.
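A compact Python/NumPy sketch of this scheme follows. For brevity it replaces the full nonmonotone conditions (60)-(61) with a plain monotone Armijo-type backtracking rule (the $\eta = 0$ case, with a simplified sufficient-decrease surrogate), and uses illustrative dimensions chosen so that the zero-objective feasibility condition discussed in Section V-A holds; it is a sketch, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
N, Mt, L = 32, 8, 11
theta = np.deg2rad(np.linspace(-10, 10, L))
# Transmit steering dictionary for a half-wavelength ULA (Mt x L).
A = np.exp(1j * np.pi * np.outer(np.arange(Mt), np.sin(theta)))
Nmat = (Mt / N) * np.ones((N, L))

f = lambda S: np.sum((np.abs(S @ A) ** 2 - Nmat) ** 2)                    # Eq. (46)
grad = lambda S: 2 * ((np.abs(S @ A) ** 2 - Nmat) * (S @ A)) @ A.conj().T  # Eq. (52)

def proj(S):  # Pi: nearest point on the Stiefel manifold, via the reduced SVD
    U, _, Vh = np.linalg.svd(S, full_matrices=False)
    return U @ Vh

S, _ = np.linalg.qr(rng.standard_normal((N, Mt)) + 1j * rng.standard_normal((N, Mt)))
f0 = f(S)
for _ in range(300):
    D = grad(S)
    Z = -(D - S @ D.conj().T @ S)        # canonical gradient direction, Eq. (59)
    delta = 1.0
    # Backtracking with a simplified sufficient-decrease surrogate.
    while f(proj(S + delta * Z)) > f(S) - 1e-4 * delta * np.linalg.norm(Z) ** 2 and delta > 1e-12:
        delta *= 0.5
    S = proj(S + delta * Z)
print(f0, f(S))  # the objective decreases markedly from its initial value
```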

D. Modified Newton Algorithm on the Complex Stiefel Manifold

With the expressions for the derivative and Hessian of the cost function given in (52), (53) and (54), we can now formulate the Newton method [32] to solve the waveform design problem (47). First, the Newton search direction is calculated as follows. Let $\mathbf{Z}^k \in \mathbb{C}^{N \times M_t}$ denote the Newton search direction in the $k$-th iteration. In a similar way as in [36], we arrange the complex-valued elements of


$\mathbf{Z}^k$ into a real-valued vector $\mathbf{z}^k$ of length $2NM_t$, defined as
\[
\mathbf{z}^k \triangleq \begin{bmatrix} \mathbf{z}_{re}^k \\ \mathbf{z}_{im}^k \end{bmatrix} \triangleq \begin{bmatrix} \Re\left\{\mathrm{vec}\left(\mathbf{Z}^k\right)\right\} \\ \Im\left\{\mathrm{vec}\left(\mathbf{Z}^k\right)\right\} \end{bmatrix}. \tag{64}
\]
Let us also define the real-valued vector
\[
\mathbf{d}^k = \begin{bmatrix} \Re\left\{\mathrm{vec}\left(\mathbf{D}_{\mathbf{S}^k}\right)\right\} \\ \Im\left\{\mathrm{vec}\left(\mathbf{D}_{\mathbf{S}^k}\right)\right\} \end{bmatrix}, \tag{65}
\]
and the real-valued matrix
\[
\mathbf{H}^k = \begin{bmatrix} \Re\left\{\mathbf{H}_{\mathbf{S}^k} + \mathbf{C}_{\mathbf{S}^k}\right\} & -\Im\left\{\mathbf{H}_{\mathbf{S}^k} + \mathbf{C}_{\mathbf{S}^k}\right\} \\ \Im\left\{\mathbf{H}_{\mathbf{S}^k} - \mathbf{C}_{\mathbf{S}^k}\right\} & \Re\left\{\mathbf{H}_{\mathbf{S}^k} - \mathbf{C}_{\mathbf{S}^k}\right\} \end{bmatrix}. \tag{66}
\]
Following the standard Newton method, the vector $\mathbf{z}^k$ is computed as
\[
\mathbf{z}^k = -\left[\mathbf{H}^k + \sigma_k\mathbf{I}\right]^{-1}\mathbf{d}^k, \tag{67}
\]

where $\sigma_k \geq 0$ is chosen to make the matrix $\mathbf{H}^k + \sigma_k\mathbf{I}$ positive definite. Consequently, the complex-valued Newton search direction can be found as $\mathbf{Z}^k = \mathrm{mat}_{N \times M_t}\left(\mathbf{z}_{re}^k + j\mathbf{z}_{im}^k\right)$, corresponding to the inverse of the vector operation defined in (64). In the $k$-th iteration, the standard Newton method performs the update

\[
\mathbf{S}^{k+1} = \mathbf{S}^k + \delta\mathbf{Z}^k, \tag{68}
\]
where $\delta$ is the step size. Since the waveforms lie on the complex Stiefel manifold, to preserve orthogonality in the modified Newton method, the new point $\mathbf{S}^{k+1}$ is projected back onto the complex Stiefel manifold, i.e.,
\[
\mathbf{S}^{k+1} = \Pi\left(\mathbf{S}^k + \delta\mathbf{Z}^k\right). \tag{69}
\]

It should be pointed out that the matrix $\mathbf{H}^k$ defined in (66) is not always positive definite. In each step we choose $\sigma_k$ such that $\mathbf{H}^k + \sigma_k\mathbf{I}$ is positive definite. If the matrix $\mathbf{H}^k$ has nonpositive eigenvalues, $\sigma_k$ should be larger than $-\lambda_{\min}\left(\mathbf{H}^k\right)$, where $\lambda_{\min}$ denotes the minimal eigenvalue of $\mathbf{H}^k$. In the neighborhood of the minimum, the modified Newton update approaches the pure Newton step. However, if $\sigma_k$ is chosen very large, the modified Newton search direction becomes close to the negative steepest descent direction. Last, the step size $\delta$ can be obtained using a similar nonmonotone line search method [31]. Throughout the modified Newton method, the inner product of two matrices $\mathbf{Z}_1, \mathbf{Z}_2 \in \mathbb{C}^{N \times M_t}$ is defined as $\langle \mathbf{Z}_1, \mathbf{Z}_2 \rangle = \mathrm{tr}\left(\mathbf{Z}_1^H\mathbf{Z}_2\right)$. The modified Newton algorithm is summarized in Algorithm 2.
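The choice of $\sigma_k$ can be sketched as follows (Python/NumPy, with a random symmetric matrix standing in for the real Hessian $\mathbf{H}^k$): shift by slightly more than the magnitude of the most negative eigenvalue, which guarantees positive definiteness and hence a descent direction.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
# A random symmetric matrix standing in for H^k, which need not be
# positive definite away from a minimum.
Hk = rng.standard_normal((n, n)); Hk = (Hk + Hk.T) / 2

lam_min = np.linalg.eigvalsh(Hk)[0]           # smallest eigenvalue
# sigma_k > -lambda_min (plus a margin) guarantees H^k + sigma_k I > 0.
sigma = max(0.0, -lam_min) + 1e-6
shifted = Hk + sigma * np.eye(n)
assert np.linalg.eigvalsh(shifted)[0] > 0

# The modified Newton step then solves (H^k + sigma_k I) z = -d^k, as in (67).
d = rng.standard_normal(n)
z = -np.linalg.solve(shifted, d)
assert z @ d < 0   # z is a descent direction for the real-vector model
```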


Algorithm 2 Modified Newton algorithm
1: Initialize: Choose $\mathbf{S}^0 \in \mathcal{S}(N, M_t)$ and parameters $\alpha, \eta, \epsilon \in (0, 1)$, $0 < \beta < \sigma < 1$. Set $\delta = 1$, $C_0 = f\left(\mathbf{S}^0\right)$, $Q_0 = 1$, $k = 0$.
2: Compute the derivative $\mathbf{D}_{\mathbf{S}^k}$ with equation (52), as well as the Hessian pair $\mathbf{H}_{\mathbf{S}^k}$, $\mathbf{C}_{\mathbf{S}^k}$ with equations (53) and (54).
3: Convergence test: If $\left\langle \mathbf{D}_{\mathbf{S}^k}, \mathbf{D}_{\mathbf{S}^k} \right\rangle \leq \epsilon$, then stop.
4: Newton search direction computation: Compute the real-valued vector $\mathbf{d}^k$ with (65), as well as the real-valued matrix $\mathbf{H}^k$ with (66). Compute the vector $\mathbf{z}^k$ with (67). The Newton search direction is arranged as $\mathbf{Z}^k = \mathrm{mat}_{N \times M_t}\left(\mathbf{z}_{re}^k + j\mathbf{z}_{im}^k\right)$.
5: Line search update: Compute $\mathbf{S}^{k+1} = \Pi\left(\mathbf{S}^k + \delta\mathbf{Z}^k\right)$ and $\mathbf{D}_{\mathbf{S}^{k+1}}$. If $f\left(\mathbf{S}^{k+1}\right) \geq \beta\delta\,\Re\left\{\left\langle \mathbf{D}_{\mathbf{S}^k}, \mathbf{Z}^k \right\rangle\right\} + C_k$ and $\Re\left\{\left\langle \mathbf{D}_{\mathbf{S}^{k+1}}, \mathbf{Z}^k \right\rangle\right\} \leq \sigma\,\Re\left\{\left\langle \mathbf{D}_{\mathbf{S}^k}, \mathbf{Z}^k \right\rangle\right\}$, then set $\delta = \alpha\delta$ and repeat Step 5.
6: Cost update: $Q_{k+1} = \eta Q_k + 1$, $C_{k+1} = \left[\eta Q_k C_k + f\left(\mathbf{S}^{k+1}\right)\right] \big/ Q_{k+1}$.
7: Perform the update $\mathbf{S}^{k+1} = \Pi\left(\mathbf{S}^k + \delta\mathbf{Z}^k\right)$, $k = k + 1$. Go to Step 2.

V. NUMERICAL RESULTS

In this section, we provide numerical results to demonstrate the proposed waveform matrix design schemes and evaluate their performance in MIMO-MC radars. Both transmit and receive antennas are configured as ULAs with $d_t = d_r = \frac{\lambda}{2}$ and carrier frequency $f_c = 1 \times 10^9$ Hz.

A. Performance Comparison of Waveform Design Methods

We first design the waveform matrix by applying the modified steepest descent method. We take $N = 64$, $M_t = 40$ for the DOA space $\left[-10^\circ : 1^\circ : 10^\circ\right]$; correspondingly, $\alpha_k^t \in [-0.0868, 0.0868]$, the number of discretized angles is $L = 21$, and the steering matrix $\mathbf{A}$ has dimension $40 \times 21$. In the nonmonotone line search, we chose $\beta = 0.01$ and $\sigma = 0.99$. These values are selected by trial and error to satisfy the Wolfe conditions (60) and (61). In addition, we set $\alpha = 0.5$ to adjust the step size, and $\epsilon = 10^{-5}$ as the stopping threshold. The initial step size is set to $\delta = 0.1$. The iteration is initialized with a column-wise Hadamard matrix $\mathbf{S}^0 \in \mathcal{S}(64, 40)$, i.e., a matrix that has Hadamard sequences in its columns. The convergence of the proposed modified steepest descent algorithm for $\eta = 1, 0.5, 0$ is shown in Fig. 4 (a). As can be seen from Fig. 4 (a), the objective value $f$ under $\eta = 0.5$ decreases the fastest, while under $\eta = 1$ it decreases very slowly when the number of iterations is less than 2000. The simulation results

indicate that the performance of the line search method can be improved if the historical objective values are partially utilized in each iteration, as indicated in (62). The simulation results also show that the value of the objective function, $f$, approaches its global minimum, i.e., $0$. The corresponding optimal solution, $\mathbf{S}$, is not unique, and depends on the initial point and the step size. However, as will be seen later (see Fig. 7 (a) for an example), all solutions result in very similar MC recovery performance. Since the complex Stiefel manifold is not a convex set, there is no guarantee that the algorithms will

converge to the global minimum. In problem (47), the number of equations is $\frac{M_t(M_t+1)}{2} + NL$, which should be less than the total number of available unknowns, $2M_tN$. Consequently, to make the objective function zero, it must hold that $L < 2M_t - \frac{M_t^2 + M_t}{2N}$. Our simulations show that for the entire DOA space $\left[-90^\circ : 1^\circ : 90^\circ\right]$, corresponding to $L = 181$, when the dimension of $\mathbf{S}$ is relatively small, e.g., $N = 64$, $M_t = 40$, and therefore $L > 2M_t - \frac{M_t^2 + M_t}{2N} \approx 67$, the objective value gets stuck at local minima; however, if the angle spacing increases, for example to $5^\circ$, corresponding to $L = 37$, the iteration converges to the global minimum. If the dimension is relatively large, e.g., $N = 512$, $M_t = 500$, the objective value converges to its global minimum even for small spacing, i.e., $1^\circ$ (see Fig. 4 (b) for $\eta = 0$). Next, we provide a numerical example to compare the performance of the modified steepest descent algorithm and the modified Newton algorithm. Since the complexity of the Newton method increases with the size of the matrix, we perform the comparison for $N = 32$, $M_t = 20$ and the DOA space $\left[-5^\circ : 1^\circ : 5^\circ\right]$. A column-wise Hadamard waveform matrix $\mathbf{S}^0 \in \mathbb{C}^{32 \times 20}$ is used as the initial search point for both algorithms. The performance comparison is illustrated in Fig. 5, where it can be seen that the value of the objective function $f(\mathbf{S})$ under the modified Newton algorithm decreases much faster than under the modified steepest descent algorithm.
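The counting argument above is simple arithmetic and can be reproduced with a small helper (Python; the function name is ours, introduced only for illustration):

```python
def max_L(Mt, N):
    """Largest grid size L for which the objective can reach zero:
    L < 2*Mt - (Mt**2 + Mt) / (2*N), from counting equations vs. unknowns."""
    return 2 * Mt - (Mt**2 + Mt) / (2 * N)

print(max_L(40, 64))    # 67.1875 -> a 1-degree grid over [-90, 90] (L = 181) is infeasible
print(max_L(500, 512))  # ~755.4  -> L = 181 becomes feasible
```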

B. Spatial Power Spectra of Optimized Waveform Snapshots

The power spectra of the optimized waveform snapshots, i.e., of the rows of the optimized waveform matrix $\mathbf{S}$, are plotted in Fig. 6 (b) and (d) for the same parameters as in Section V-A, i.e., $M_t = 40$, $N = 64$ for the DOA space $\left[-10^\circ : 1^\circ : 10^\circ\right]$, corresponding to $\alpha_k^t \in [-0.0868, 0.0868]$. In the simulation, a waveform matrix that is either column-wise G-Orth or Hadamard is used as the initial search point, respectively. The power spectra of the rows of the initial waveform matrix fluctuate over different DOAs (see Fig. 6 (a) and (c), respectively). It can be seen in Fig. 6 (b) and (d) that the optimized waveform matrix yields almost flat row power spectra with value $\frac{M_t}{N} = 0.625$. According to Theorem 2, with the optimized waveforms, the coherence of matrix $\mathbf{W}$ nearly achieves its minimum value.


C. Matrix Recovery Error Performance

Continuing with the scenario of Section V-B, we look at the MC performance corresponding to the optimized waveform matrix as a function of the fraction of observed entries, $p$. We consider $K = 2$ targets located at $\theta_1 = -2^\circ$, $\theta_2 = 2^\circ$. The number of receive antennas is $M_r = 64$ and the SNR is set to 25 dB. The simulation results are averaged over 50 independent runs, where in each run the noise is randomly generated. In the simulations, the data matrix is recovered via the SVT algorithm of [29]. Figure 7 (a) shows the recovery error, suggesting that the optimized waveform matrix results in significantly better performance as compared to the column-wise Hadamard matrix, especially for small values of $p$. One can see that in order to achieve an error around 5%, MC with the optimized waveforms requires about 20% of the data matrix entries, while MC with a column-wise Hadamard matrix requires more than 70% of the data matrix entries. Also, although the optimized waveforms are not unique, all solutions result in almost identical MC performance. In the same figure, we also compare the optimized waveforms against column-wise Gaussian orthogonal (G-Orth) waveforms. One can see that the former result in lower MC recovery error for smaller $p$'s, while their advantage diminishes for higher $p$'s. Further simulations suggest that the range of $p$ over which the optimized waveforms have an advantage over the G-Orth waveforms shrinks as $M_t$ increases. This observation suggests that, as the number of transmit antennas increases, the G-Orth waveforms behave like optimal ones, in the sense that they achieve a comparable MC performance as the optimized waveforms without involving high computational complexity. Although the waveform design requires angle space discretization, the sensitivity to targets falling off the grid is rather low. The recovery error comparison between $K = 2$ on-grid targets located at $\theta_1 = -2^\circ$, $\theta_2 = 2^\circ$ and off-grid targets at $\theta_1 = -2.5^\circ$, $\theta_2 = 2.5^\circ$ is shown in Fig. 7 (b). The other simulation settings are the same as in Fig. 7 (a). The performance for off-grid targets is almost identical to that for on-grid targets.

D. Coherence Properties Under Optimized Waveforms

In Fig. 8, we plot the coherence $\mu(\mathbf{V})$ of matrix $\mathbf{W}$ and its bound, defined in (37), versus the number of transmit antennas, for $K = 4$ targets located at $\left[-10^\circ, -5^\circ, 5^\circ, 10^\circ\right]$. The optimized waveforms for different values of $M_t$ are obtained by solving problem (47) via Algorithm 1, focusing on the DOA space $\left[-10^\circ : 1^\circ : 10^\circ\right]$, i.e., $L = 21$. For comparison, the coherence $\mu(\mathbf{V})$ under the G-Orth waveform matrix is also plotted, where the results are averaged over 100 independent implementations, and in each implementation the waveforms are generated randomly. It can be seen that the averaged coherence under


G-Orth waveforms is higher than the coherence under the optimized waveforms over the entire $M_t$ range. On the other hand, under the optimized waveforms, our simulations show that for different numbers of targets, the coherence is always bounded by the bound of (37) and approaches its smallest value (not necessarily in a monotone way) as $M_t$ increases. The simulation results in Fig. 8 confirm the conclusions of Theorem 3, i.e., when the waveforms satisfy the optimal waveform conditions stated in Theorem 2, the matrix coherence $\mu(\mathbf{V})$ is asymptotically optimal w.r.t. $M_t$. We should note, however, that the rather big coherence difference between the optimized and the G-Orth waveforms does not translate into a substantial difference in terms of matrix recovery error. Indeed, the G-Orth waveforms perform very closely to the optimized ones when $M_t$ becomes larger.

VI. CONCLUSIONS

In this paper, we have presented an analysis of the coherence of the data matrix arising in MIMO-MC radars with ULA configurations transmitting orthogonal waveforms. We have shown that the data matrix attains its lowest possible coherence if the waveform snapshots across the transmit array have flat power spectra for all time instances. The waveform design problem has been approached as an optimization problem on the complex Stiefel manifold and has been solved via the modified steepest descent algorithm and the modified Newton algorithm. The numerical results have shown that, as the number of antennas increases, the optimized waveforms result in optimal data matrix coherence, i.e., $1$, and thus only a small portion of samples is needed for the data matrix recovery. For a particular array, the optimal waveforms depend on the target space to be investigated; for different regions of the target space, the corresponding optimal waveforms can be constructed a priori. Since their construction involves high computational complexity, the optimal waveforms can be used as a benchmark against easily constructed waveforms. For example, our simulations revealed that, as the number of transmit antennas increases, simply transmitting G-Orth waveforms results in matrix recovery performance comparable to that of the optimized waveforms. Thus, given the cost of computing the optimized waveforms, certain applications may, under certain conditions, treat G-Orth waveforms as optimal.

APPENDIX A

PROOF OF LEMMA 1

Proof. Assume that the MIMO radar system is configured with a ULA transmit array of size $M_t$ and inter-element spacing $d_t$. There are $K$ targets in the far field at DOAs $\left\{\theta_k \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]\right\}_{k \in \mathbb{N}_K^+}$, corresponding

to spatial frequencies $\left\{\alpha_k^t \in \left[-\frac{1}{2}, \frac{1}{2}\right]\right\}_{k \in \mathbb{N}_K^+}$. Then the transmit steering matrix $\mathbf{A}$ has the Vandermonde form defined in (9). As a result, it holds that $\mathrm{tr}\left(\mathbf{A}\mathbf{A}^H\right) = KM_t$. Suppose that orthogonal waveforms $\mathbf{S} \in \mathbb{C}^{N \times M_t}$ are transmitted. Since $\mathbf{S}^H\mathbf{S} = \mathbf{I}_{M_t}$, it holds that
\[
\sum_{i=1}^{N} s_{m'}^*(i)\,s_m(i) = \begin{cases} 1, & m = m' \\ 0, & m \neq m' \end{cases}, \quad m, m' \in \mathbb{N}_{M_t}^+. \tag{70}
\]
Consequently,
\[
\sum_{i=1}^{N} \left(\mathbf{S}(i)\right)^H\mathbf{S}(i) = \mathbf{I}_{M_t}, \tag{71}
\]
where $\mathbf{S}(i)$ denotes the $i$-th row of $\mathbf{S}$. Following equation (26), it holds that
\[
\begin{aligned}
\sum_{i=1}^{N}\sum_{k=1}^{K} \left|S_i\left(\alpha_k^t\right)\right|^2
&= \sum_{i=1}^{N} \left\|\mathbf{S}^*(i)\mathbf{A}^*\right\|_2^2 \\
&= \sum_{i=1}^{N} \mathrm{tr}\left(\mathbf{A}^T\left(\mathbf{S}^*(i)\right)^H\mathbf{S}^*(i)\mathbf{A}^*\right) \\
&= \sum_{i=1}^{N} \mathrm{tr}\left(\left(\mathbf{S}^*(i)\right)^H\mathbf{S}^*(i)\mathbf{A}^*\mathbf{A}^T\right) \\
&= \sum_{i=1}^{N} \mathrm{tr}\left(\left(\mathbf{S}(i)\right)^H\mathbf{S}(i)\mathbf{A}\mathbf{A}^H\right) \\
&= \mathrm{tr}\left(\left(\sum_{i=1}^{N}\left(\mathbf{S}(i)\right)^H\mathbf{S}(i)\right)\mathbf{A}\mathbf{A}^H\right) \\
&= \mathrm{tr}\left(\mathbf{A}\mathbf{A}^H\right) = KM_t. \tag{72}
\end{aligned}
\]

Thus the statements in Lemma 1 follow.
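The energy identity (72) can be confirmed numerically. The sketch below (Python/NumPy, illustrative sizes) draws an orthonormal $\mathbf{S}$ and a Vandermonde $\mathbf{A}$ and checks that the total snapshot energy over all targets equals $KM_t$.

```python
import numpy as np

rng = np.random.default_rng(5)
N, Mt, K = 32, 6, 3

# Orthonormal waveform matrix S (S^H S = I) and a Vandermonde transmit
# steering matrix A for K targets at random spatial frequencies.
S, _ = np.linalg.qr(rng.standard_normal((N, Mt)) + 1j * rng.standard_normal((N, Mt)))
alpha = rng.uniform(-0.5, 0.5, K)
A = np.exp(1j * 2 * np.pi * np.outer(np.arange(Mt), alpha))   # Mt x K, unit-modulus

# Total snapshot energy over all targets: sum_i sum_k |S_i(alpha_k)|^2
total = np.sum(np.abs(np.conj(S) @ np.conj(A)) ** 2)
print(total, K * Mt)  # equal: the energy identity of Lemma 1
```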

APPENDIX B

DERIVATIVE AND HESSIAN OF $f(\mathbf{S})$

First, we give the following two lemmas, which will be applied in the derivation.

Lemma 2. Let $\mathbf{A} \in \mathbb{C}^{M_t \times L}$, $\mathbf{Z} \in \mathbb{C}^{N \times M_t}$, $\mathbf{Y} \in \mathbb{C}^{L \times N}$, $\mathbf{H} \in \mathbb{R}^{N \times L}$ be arbitrary matrices. It can be shown that
\[
\mathrm{tr}\left\{\mathbf{H}\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \mathbf{Y}\right]\right\} = \mathrm{tr}\left\{\mathbf{Z}^H\left[\left(\mathbf{H} \odot \mathbf{Y}^T\right)\mathbf{A}^H\right]\right\}. \tag{73}
\]

January 29, 2015 DRAFT 26

Proof. See Appendix C.
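A quick numerical check of Lemma 2 with random matrices (Python/NumPy, arbitrary small dimensions):

```python
import numpy as np

rng = np.random.default_rng(6)
Mt, L, N = 4, 5, 6
crand = lambda r, c: rng.standard_normal((r, c)) + 1j * rng.standard_normal((r, c))

A, Z, Y = crand(Mt, L), crand(N, Mt), crand(L, N)
H = rng.standard_normal((N, L))   # H is real-valued in the lemma

lhs = np.trace(H @ ((A.conj().T @ Z.conj().T) * Y))    # tr{H[(A^H Z^H) o Y]}
rhs = np.trace(Z.conj().T @ ((H * Y.T) @ A.conj().T))  # tr{Z^H (H o Y^T) A^H}
print(abs(lhs - rhs))  # numerically zero
```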

Lemma 3. Let $\tilde{\mathbf{H}} \in \mathbb{C}^{N \times L}$ and $\mathbf{G}, \mathbf{M} \in \mathbb{C}^{L \times N}$ be general matrices with arbitrary elements. It holds that
\[
\mathrm{tr}\left(\left(\mathbf{G} \odot \mathbf{M}\right)\tilde{\mathbf{H}}\right) = \left[\mathrm{vec}\left(\mathbf{G}\right)\right]^T\mathrm{vec}\left(\mathbf{M} \odot \tilde{\mathbf{H}}^T\right). \tag{74}
\]

Proof. See Appendix D.
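Lemma 3 admits an equally direct numerical check (Python/NumPy; note that the product on the right-hand side involves a plain transpose, not a conjugate transpose):

```python
import numpy as np

rng = np.random.default_rng(7)
L, N = 4, 6
crand = lambda r, c: rng.standard_normal((r, c)) + 1j * rng.standard_normal((r, c))

G, M, Ht = crand(L, N), crand(L, N), crand(N, L)   # Ht stands in for H-tilde
vec = lambda X: X.reshape(-1, order="F")            # column-stacking vec(.)

lhs = np.trace((G * M) @ Ht)         # tr{(G o M) H-tilde}
rhs = vec(G) @ vec(M * Ht.T)         # [vec(G)]^T vec(M o H-tilde^T), no conjugation
print(abs(lhs - rhs))  # numerically zero
```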

Since $\mathbf{F} \odot \mathbf{F}^*$ and $\mathbf{N}$ are real-valued matrices, the objective function $f(\mathbf{S})$ defined in (46) can be written as
\[
f(\mathbf{S}) = \mathrm{tr}\left\{\left(\mathbf{F} \odot \mathbf{F}^* - \mathbf{N}\right)\left(\mathbf{F} \odot \mathbf{F}^* - \mathbf{N}\right)^T\right\}. \tag{75}
\]

In order to find the derivative and Hessian of $f(\mathbf{S})$, we perform the following expansion:
\[
\begin{aligned}
f(\mathbf{S} + \delta\mathbf{Z}) &= \mathrm{tr}\left\{\left[\left((\mathbf{S} + \delta\mathbf{Z})^*\mathbf{A}^*\right) \odot \left((\mathbf{S} + \delta\mathbf{Z})\mathbf{A}\right) - \mathbf{N}\right]\left[\left((\mathbf{S} + \delta\mathbf{Z})^*\mathbf{A}^*\right) \odot \left((\mathbf{S} + \delta\mathbf{Z})\mathbf{A}\right) - \mathbf{N}\right]^T\right\} \\
&= f(\mathbf{S}) + \delta\,\mathrm{tr}\{\mathbf{T}\} + \delta^2\,\mathrm{tr}\left\{\mathbf{T}'\right\} + O\left(\delta^3\right). \tag{76}
\end{aligned}
\]
Here,
\[
\begin{aligned}
\mathbf{T} = {} & \left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right) - \mathbf{N}\right]\left\{\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T + \left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^H\right\} \\
& + \left\{\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right] + \left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^*\right\}\left\{\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right)\right]^T - \mathbf{N}^T\right\}, \tag{77}
\end{aligned}
\]
\[
\begin{aligned}
\mathbf{T}' = {} & \left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right)\right]\left[\left(\mathbf{Z}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T + \left\{\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right)\right]\left[\left(\mathbf{Z}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right\}^T \\
& + \left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]\left[\left(\mathbf{Z}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right)\right]^T + \left\{\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]\left[\left(\mathbf{Z}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right)\right]^T\right\}^T \\
& + \left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T + \left\{\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right\}^H \\
& - \left[\left(\mathbf{Z}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]\mathbf{N}^T - \left\{\left[\left(\mathbf{Z}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]\mathbf{N}^T\right\}^T. \tag{78}
\end{aligned}
\]
Thus, it holds that
\[
\begin{aligned}
\mathrm{tr}(\mathbf{T}) &= \mathrm{tr}\left(2\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right) - \mathbf{N}\right]\left\{\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T + \left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^H\right\}\right) \\
&= \Re\left\{\mathrm{tr}\left[\mathbf{H}\left(\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \left(\mathbf{A}^T\mathbf{S}^T\right)\right)\right]\right\}, \tag{79}
\end{aligned}
\]
where $\mathbf{H} = 2\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right) - \mathbf{N}\right] \in \mathbb{R}^{N \times L}$. Let $\mathbf{Y} = \mathbf{A}^T\mathbf{S}^T$. Following Lemma 2, it holds that
\[
\mathrm{tr}(\mathbf{T}) = \Re\left\{\mathrm{tr}\left[\mathbf{Z}^H\left(\mathbf{H} \odot \mathbf{Y}^T\right)\mathbf{A}^H\right]\right\}. \tag{80}
\]


In addition, it holds that
\[
\begin{aligned}
\mathrm{tr}\left(\mathbf{T}'\right) = {} & 2\,\mathrm{tr}\left(\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right)\right]\left[\left(\mathbf{Z}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right) + 2\,\mathrm{tr}\left(\left[\left(\mathbf{S}\mathbf{A}\right) \odot \left(\mathbf{Z}^*\mathbf{A}^*\right)\right]\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right) \\
& + 2\,\Re\left\{\mathrm{tr}\left(\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right)\right\} - 2\,\mathrm{tr}\left(\mathbf{N}\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]\right). \tag{81}
\end{aligned}
\]
Now, we focus on the first term on the right side of equation (81). Following Lemma 3, it holds that
\[
\begin{aligned}
\mathrm{tr}\left(\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right)\right]\left[\left(\mathbf{Z}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right)
&= \mathrm{tr}\left(\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]\tilde{\mathbf{H}}\right) \\
&= \left[\mathrm{vec}\left(\mathbf{A}^H\mathbf{Z}^H\right)\right]^T\mathrm{vec}\left(\left(\mathbf{A}^T\mathbf{Z}^T\right) \odot \tilde{\mathbf{H}}^T\right), \tag{82}
\end{aligned}
\]
where $\tilde{\mathbf{H}} = \left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right) \in \mathbb{R}^{N \times L}$. Further, via equations (55) and (109), it holds that
\[
\begin{aligned}
\left[\mathrm{vec}\left(\mathbf{A}^H\mathbf{Z}^H\right)\right]^T &= \left[\left(\mathbf{I}_N \otimes \mathbf{A}^H\right)\mathrm{vec}\left(\mathbf{Z}^H\right)\right]^T \\
&= \left\{\left(\mathbf{I}_N \otimes \mathbf{A}^H\right)\left[\mathrm{vec}\left(\mathbf{Z}^T\right)\right]^*\right\}^T \\
&= \left\{\left(\mathbf{I}_N \otimes \mathbf{A}^H\right)\left[\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z})\right]^*\right\}^T \\
&= \left[\mathrm{vec}(\mathbf{Z})\right]^H\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}^*\right), \tag{83}
\end{aligned}
\]
as well as
\[
\begin{aligned}
\mathrm{vec}\left(\left(\mathbf{A}^T\mathbf{Z}^T\right) \odot \tilde{\mathbf{H}}^T\right)
&= \mathrm{diag}\left(\mathrm{vec}\left(\tilde{\mathbf{H}}^T\right)\right)\mathrm{vec}\left(\mathbf{A}^T\mathbf{Z}^T\right) \\
&= \mathrm{diag}\left(\mathrm{vec}\left(\tilde{\mathbf{H}}^T\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathrm{vec}\left(\mathbf{Z}^T\right) \\
&= \mathrm{diag}\left(\mathbf{P}_{N \times L}\mathrm{vec}\left(\tilde{\mathbf{H}}\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}). \tag{84}
\end{aligned}
\]
Consequently, it holds that
\[
\mathrm{tr}\left(\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]\tilde{\mathbf{H}}\right)
= \left[\mathrm{vec}(\mathbf{Z})\right]^H\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}^*\right)\mathrm{diag}\left(\mathbf{P}_{N \times L}\mathrm{vec}\left(\tilde{\mathbf{H}}\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}). \tag{85}
\]
Let us focus on the second term on the right side of equation (81). It holds that
\[
\begin{aligned}
\mathrm{tr}\left(\left[\left(\mathbf{S}\mathbf{A}\right) \odot \left(\mathbf{Z}^*\mathbf{A}^*\right)\right]\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right)
&= \mathrm{tr}\left(\left[\left(\mathbf{A}^H\mathbf{S}^H\right) \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]^H\left[\left(\mathbf{A}^H\mathbf{S}^H\right) \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]\right) \\
&= \mathrm{tr}\left(\left[\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]^H\left[\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]\right) \\
&= \left[\mathrm{vec}\left(\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right)\right]^H\mathrm{vec}\left(\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right), \tag{86}
\end{aligned}
\]
as well as
\[
\begin{aligned}
\mathrm{vec}\left(\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right)
&= \mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^*\right)\right)\mathrm{vec}\left(\mathbf{A}^T\mathbf{Z}^T\right) \\
&= \mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^*\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}). \tag{87}
\end{aligned}
\]
Consequently, it holds that
\[
\begin{aligned}
\mathrm{tr}\left(\left[\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]^H\left[\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]\right)
&= \left[\mathrm{vec}(\mathbf{Z})\right]^H\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}^*\right)\left[\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^*\right)\right)\right]^H\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^*\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}) \\
&= \left[\mathrm{vec}(\mathbf{Z})\right]^H\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}^*\right)\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y} \odot \mathbf{Y}^*\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}). \tag{88}
\end{aligned}
\]
Next, let us focus on the third term on the right side of equation (81). With equations (109) and (87), it holds that
\[
\begin{aligned}
\mathrm{tr}\left(\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right)
&= \left[\mathrm{vec}\left(\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right)\right]^T\mathrm{vec}\left(\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{Z}\mathbf{A}\right)\right]^T\right) \\
&= \left[\mathrm{vec}\left(\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right)\right]^T\mathrm{vec}\left(\mathbf{Y}^* \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right) \\
&= \left[\mathrm{vec}(\mathbf{Z})\right]^T\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}\right)\left[\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^*\right)\right)\right]^T\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^*\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}) \\
&= \left[\mathrm{vec}(\mathbf{Z})\right]^T\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}\right)\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^* \odot \mathbf{Y}^*\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}). \tag{89}
\end{aligned}
\]
Finally, let us focus on the fourth term on the right side of equation (81). Via equations (102), (83), (84) and Lemma 3, it holds that
\[
\begin{aligned}
\mathrm{tr}\left(\mathbf{N}\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]\right)
&= \mathrm{tr}\left(\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \left(\mathbf{A}^T\mathbf{Z}^T\right)\right]\mathbf{N}\right) \\
&= \left[\mathrm{vec}\left(\mathbf{A}^H\mathbf{Z}^H\right)\right]^T\mathrm{vec}\left(\left(\mathbf{A}^T\mathbf{Z}^T\right) \odot \mathbf{N}^T\right) \\
&= \left[\mathrm{vec}(\mathbf{Z})\right]^H\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}^*\right)\mathrm{diag}\left(\mathbf{P}_{N \times L}\mathrm{vec}(\mathbf{N})\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}). \tag{90}
\end{aligned}
\]
Therefore, it holds that
\[
\begin{aligned}
f(\mathbf{S} + \delta\mathbf{Z}) = {} & f(\mathbf{S}) + \delta\,\Re\left\{\mathrm{tr}\left[\mathbf{Z}^H\left(\mathbf{H} \odot \mathbf{Y}^T\right)\mathbf{A}^H\right]\right\} \\
& + 2\delta^2\left[\mathrm{vec}(\mathbf{Z})\right]^H\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}^*\right)\mathrm{diag}\left(\mathrm{vec}\left(\tilde{\mathbf{H}}^T + \mathbf{Y} \odot \mathbf{Y}^* - \mathbf{N}^T\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z}) \\
& + 2\delta^2\,\Re\left\{\left[\mathrm{vec}(\mathbf{Z})\right]^T\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}\right)\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^* \odot \mathbf{Y}^*\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}\mathrm{vec}(\mathbf{Z})\right\} + O\left(\delta^3\right). \tag{91}
\end{aligned}
\]
By coefficient comparison between (91) and the matrix form of the second-order Taylor series (49), we finally obtain
\[
\begin{aligned}
\mathbf{D}_\mathbf{S} &= \left(\mathbf{H} \odot \mathbf{Y}^T\right)\mathbf{A}^H, \\
\mathbf{H}_\mathbf{S} &= 4\,\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}^*\right)\mathrm{diag}\left(\mathrm{vec}\left(2\tilde{\mathbf{H}}^T - \mathbf{N}^T\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t}, \\
\mathbf{C}_\mathbf{S} &= 4\,\mathbf{P}_{M_t \times N}\left(\mathbf{I}_N \otimes \mathbf{A}\right)\mathrm{diag}\left(\mathrm{vec}\left(\mathbf{Y}^* \odot \mathbf{Y}^*\right)\right)\left(\mathbf{I}_N \otimes \mathbf{A}^T\right)\mathbf{P}_{N \times M_t},
\end{aligned}
\]
where $\mathbf{H} = 2\left[\left(\mathbf{S}^*\mathbf{A}^*\right) \odot \left(\mathbf{S}\mathbf{A}\right) - \mathbf{N}\right]$, $\mathbf{Y} = \mathbf{A}^T\mathbf{S}^T$, and the fact $\tilde{\mathbf{H}}^T = \mathbf{Y} \odot \mathbf{Y}^* = \left(\mathbf{A}^T\mathbf{S}^T\right) \odot \left(\mathbf{A}^H\mathbf{S}^H\right)$ is applied. It is easy to verify that $\mathbf{H}_\mathbf{S} = \mathbf{H}_\mathbf{S}^H$ and $\mathbf{C}_\mathbf{S} = \mathbf{C}_\mathbf{S}^T$.

APPENDIX C

PROOF OF LEMMA 2

Proof. We use $a_{ij}$, $z_{ij}$, $y_{ij}$, $h_{ij}$ to denote the $ij$-th elements of the matrices $\mathbf{A} \in \mathbb{C}^{M_t \times L}$, $\mathbf{Z} \in \mathbb{C}^{N \times M_t}$, $\mathbf{Y} \in \mathbb{C}^{L \times N}$, $\mathbf{H} \in \mathbb{R}^{N \times L}$, respectively. It holds that
\[
\begin{aligned}
\mathrm{tr}\left\{\mathbf{H}\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \mathbf{Y}\right]\right\}
&= \sum_{m=1}^{N}\sum_{n=1}^{L} h_{mn}\left(\sum_{i=1}^{M_t} a_{in}^* z_{mi}^*\right)y_{nm} \\
&= \sum_{i=1}^{M_t}\sum_{m=1}^{N} z_{mi}^*\left(\sum_{n=1}^{L} h_{mn}y_{nm}a_{in}^*\right). \tag{92}
\end{aligned}
\]
Let $\mathbf{D}_\mathbf{S} \in \mathbb{C}^{N \times M_t}$ be such that $\mathrm{tr}\left\{\mathbf{H}\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \mathbf{Y}\right]\right\} = \mathrm{tr}\left\{\mathbf{Z}^H\mathbf{D}_\mathbf{S}\right\}$. We use $d_{ij}$ to denote the $ij$-th element of $\mathbf{D}_\mathbf{S}$, so that
\[
\mathrm{tr}\left\{\mathbf{Z}^H\mathbf{D}_\mathbf{S}\right\} = \sum_{i=1}^{M_t}\sum_{m=1}^{N} z_{mi}^* d_{mi}. \tag{93}
\]
By comparing equations (92) and (93), we have
\[
d_{mi} = \sum_{n=1}^{L} h_{mn}y_{nm}a_{in}^*. \tag{94}
\]
As a result, the matrix $\mathbf{D}_\mathbf{S}$ has the form
\[
\mathbf{D}_\mathbf{S} = \left(\mathbf{H} \odot \mathbf{Y}^T\right)\mathbf{A}^H. \tag{95}
\]
Consequently, it holds that $\mathrm{tr}\left\{\mathbf{H}\left[\left(\mathbf{A}^H\mathbf{Z}^H\right) \odot \mathbf{Y}\right]\right\} = \mathrm{tr}\left\{\mathbf{Z}^H\left[\left(\mathbf{H} \odot \mathbf{Y}^T\right)\mathbf{A}^H\right]\right\}$, which completes the proof.


APPENDIX D

PROOF OF LEMMA 3

Proof. We use $g_{ij}$, $m_{ij}$, $\tilde{h}_{ij}$ to denote the $ij$-th elements of the matrices $\mathbf{G}, \mathbf{M} \in \mathbb{C}^{L \times N}$ and $\tilde{\mathbf{H}} \in \mathbb{C}^{N \times L}$, respectively. Then, it holds that
\[
\mathrm{tr}\left\{\left(\mathbf{G} \odot \mathbf{M}\right)\tilde{\mathbf{H}}\right\} = \sum_{j=1}^{L}\sum_{i=1}^{N} g_{ji}m_{ji}\tilde{h}_{ij}. \tag{96}
\]
On the other hand, it holds that
\[
\mathrm{vec}\left(\mathbf{M} \odot \tilde{\mathbf{H}}^T\right) = \left[m_{11}\tilde{h}_{11}, \cdots, m_{L1}\tilde{h}_{1L}, \cdots, m_{1N}\tilde{h}_{N1}, \cdots, m_{LN}\tilde{h}_{NL}\right]^T, \tag{97}
\]
\[
\mathrm{vec}\left(\mathbf{G}\right) = \left[g_{11}, \cdots, g_{L1}, \cdots, g_{1N}, \cdots, g_{LN}\right]^T. \tag{98}
\]
Consequently, it holds that
\[
\left[\mathrm{vec}(\mathbf{G})\right]^T\mathrm{vec}\left(\mathbf{M} \odot \tilde{\mathbf{H}}^T\right) = \sum_{i=1}^{N}\sum_{j=1}^{L} g_{ji}m_{ji}\tilde{h}_{ij}. \tag{99}
\]
By comparing equations (96) and (99), we have
\[
\mathrm{tr}\left\{\left(\mathbf{G} \odot \mathbf{M}\right)\tilde{\mathbf{H}}\right\} = \left[\mathrm{vec}(\mathbf{G})\right]^T\mathrm{vec}\left(\mathbf{M} \odot \tilde{\mathbf{H}}^T\right), \tag{100}
\]
which completes the proof.

APPENDIX E

USEFUL EQUATIONS

Here, we list some useful equations for deriving the derivative and Hessian of a matrix-valued cost function, with $\mathbf{Z} \in \mathbb{C}^{N \times M_t}$. They are
\[
\|\mathbf{A}\|_F^2 = \mathrm{tr}\left(\mathbf{A}\mathbf{A}^H\right), \tag{101}
\]
\[
\mathrm{tr}\left(\mathbf{Z}\mathbf{H}\right) = \mathrm{tr}\left(\mathbf{H}\mathbf{Z}\right), \tag{102}
\]
\[
\mathrm{tr}\left(\mathbf{A}^T\right) = \mathrm{tr}\left(\mathbf{A}\right), \tag{103}
\]
\[
\mathrm{tr}\left(\mathbf{A}^H\right) = \mathrm{tr}\left(\mathbf{A}^*\right) = \left(\mathrm{tr}\left(\mathbf{A}\right)\right)^*, \tag{104}
\]
\[
\left(\mathbf{H} \otimes \mathbf{Z}\right)^T = \mathbf{H}^T \otimes \mathbf{Z}^T, \tag{105}
\]
\[
\left(\mathbf{H} \odot \mathbf{Z}\right)^T = \mathbf{H}^T \odot \mathbf{Z}^T, \tag{106}
\]
\[
\mathrm{tr}\left(\mathbf{H}\mathbf{Z}\right) = \left[\mathrm{vec}\left(\mathbf{H}^T\right)\right]^T\mathrm{vec}\left(\mathbf{Z}\right), \tag{107}
\]
\[
\mathrm{tr}\left(\mathbf{H}^H\mathbf{Z}\right) = \left[\mathrm{vec}\left(\mathbf{H}\right)\right]^H\mathrm{vec}\left(\mathbf{Z}\right), \tag{108}
\]
\[
\mathrm{vec}\left(\mathbf{H}\mathbf{Z}\mathbf{G}\right) = \left(\mathbf{G}^T \otimes \mathbf{H}\right)\mathrm{vec}\left(\mathbf{Z}\right). \tag{109}
\]
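These identities are standard and easy to verify numerically; the sketch below (Python/NumPy) checks three of them with random complex matrices, using column-major stacking for $\mathrm{vec}(\cdot)$, which is the convention under which (109) holds.

```python
import numpy as np

rng = np.random.default_rng(8)
crand = lambda r, c: rng.standard_normal((r, c)) + 1j * rng.standard_normal((r, c))
vec = lambda X: X.reshape(-1, order="F")   # column-stacking vec(.)

H, Z, G = crand(3, 4), crand(4, 5), crand(5, 3)
# Eq. (109): vec(HZG) = (G^T kron H) vec(Z)
assert np.allclose(vec(H @ Z @ G), np.kron(G.T, H) @ vec(Z))

H2, Z2 = crand(4, 3), crand(3, 4)
# Eq. (107): tr(HZ) = [vec(H^T)]^T vec(Z)
assert np.isclose(np.trace(H2 @ Z2), vec(H2.T) @ vec(Z2))

H3, Z3 = crand(4, 3), crand(4, 3)
# Eq. (108): tr(H^H Z) = [vec(H)]^H vec(Z)
assert np.isclose(np.trace(H3.conj().T @ Z3), vec(H3).conj() @ vec(Z3))
```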

REFERENCES

[1] S. Sun and A. P. Petropulu, "On the applicability of matrix completion on MIMO radars," in 48th Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, Nov. 2-5, 2014.
[2] S. Sun and A. P. Petropulu, "On waveform design for MIMO radar with matrix completion," in Proc. of IEEE Global Conference on Signal and Information Processing (GlobalSIP), Atlanta, GA, USA, Dec. 3-5, 2014.
[3] J. Li and P. Stoica, "MIMO radar with colocated antennas," IEEE Signal Process. Mag., vol. 24, no. 5, pp. 106-114, 2007.
[4] J. Li, P. Stoica, L. Xu, and W. Roberts, "On parameter identifiability of MIMO radar," IEEE Signal Process. Lett., vol. 14, no. 12, pp. 968-971, 2007.
[5] P. Stoica, J. Li, and Y. Xie, "On probing signal design for MIMO radar," IEEE Trans. Signal Process., vol. 55, no. 8, pp. 4151-4161, 2007.
[6] N. Levanon and E. Mozeson, Radar Signals. Hoboken, NJ: Wiley, 2004.
[7] S. Sun, A. P. Petropulu, and W. U. Bajwa, "Target estimation in colocated MIMO radar via matrix completion," in Proc. of IEEE 38th Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada, May 2013.
[8] S. Sun, W. U. Bajwa, and A. P. Petropulu, "MIMO-MC radar: A MIMO radar approach based on matrix completion," IEEE Trans. Aerosp. Electron. Syst., accepted for publication, 2014 (http://arxiv.org/abs/1409.3954).
[9] C. Y. Chen and P. P. Vaidyanathan, "Compressed sensing in MIMO radar," in 42nd Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, Oct. 26-29, 2008.
[10] T. Strohmer and B. Friedlander, "Compressed sensing for MIMO radar - algorithms and performance," in 43rd Annual Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, Nov. 1-4, 2009.
[11] Y. Yu, A. P. Petropulu, and H. V. Poor, "MIMO radar using compressive sampling," IEEE J. Sel. Topics Signal Process., vol. 4, no. 1, pp. 146-163, 2010.
[12] E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, vol. 9, no. 6, pp. 717-772, 2009.
[13] E. J. Candès and T. Tao, "The power of convex relaxation: Near-optimal matrix completion," IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2053-2080, 2010.
[14] E. J. Candès and Y. Plan, "Matrix completion with noise," Proc. IEEE, vol. 98, no. 6, pp. 925-936, 2010.
[15] D. S. Kalogerias and A. P. Petropulu, "Matrix completion in colocated MIMO radar: Recoverability, bounds & theoretical guarantees," IEEE Trans. Signal Process., vol. 62, no. 2, pp. 309-321, 2014.
[16] D. S. Kalogerias, S. Sun, and A. P. Petropulu, "Sparse sensing in colocated MIMO radar: A matrix completion approach," in Proc. of IEEE 13th International Symposium on Signal Processing and Information Technology (ISSPIT), Athens, Greece, Dec. 12-15, 2013.
[17] Y. Yang and R. S. Blum, "MIMO radar waveform design based on mutual information and minimum mean-square error estimation," IEEE Trans. Aerosp. Electron. Syst., vol. 43, no. 1, pp. 330-343, 2007.
[18] Y. Yang and R. S. Blum, "Minimax robust MIMO radar waveform design," IEEE J. Sel. Topics Signal Process., vol. 1, no. 1, pp. 147-155, 2007.
[19] H. He, P. Stoica, and J. Li, "Designing unimodular sequence sets with good correlations - including an application to MIMO radar," IEEE Trans. Signal Process., vol. 57, no. 11, pp. 4391-4405, 2009.
[20] K. W. Forsythe and D. W. Bliss, "MIMO radar waveform constraints for GMTI," IEEE J. Sel. Topics Signal Process., vol. 4, no. 1, pp. 21-32, 2010.
[21] H. Deng, "Polyphase code design for orthogonal netted radar systems," IEEE Trans. Signal Process., vol. 52, no. 10, pp. 3126-3135, 2004.
[22] C. Y. Chen and P. P. Vaidyanathan, "MIMO radar ambiguity properties and optimization using frequency-hopping waveforms," IEEE Trans. Signal Process., vol. 56, no. 12, pp. 5926-5936, 2008.
[23] G. S. Antonio, D. R. Fuhrmann, and F. C. Robey, "MIMO radar ambiguity functions," IEEE J. Sel. Topics Signal Process., vol. 1, no. 1, pp. 167-177, 2007.
[24] S. Gogineni and A. Nehorai, "Frequency-hopping code design for MIMO radar estimation using sparse modeling," IEEE Trans. Signal Process., vol. 60, no. 6, pp. 3022-3035, 2012.
[25] Y. Yu, S. Sun, R. N. Madan, and A. P. Petropulu, "Power allocation and waveform design for the compressive sensing based MIMO radar," IEEE Trans. Aerosp. Electron. Syst., vol. 50, no. 2, pp. 898-909, 2014.
[26] A. Edelman, T. A. Arias, and S. T. Smith, "The geometry of algorithms with orthogonality constraints," SIAM J. Matrix Anal. Appl., vol. 20, no. 2, pp. 303-353, 1998.
[27] J. H. Manton, "Optimization algorithms exploiting unitary constraints," IEEE Trans. Signal Process., vol. 50, no. 3, pp. 635-650, 2002.
[28] D. Nion and N. D. Sidiropoulos, "Tensor algebra and multi-dimensional harmonic retrieval in signal processing for MIMO radar," IEEE Trans. Signal Process., vol. 58, no. 11, pp. 5693-5705, 2010.
[29] J. F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956-1982, 2010.
[30] P. A. Absil, R. Mahony, and R. Sepulchre, Optimization Algorithms on Matrix Manifolds. Princeton, NJ: Princeton University Press, 2008.
[31] H. Zhang and W. W. Hager, "A nonmonotone line search technique and its application to unconstrained optimization," SIAM Journal on Optimization, vol. 14, no. 4, pp. 1043-1056, 2004.
[32] A. Antoniou and W.-S. Lu, Practical Optimization: Algorithms and Engineering Applications. New York: Springer, 2007.
[33] M. Joho and K. Rahbar, "Joint diagonalization of correlation matrices by using Newton methods with application to blind signal separation," in Proc. of IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM), Rosslyn, VA, June 2002.
[34] J. R. Magnus and H. Neudecker, "Symmetry, 0-1 matrices and Jacobians: A review," Econometric Theory, vol. 2, no. 2, pp. 157-190, 1986.
[35] D. P. Bertsekas, Nonlinear Programming, 2nd ed. Athena Scientific, 1999.
[36] M. Joho, "Newton method for joint approximate diagonalization of positive definite Hermitian matrices," SIAM J. Matrix Anal. Appl., vol. 30, no. 3, pp. 1205-1218, 2008.

January 29, 2015 DRAFT 33

Figure 1. The value of the kernel β_{M_h}(x) as a function of x, for x ≥ 0, with M_h = 10.


Figure 2. (a) Snapshot power spectra |S_i(α_k^t)|^2, for α_k^t ∈ [−1/2, 1/2], with M = 40 and N = 64, corresponding to Hadamard waveforms; (b) Magnitude of |S_i(α_1^t)|^2 + |S_i(α_2^t)|^2 for Hadamard waveforms; (c) Snapshot power spectra corresponding to G-Orth waveforms; (d) Magnitude of |S_i(α_1^t)|^2 + |S_i(α_2^t)|^2 for G-Orth waveforms.


Figure 3. Comparison of MC error corresponding to Hadamard and G-Orth waveforms, as a function of the fraction p of data matrix entries used (y-axis: relative recovery error; the reciprocal of the SNR is shown for reference).
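Recovery errors like those in Figure 3 come from completing a partially observed data matrix. Below is a minimal sketch of the singular value thresholding (SVT) algorithm of [29], applied to a generic low-rank matrix rather than the paper's radar data matrix; the matrix size, rank, sampling fraction, and the parameter choices τ = 5n and δ = 1.2/p are illustrative.

```python
import numpy as np

def svt_complete(M_obs, mask, tau, delta, iters=300):
    """Singular value thresholding (Cai, Candes and Shen, 2010):
    iteratively soft-threshold the singular values, feeding back the
    residual on the observed entries only."""
    Y = np.zeros_like(M_obs)
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt     # shrinkage step
        Y += delta * mask * (M_obs - X)             # gradient step on observed set
    return X

rng = np.random.default_rng(0)
n, r, p = 50, 2, 0.5
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r ground truth
mask = rng.random((n, n)) < p                                  # observe a fraction p
X = svt_complete(mask * M, mask, tau=5 * n, delta=1.2 / p)
err = np.linalg.norm(X - M) / np.linalg.norm(M)
print(f"relative recovery error: {err:.3e}")
```

As in the figure, the relative recovery error ||X − M||_F / ||M||_F is the quantity of interest; it decreases as the sampling fraction p grows.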

Figure 4. f versus iterations: (a) DOA space [−10° : 1° : 10°], with curves for η = 0, 0.5, 1; (b) DOA space [−90° : Δθ : 90°], Δθ = 1°, 5°, with curves for (N = 64, M_t = 40, Δθ = 1°), (N = 64, M_t = 40, Δθ = 5°), and (N = 512, M_t = 500, Δθ = 1°).


Figure 5. f versus iteration number under the modified steepest descent algorithm and the modified Newton algorithm. The waveform design parameters are M_t = 20, N = 32, and DOA space [−5° : 1° : 5°].
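For reference, a bare-bones steepest descent iteration on the Stiefel manifold looks as follows. This is a real-valued sketch with the generic objective f(S) = −tr(S^T C S), whose minimizers are the dominant eigenvectors of C; it is not the paper's coherence-based objective, and the nonmonotone line search of [31] is replaced by a fixed step size.

```python
import numpy as np

def stiefel_descent(C, Mt, eta=0.05, iters=1000, seed=0):
    """Fixed-step steepest descent on the Stiefel manifold St(N, Mt) for the
    illustrative objective f(S) = -tr(S^T C S); a QR factorization serves as
    the retraction back onto the manifold."""
    rng = np.random.default_rng(seed)
    S, _ = np.linalg.qr(rng.standard_normal((C.shape[0], Mt)))  # feasible start
    for _ in range(iters):
        G = -2.0 * C @ S                                # Euclidean gradient of f
        rgrad = G - S @ (0.5 * (S.T @ G + G.T @ S))     # project onto tangent space
        S, _ = np.linalg.qr(S - eta * rgrad)            # descent step + retraction
    return S

# The minimum of f over St(8, 2) is -(5 + 4) = -9, attained at the span of the
# top-2 eigenvectors of C.
C = np.diag([5.0, 4.0, 3.0, 2.0, 1.0, 0.5, 0.2, 0.1])
S = stiefel_descent(C, Mt=2)
print(np.trace(S.T @ C @ S))
```

The projection of the Euclidean gradient onto the tangent space followed by a retraction is the same structural recipe as in [26], [27]; only the objective and step-size rule differ from the paper's algorithms.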


Figure 6. The power spectra |S_i(α_k^t)|^2 for α_k^t ∈ [−0.0868, 0.0868], with M_t = 40, N = 64, and DOA space [−10° : 1° : 10°]. (a) Hadamard waveforms; (b) Optimized waveforms using Hadamard waveforms as initialization; (c) G-Orth waveforms; (d) Optimized waveforms using G-Orth waveforms as initialization.


Figure 7. MC error as a function of p for M_t = 40, N = 64. (a) Comparison with Hadamard and G-Orth waveforms (curves: reciprocal of SNR, Hadamard, G-Orth, Opt-Wave1, Opt-Wave2); (b) Off-grid cases (curves: reciprocal of SNR, Opt-Wave: on-grid, Opt-Wave: off-grid).

Figure 8. Coherence µ(V) and its bound under optimized as well as G-Orth waveforms, for K = 4 targets located at [−10°, −5°, 5°, 10°].
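The coherence µ(V) plotted in Figure 8 is the standard matrix completion quantity: for an n × r matrix V with orthonormal columns, µ(V) = (n/r) max_i ||e_i^T V||^2, which lies between 1 and n/r. A small sketch, using a hypothetical uniform linear array with half-wavelength spacing and the K = 4 DOAs from the caption:

```python
import numpy as np

def coherence(V):
    """mu(V) = (n / r) * max_i ||e_i^T V||^2 for an n x r matrix V
    with orthonormal columns; satisfies 1 <= mu(V) <= n / r."""
    n, r = V.shape
    return (n / r) * np.max(np.sum(np.abs(V) ** 2, axis=1))

# Hypothetical setup: Mr-element ULA with half-wavelength spacing, so the
# steering vector is a(theta)_m = exp(j * pi * m * sin(theta)).
Mr = 64
thetas = np.deg2rad([-10.0, -5.0, 5.0, 10.0])
A = np.exp(1j * np.pi * np.outer(np.arange(Mr), np.sin(thetas)))
V, _ = np.linalg.qr(A)   # orthonormal basis of the span of the steering vectors
mu = coherence(V)
print(mu)                 # between 1 (maximally incoherent) and Mr / 4
```

Low coherence is what makes the data matrix completable from few entries, which is why the figure tracks µ(V) as the array sizes grow.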
