Introduction to Array Processing

Olivier Besson

Contents

1 Introduction Multichannel processing Array processing

2 Array processing model

3 Beamforming

4 Direction of arrival estimation

Context of multichannel processing

Multichannel processing involves measurement vectors y(k):
• N samples collected (possibly at the same time) on N different sensors, e.g., an EM wave received on N antennas:

[Figure: array of N antennas in the (x, y, z) frame receiving a wave from direction (θ, φ)]

  y(k) = [y_1(k), y_2(k), ..., y_N(k)]^T

• N samples taken on a single sensor over a time frame of N·Tr. In radar, N is the number of pulses sent every Tr seconds during a coherent processing interval:

[Figure: pulse train at times 0, Tr, 2Tr, 3Tr, ... with samples y_1(1), y_2(1), y_3(1), y_4(1)]

Analysis/processing of multichannel measurements

• Analysis of such vectors should be understood as figuring out the relation between y_n(.) and y_m(.):
  • how correlated are the signals y_n(.) and y_m(.)?
  • can y(.) be explained by a small number of variables (principal component analysis)?
• Processing these vectors includes for instance linear filtering, i.e.,

  y_F(k) = Σ_{n=1}^{N} w_n^* y_n(k)

which can be used to improve reception of a signal of interest.

A multichannel detection/estimation problem

• A classical multichannel problem is to detect/estimate a known signal a that would be present in y in addition to some noise n, i.e., decide between the two hypotheses:

  H_0 : y = n
  H_1 : y = α a + n

and, if H_1 is declared in force, estimate α.
• A simple way is to design a suitable linear filter w aimed at retrieving a, and to test the energy |w^H y|² at the output:

  |w^H y|² = |Σ_{n=1}^{N} w_n^* y_n|²   ≷_{H_0}^{H_1}   η
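As a minimal illustration (not from the original slides), the NumPy sketch below implements this energy test with the matched filter w = a/(a^H a); the signal vector, noise model and threshold η are arbitrary assumptions made only for the example.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 8
    a = np.exp(1j * np.pi * 0.3 * np.arange(N))   # known signal vector (illustrative choice)
    w = a / (a.conj() @ a)                        # matched filter, w^H a = 1
    eta = 0.5                                     # illustrative threshold

    def detect(y, w, eta):
        """Decide H1 if the output energy |w^H y|^2 exceeds the threshold."""
        return np.abs(w.conj() @ y) ** 2 > eta

    # H0: noise only;  H1: alpha*a + noise
    n = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    print(detect(n, w, eta), detect(0.8 * a + n, w, eta))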

Context of array processing

• In this course, we focus on signals received on an array of N antennas placed at different locations, with a view to enhancing reception of signals coming from preferred directions.

[Figure: array geometry in the (x, y, z) frame, with a wave impinging from direction (θ, φ)]

• y_n(k) represents the signal received at antenna number n at time index k.
• n ∈ [1, N] is a spatial index, as it is related to a particular position of an antenna in space, and one is interested in spatial processing of y_n(.), e.g. Σ_{n=1}^{N} w_n^* y_n(k).

Array processing with two antennas

• Consider two antennas receiving a signal s(t) emitted by a source in the far-field:

[Figure: plane wave s(t) impinging on two antennas]

  y_1(t) ≃ A s(t − t_0)
  y_2(t) ≃ A s(t − t_0 − ∆t)

• The time delay ∆t depends on the direction of arrival θ of s(t) and on the relative (known) positions of the antennas:
  • if θ is known, one can obtain s(t): spatial filtering (beamforming);
  • if one can estimate ∆t from y_1(t) and y_2(t), then θ follows: source localization.

Beamforming with two antennas

• For narrowband signals, a delay amounts to a phase shift. Hence

  y(k) = [y_1(k); y_2(k)] = A s(k) [1; e^{iφ}] + [n_1(k); n_2(k)]

• Let us use a linear filter to estimate A s(k). The output is

  w_1^* y_1(k) + w_2^* y_2(k) = A s(k) [w_1^* + w_2^* e^{iφ}] + [w_1^* n_1(k) + w_2^* n_2(k)]

• A good filter maximizes the output signal to noise ratio (SNR) which, if n_1(k) and n_2(k) are uncorrelated and have the same power, is given by

  SNR = |w_1^* + w_2^* e^{iφ}|² |A|² P_s / [(|w_1|² + |w_2|²) P_n]

→ maximum for w_2 = w_1 e^{iφ}, so that w^H y(k) ∝ y_1(k) + y_2(k) e^{−iφ}.

Array of sensors

Potentialities
Arrays of sensors offer an additional dimension (space) which enables one, possibly in conjunction with temporal or frequency filtering, to perform spatial filtering of signals:
1 source separation
2 direction finding

Fields of application
1 radar, sonar (detection, target localization, anti-jamming)
2 communications (system capacity improvement, enhanced signal reception, spatial focusing of transmissions, interference mitigation)

Contents

1 Introduction

2 Array processing model Principle Multichannel receiver Signals received on the array Covariance matrix Model limitations

3 Beamforming

4 Direction of arrival estimation

Arrays and waveforms

[Figure: array geometry in the (x, y, z) frame, wavefront impinging from direction (θ, φ)]

• The array performs spatial sampling of a wavefront impinging from direction (θ, φ).
• Assumptions: homogeneous propagation medium, source in the far-field of the array → plane wavefront.

Multichannel receiver

[Block diagram: each channel n passes through an HF filter, is demodulated by cos(ω_c t) and −sin(ω_c t), lowpass filtered (Re[ỹ_n(t)] → Re[y_n(t)], Im[ỹ_n(t)] → Im[y_n(t)]), then sampled by ADCs to form the snapshots y(kT_s); thermal noise enters at the front end]

Source signal (frequency domain)

[Figure: spectrum X̆(ω) of the real bandpass signal, with components H̆(ω)S(ω) centered at ±ω_c]

Signals and receiver

Source signal (narrowband)

  x̆(t) = 2 Re{s(t) e^{iω_c t}} ≜ 2 Re{α(t) e^{iφ(t)} e^{iω_c t}} = 2 α(t) cos[ω_c t + φ(t)]

α(t) and φ(t) stand for amplitude and phase of s(t), and have slow time-variations relative to fc.

Channel response

Receive channel number n has impulse response h˘n(t).

Model of received signals

• Signal received on the n-th antenna:

  y̆_n(t) = α h̆_n(t) ∗ x̆(t − τ_n) + n̆_n(t)

where τ_n is the propagation delay to the n-th sensor.
• In the frequency domain:

  Y̆_n(ω) = α H̆_n(ω) X̆(ω) e^{−iωτ_n} + N̆_n(ω)

• After demodulation (ω → ω + ω_c) and lowpass filtering:

  Y_n(ω) = α H̆_n(ω + ω_c) S(ω) e^{−i(ω+ω_c)τ_n} + N̆_n(ω + ω_c)
         ≃ α H̆_n(ω_c) S(ω) e^{−iω_c τ_n} + N̆_n(ω + ω_c)

Model of received signals

• Taking the inverse Fourier transform F^{−1}(Y_n(ω)) yields

  y_n(t) ≃ α H̆_n(ω_c) s(t) e^{−iω_c τ_n} + n_n(t)

• The snapshot writes

  y(t) = [y_1(t); y_2(t); ...; y_N(t)] = α [H̆_1(ω_c) e^{−iω_c τ_1}; H̆_2(ω_c) e^{−iω_c τ_2}; ...; H̆_N(ω_c) e^{−iω_c τ_N}] s(t) + [n_1(t); n_2(t); ...; n_N(t)]

• Assuming all H̆_n(ω_c) are identical and absorbing α and H̆_n(ω_c) in s(t), we simply write

y(t) = a(θ)s(t) + n(t)

Model of received signals

• The snapshot is then sampled (temporally) at rate T_s to obtain the N × K data matrix Y = [y(1) y(2) ... y(K)]:

[Table: rows indexed by space n = 1, ..., N, columns indexed by time T_s, 2T_s, ..., KT_s; entry (n, k) is y_n(kT_s), and column k is the snapshot y(k)]

• The k-th snapshot is given by

  y(k) = a(θ) s(k) + n(k)

where a(θ) is the vector of phase shifts, referred to as the steering vector since τ_n depends only on the direction(s) of arrival of the source.

Model of received signals

Snapshot at time index k
The snapshot received in the presence of P sources is given by

  y(k) = Σ_{p=1}^{P} a(θ_p) s_p(k) + n(k)
       = [a(θ_1) ... a(θ_P)] [s_1(k); ...; s_P(k)] + n(k)
       = A(θ) s(k) + n(k)

where A(θ) is N × P and s(k) is P × 1.

Steering vector

[Figure: sensor at position (x_n, y_n, z_n) in the (x, y, z) frame, wave from direction (θ, φ)]

  τ_n = (1/c) [x_n cos θ cos φ + y_n cos θ sin φ + z_n sin θ]

  a_n(θ, φ) = e^{i (2π/λ) [x_n cos θ cos φ + y_n cos θ sin φ + z_n sin θ]}

Uniform linear array (ULA)

Steering vector

[Figure: ULA with N sensors spaced by d; a wave from direction θ travels an extra distance d sin θ between consecutive sensors]

  a(θ) = [1, e^{i2πf_s}, ..., e^{i2π(N−1)f_s}]^T ;   f_s = (d/c) f_c sin θ = (d/λ) sin θ

Spatial sampling requirement
For the phase shift ∆φ = 2π (d/λ) sin θ to be within [−π, π] for every θ ∈ [−π/2, π/2], one needs to have

  d ≤ λ/2
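A small NumPy sketch of the ULA model above (not part of the original slides): ula_steering builds a(θ) with a_n = e^{i2πn(d/λ)sinθ}, and simulate_snapshots forms Y = A(θ)S + N; the array size, angles, powers and spacing below are arbitrary illustrative choices.

    import numpy as np

    def ula_steering(theta_deg, N, d_over_lambda=0.5):
        """ULA steering vector a(theta), entries exp(i*2*pi*n*(d/lambda)*sin(theta))."""
        n = np.arange(N)
        return np.exp(1j * 2 * np.pi * n * d_over_lambda * np.sin(np.deg2rad(theta_deg)))

    def simulate_snapshots(thetas_deg, powers, N, K, noise_power=1.0, seed=0):
        """Return the N x K data matrix Y = A(theta) S + N for uncorrelated Gaussian sources."""
        rng = np.random.default_rng(seed)
        A = np.column_stack([ula_steering(t, N) for t in thetas_deg])          # N x P
        S = (rng.standard_normal((len(thetas_deg), K)) +
             1j * rng.standard_normal((len(thetas_deg), K))) * np.sqrt(np.asarray(powers)[:, None] / 2)
        Noise = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) * np.sqrt(noise_power / 2)
        return A @ S + Noise

    Y = simulate_snapshots([-30, 10, 20], [10, 10, 10], N=10, K=200)
    print(Y.shape)   # (10, 200)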

Covariance matrix

Definition
The covariance matrix is defined as

  R = E{y(k) y^H(k)} = E{ [y_1(k); y_2(k); ...; y_N(k)] [y_1^*(k) y_2^*(k) ... y_N^*(k)] }

Interpretation

The (n, ℓ) entry R(n, ℓ) = E{y_n(k) y_ℓ^*(k)} measures the correlation between the signals received at sensors n and ℓ, at the same time index k.

Structure of the covariance matrix

Signal covariance matrix
The covariance matrix of the signal component is

  E{A(θ) s(k) s^H(k) A^H(θ)} = A(θ) R_s A^H(θ)      (R_s = E{s(k) s^H(k)})
                             = Σ_{p=1}^{P} P_p a(θ_p) a^H(θ_p)      (uncorrelated signals)

Provided that R_s is full-rank (non-coherent signals), the signal covariance matrix has rank P and its range space is spanned by the steering vectors a(θ_p), p = 1, ..., P.

Noise covariance matrix
Assuming spatially white noise (i.e., uncorrelated between channels) with the same power on each channel, E{n(k) n^H(k)} = σ²I.

Model limitations

y(k) = a(θ)s(k) + n(k) is an idealized model of the signals received on the array. It does not account for:
• a possibly non-homogeneous propagation medium, which results in coherence loss and wavefront distortions. This leads to amplitude and phase variations along the array, i.e. y_n(k) = g_n(k) e^{iφ_n(k)} a_n(θ) s(k) + n_n(k).
• uncalibrated arrays, i.e., different amplitude and phase responses for each channel.
• wideband signals for which a time delay does not amount to a simple phase shift. In the frequency domain, one has y(f) = a_f(θ) s(f) + n(f) with a_f(θ) = [1 e^{−i2πfτ(θ)} ··· e^{−i2πf(N−1)τ(θ)}]^T.
• possibly colored reception noise, i.e. E{n(k) n^H(k)} ≠ σ²I.

Contents

1 Introduction

2 Array processing model

3 Beamforming Principle Array beampattern Spatial filtering Adaptive beamforming Robust adaptive beamforming Partially adaptive beamforming Summary

4 Direction of arrival estimation

Spatial filtering
Principle: use a linear combination of the sensor outputs in order to point towards a given look direction.

[Figure: signal s(t) and interference i(t) impinging on the array; each output y_n(k) is weighted by w_n^* and summed]

  y_F(k) = Σ_{n=1}^{N} w_n^* y_n(k) ≃ a s(k)

Array beampattern

• For any weight vector w, the corresponding array beampattern (gain at direction θ) is defined through the response to an input a(θ)s(k): the output is w^H a(θ) s(k), so that

  g_w(θ) = w^H a(θ) ,   G_w(θ) = |g_w(θ)|² = |w^H a(θ)|²

• For a uniform linear array, the natural beampattern, obtained as a simple sum (w_n = 1) of the sensor outputs, is given by

  g(θ) = Σ_{n=0}^{N−1} e^{i2πn(d/λ) sin θ} = e^{iπ(N−1)(d/λ) sin θ} · sin(πN(d/λ) sin θ) / sin(π(d/λ) sin θ)

  G(θ) = |g(θ)|² = [ sin(πN(d/λ) sin θ) / sin(π(d/λ) sin θ) ]²

ULA beampattern

[Figure: beampattern of the uniform linear array (dB versus angle of arrival) for N = 6, d = 0.5λ; N = 6, d = 2λ; N = 12, d = 0.5λ]

  θ_3dB ≃ 0.9 λ / (N d)
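A sketch (not from the slides) that evaluates G(θ) = |w^H a(θ)|² on an angle grid for a ULA; N, the spacing and the grid are arbitrary choices, and it reproduces the kind of curves shown in the figure above.

    import numpy as np

    def beampattern_db(w, d_over_lambda, thetas_deg):
        """Return 10*log10 |w^H a(theta)|^2 over a grid of angles, for a ULA."""
        N = len(w)
        n = np.arange(N)[:, None]
        theta = np.deg2rad(np.asarray(thetas_deg))[None, :]
        A = np.exp(1j * 2 * np.pi * n * d_over_lambda * np.sin(theta))   # N x n_angles
        return 10 * np.log10(np.abs(w.conj() @ A) ** 2)

    grid = np.linspace(-90, 90, 721)
    G = beampattern_db(np.ones(12), 0.5, grid)      # natural beampattern, N = 12, d = lambda/2
    print(grid[np.argmax(G)], G.max())              # main lobe at broadside, peak = 10*log10(N^2)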

Windowing

[Figure: ULA beampattern with windowing (dB versus angle of arrival): rectangular window versus Chebyshev windows with 30 dB and 50 dB sidelobe levels]

Beamforming

Objective
Focus on a given direction in order to enhance reception of the signals impinging from this direction, and possibly mitigate interference located at other directions.

Principle

Each sensor output is weighted by w_n^* before summation:

  y_F(k) = Σ_{n=1}^{N} w_n^* y_n(k) = [w_1^* w_2^* ··· w_N^*] y(k) = w^H y(k).

Question
How to choose w such that, if y(k) = a(θ_s) s(k) + ···, then at the output y_F(k) ≃ α s(k)?

Conventional beamforming

Conventional beamforming: w ∝ a(θ_s)

  y_F(k) = a^H(θ_s) a(θ_s) s(k)        [w = a(θ_s), one source at θ_s]
         = Σ_{n=0}^{N−1} e^{−i2πn(d/λ) sin θ_s} × e^{+i2πn(d/λ) sin θ_s} s(k)
         = Σ_{n=0}^{N−1} s(k) = N s(k)

so that the gain towards θ_s is maximal and equal to N. The beamformer w_CBF = a(θ_s)/[a^H(θ_s) a(θ_s)] is referred to as the conventional beamformer.

Principle
One compensates for the phase shift induced by propagation from direction θ_s and then sums coherently.

Array beampattern with conventional beamforming

[Figure: beampatterns of the conventional beamformer steered to θ_s = 0°, 30° and −45° (dB versus angle of arrival)]

SNR improvement

Before beamforming

  y(k) = a_s s(k) + n(k) ;   SNR_in ≜ E{|s(k)|²} / E{|n_n(k)|²} = P / σ²

After beamforming

  y_F(k) = w^H y(k) = w^H a_s s(k) + w^H n(k)

  SNR_out = (|w^H a_s|² / ‖w‖²) SNR_in ≤ ‖a_s‖² SNR_in = N × SNR_in

with equality if w ∝ a_s.

White noise array gain
For any w such that w^H a_s = 1, the white noise array gain is A_WN = SNR_array / SNR_elem = ‖w‖^{−2} ≤ N.

Conventional beamforming versus adaptive beamforming

Conventional beamforming
The conventional beamformer is optimal in white noise: it amounts to minimizing w^H w (the output power in white noise) under the constraint w^H a(θ_s) = 1. Any other direction is deemed to be equivalent ⇒ it does not take into account other signals (interference) present in some directions.

Adaptive beamforming
Adaptive beamforming takes these other signals into account. It consists in minimizing the output power E{|w^H y(k)|²} while maintaining a unit gain towards the look direction ⇒ it tends to place nulls towards interfering signals.

Adaptive beamforming

Beamforming-filtering in the presence of interference
• The received (input) signal in the presence of interference and noise is given by

  y(k) = a_s s(k) + y_I(k) + n(k)

where a_s is the actual SoI steering vector.
• The output of the beamformer contains the same (albeit filtered) components:

  y(k) = a_s s(k) + y_I(k) + n(k)   →(w)   w^H a_s s(k) + w^H [y_I(k) + n(k)]
  (input signal)                            (signal)       (interference + noise)

Signal to interference plus noise ratio (SINR)

Definition of SINR
For a given beamformer w, the usual figure of merit is the signal to interference plus noise ratio (SINR), defined as

  SINR(w) = E{|w^H a_s s(k)|²} / E{|w^H [y_I(k) + n(k)]|²} = P_s |w^H a_s|² / (w^H C w)

where C = E{ [y_I(k) + n(k)] [y_I(k) + n(k)]^H } stands for the interference plus noise covariance matrix.

Optimal beamformer: SINR maximization

Optimal beamformer

Maximize the SINR while ensuring a unit gain towards a_s:

  min_w w^H C w  subject to  w^H a_s = 1      (optimal)

  w_opt = C^{−1} a_s / (a_s^H C^{−1} a_s)   →   SINR_opt = P_s a_s^H C^{−1} a_s

Remarks
• The principle is to minimize the output power (when the input is y_I + n) under the constraint that the actual steering vector a_s goes undistorted.
• Neither a_s nor C will be known in practice: the actual steering vector may be different from its expected value, and C needs to be estimated from data (which contain y_I + n).

Minimum Variance Distortionless Response (MVDR)

Principle

Minimize the output power (when the input is y_I + n) under the constraint that the assumed steering vector goes undistorted.

Minimization problem and solution

  min_w w^H C w  subject to  w^H a_0 = 1      (MVDR)

where a_0 is the assumed steering vector of the signal of interest (SoI). The solution is given by

  w_MVDR = C^{−1} a_0 / (a_0^H C^{−1} a_0)

Minimum Power Distortionless Response (MPDR)

Principle

Minimize the output power (when the input is a_s s + y_I + n) under the constraint that the assumed steering vector goes undistorted:

  min_w w^H R w  subject to  w^H a_0 = 1      (MPDR)

where R (= C + P_s a_s a_s^H) stands for the signal plus interference plus noise covariance matrix.

Solution

  w_MPDR = R^{−1} a_0 / (a_0^H R^{−1} a_0)

Summary of adaptive beamformers (known covariance matrices)

Beamformer | Principle | Weight vector
Optimal | min_w w^H C w (output power) s.t. w^H a_s = 1 (gain constraint) | w_opt = C^{−1} a_s / (a_s^H C^{−1} a_s)
MVDR | min_w w^H C w s.t. w^H a_0 = 1 | w_MVDR = C^{−1} a_0 / (a_0^H C^{−1} a_0)
MPDR | min_w w^H R w s.t. w^H a_0 = 1 | w_MPDR = R^{−1} a_0 / (a_0^H R^{−1} a_0)

• a_s = actual steering vector and a_0 = assumed steering vector
• C = cov(y_I + n) and R = cov(a_s s + y_I + n)
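The three beamformers of the table can be computed directly when C and R are known. The sketch below (not from the slides) uses an arbitrary single-interference scenario with a_0 = a_s, in which case MVDR and MPDR yield the same SINR.

    import numpy as np

    def steer(theta_deg, N, d=0.5):
        return np.exp(1j * 2 * np.pi * d * np.arange(N) * np.sin(np.deg2rad(theta_deg)))

    N, Ps, Pj, sigma2 = 10, 1.0, 100.0, 1.0
    a_s = steer(0.0, N)            # SoI steering vector (assumed equal to a_0 here)
    a_j = steer(20.0, N)           # interference steering vector

    C = Pj * np.outer(a_j, a_j.conj()) + sigma2 * np.eye(N)      # interference + noise covariance
    R = Ps * np.outer(a_s, a_s.conj()) + C                       # total covariance

    def distortionless(M, a0):
        """w = M^{-1} a0 / (a0^H M^{-1} a0): optimal/MVDR if M = C, MPDR if M = R."""
        v = np.linalg.solve(M, a0)
        return v / (a0.conj() @ v)

    def sinr(w):
        return np.real(Ps * np.abs(w.conj() @ a_s) ** 2 / (w.conj() @ C @ w))

    w_opt, w_mpdr = distortionless(C, a_s), distortionless(R, a_s)
    print(10 * np.log10(sinr(w_opt)), 10 * np.log10(sinr(w_mpdr)))   # identical when a_0 = a_s and R is known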

CBF and optimal (MVDR) beampatterns

[Figure: comparison of the CBF and MVDR beampatterns (dB versus angle of arrival)]

CBF vs MVDR: the case of a single interference

Derivation of SINR
In the case C = P_j a_j a_j^H + σ²I with INR = P_j/σ² ≫ 1, it can be shown that

  SINR_CBF ≃ (P_s/σ²) × 1/(g·INR) ,   SINR_opt ≃ (P_s/σ²) × N(1 − g)

with g = cos²(a_s, a_j) = |a_s^H a_j|² / [(a_s^H a_s)(a_j^H a_j)].

Remarks
• With CBF, the SINR decreases when P_j increases, while it is independent of P_j with adaptive beamforming.
• The SINR decreases when a_j → a_s (g → 1).

CBF vs MVDR: when is the latter useful?

• From Kantorovich's inequality,

  1 ≤ SINR(w_opt)/SINR(w_CBF) = (a_s^H C^{−1} a_s)(a_s^H C a_s) / (a_s^H a_s)² ≤ (λ_min(C) + λ_max(C))² / [4 λ_min(C) λ_max(C)]

[Figure: maximum value of SINR(w_opt)/SINR(w_CBF) (dB) versus λ_max(C)/λ_min(C) (dB)]

• Adaptive beamforming is adequate if λ_max(C)/λ_min(C) ≫ 1.

Generalized Sidelobe Canceler

• The GSC structure can be represented as

[Block diagram: y(k) = a_0 s(k) + i(k) + n(k) feeds a main channel w_CBF = a_0/(a_0^H a_0), giving d(k) = s(k) + i_1(k) + n_1(k), and a blocking matrix B (N × (N−1)), giving the auxiliary channels z(k) = i_2(k) + n_2(k); the output is d(k) − w_a^H z(k)]

where the (N − 1) columns of B form a basis of the subspace orthogonal to a_0, i.e., B^H a_0 = 0.
• The (N − 1) auxiliary channels z(k) are free of signal and enable one to infer the part of interference that went through the CBF.
• w_a enables one to estimate, from z(k), the part of interference i_1(k) contained in d(k), since i_1(k) is correlated with z(k) through i_2(k).

Generalized Sidelobe Canceler

• The GSC structure decomposes w into a component along a_0 and a component orthogonal to a_0, i.e., w = α a_0 − w_⊥:

[Block diagram: same GSC structure, with the direct branch α a_0 and the orthogonal branch w_⊥ = B w_a]

• The component along a_0 ensures that the constraint is fulfilled, since

  w^H a_0 = α^* a_0^H a_0 − w_⊥^H a_0 = α^* a_0^H a_0 + 0   ⇒   α = (a_0^H a_0)^{−1}

• The orthogonal component w_⊥ = B w_a is chosen to minimize the output power, in an unconstrained way.

Generalized Sidelobe Canceler

• Minimization of the output power can be achieved by solving one of the two following equivalent problems:

  min_{w : w^H a_0 = 1} w^H C w                                 (direct form, constrained)
  min_{w_a} (w_CBF − B w_a)^H C (w_CBF − B w_a)                 (GSC form, unconstrained)

• The MVDR beamformer in its GSC form is given by

  w_GSC = w_CBF − B w_a^*

where w_a^* solves the above minimization problem.

Generalized Sidelobe Canceler

Derivation of w_a^*
• The power at the output of the beamformer is given by

  E{|d(k) − w_a^H z(k)|²} = E{|d(k)|²} − w_a^H r_dz − r_dz^H w_a + w_a^H R_z w_a
                          = (w_a − R_z^{−1} r_dz)^H R_z (w_a − R_z^{−1} r_dz) + E{|d(k)|²} − r_dz^H R_z^{−1} r_dz

with r_dz = E{z(k) d^*(k)} and R_z = E{z(k) z^H(k)}.
• The weight vector which minimizes the output power is thus

  w_a^* = R_z^{−1} r_dz

Generalized Sidelobe Canceler

• The GSC form of the weight vector is given by

  w_GSC = w_CBF − B R_z^{−1} r_dz
        = w_CBF − B (B^H R_y B)^{−1} B^H R_y w_CBF      (GSC)

where R_y = R in an MPDR scenario and R_y = C in an MVDR scenario.
• Since they solve the same problem, w_GSC = (a_0^H R_y^{−1} a_0)^{−1} R_y^{−1} a_0.
• The SINR is inversely proportional to the output power when R_y = C, i.e.,

  SINR_GSC = P_s (w_CBF^H C w_CBF − r_dz^H R_z^{−1} r_dz)^{−1}
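A sketch of the GSC form (not from the slides), with the blocking matrix taken from the orthogonal complement of a_0 via an SVD; the single-interference covariance used here is an arbitrary example. It checks numerically that the GSC and direct forms coincide.

    import numpy as np

    def steer(theta_deg, N, d=0.5):
        return np.exp(1j * 2 * np.pi * d * np.arange(N) * np.sin(np.deg2rad(theta_deg)))

    N = 10
    a0 = steer(0.0, N)
    Ry = 100.0 * np.outer(steer(25.0, N), steer(25.0, N).conj()) + np.eye(N)   # e.g. C (MVDR scenario)

    w_cbf = a0 / (a0.conj() @ a0)

    # Blocking matrix: columns span the subspace orthogonal to a0 (B^H a0 = 0)
    U = np.linalg.svd(a0[:, None], full_matrices=True)[0]
    B = U[:, 1:]                                               # N x (N-1)

    wa = np.linalg.solve(B.conj().T @ Ry @ B, B.conj().T @ Ry @ w_cbf)
    w_gsc = w_cbf - B @ wa

    # Check against the direct (distortionless) form
    v = np.linalg.solve(Ry, a0)
    w_direct = v / (a0.conj() @ v)
    print(np.allclose(w_gsc, w_direct))   # True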

MVDR versus MPDR

The optimal, MVDR and MPDR beamformers are equivalent if and only if

  min_w w^H (C + P_s a_s a_s^H) w  s.t.  w^H a_0 = 1      (MPDR)
  ≡ min_w w^H C w  s.t.  w^H a_0 = 1                      (MVDR)
  ≡ min_w w^H C w  s.t.  w^H a_s = 1                      (opt)

which is true only when the two following conditions are satisfied:
1 the assumed steering vector a_0 coincides with the actual steering vector a_s: in practice, uncalibrated arrays or a pointing error lead to a_0 ≠ a_s;
2 the covariance matrix R is known: in practice, one needs to estimate it, which results in estimation errors R̂ ≠ R.
⇒ It ensues that a degradation compared to SINR_opt is unavoidable in practice, and it can be quite different between MPDR and MVDR.

Influence of a steering vector error (MVDR)

• We assume that the SoI steering vector is a_0 while it is actually a_s.
• The SINR obtained with w_MVDR = (a_0^H C^{−1} a_0)^{−1} C^{−1} a_0 becomes

  SINR_MVDR = P_s |w_MVDR^H a_s|² / (w_MVDR^H C w_MVDR) = P_s |a_0^H C^{−1} a_s|² / (a_0^H C^{−1} a_0)
            = SINR_opt × |a_0^H C^{−1} a_s|² / [(a_0^H C^{−1} a_0)(a_s^H C^{−1} a_s)]
            = SINR_opt × cos²(a_s, a_0; C^{−1})
            ≤ SINR_opt

Influence of a steering vector error (MPDR)

• The MPDR beamformer can be written as

  w_MPDR = R^{−1} a_0 / (a_0^H R^{−1} a_0) ;   R = P_s a_s a_s^H + C

• Its SINR is decreased compared to that of the MVDR, viz.

  SINR_MPDR = SINR_MVDR / [1 + (2 SINR_opt + SINR_opt²) sin²(a_s, a_0; C^{−1})]  ≤ SINR_MVDR.

• The degradation is more important as SINR_opt (hence P_s) increases.

Influence of a steering vector error on beampatterns

[Figure: beampatterns of the optimal, MVDR and MPDR beamformers with a pointing error θ_0 − θ_s = 2° (dB versus angle of arrival)]

Influence of a steering vector error on SINR and WNAG

[Figure: SINR (left) and white noise array gain (right) of the MVDR and MPDR beamformers versus the pointing error]

Case of an uncalibrated array

• Let us consider an uncalibrated array with actual steering vector

  ã_n(θ) = (1 + g_n) e^{iφ_n} a_n(θ)

where {g_n} and {φ_n} are independent random gains and phases.
• For any beamformer w, the average value of the resulting beampattern G̃_w(θ) = |w^H ã(θ)|² is related to the nominal beampattern G_w(θ) = |w^H a(θ)|² through

  E{G̃_w(θ)} = |γ|² G_w(θ) + [1 + σ_g² − |γ|²] ‖w‖²

where σ_g² = E{g_n²} and γ = E{e^{iφ_n}}.
• The term proportional to ‖w‖² leads to a sidelobe level increase ⇒ better to have a high white noise array gain (small ‖w‖²).

Influence of a finite number of snapshots

• In practice, K snapshots are available:

  y(k) = a_s s(k) + y_I(k) + n(k) ,   k = 1, ..., K      (with y_{i+n}(k) = y_I(k) + n(k))

• The covariance matrices are thus estimated, and subsequently one can compute the corresponding beamformers as

  R̂ = (1/K) Σ_{k=1}^{K} y(k) y^H(k)             →   w_MPDR^smi = R̂^{−1} a_0 / (a_0^H R̂^{−1} a_0)
  Ĉ = (1/K) Σ_{k=1}^{K} y_{i+n}(k) y_{i+n}^H(k)  →   w_MVDR^smi = Ĉ^{−1} a_0 / (a_0^H Ĉ^{−1} a_0)

where smi stands for "sample matrix inversion".
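A sketch of the sample-matrix-inversion beamformers from K simulated snapshots (scenario parameters arbitrary, not from the slides); the MVDR variant assumes signal-free training data are available, as in the definitions above.

    import numpy as np

    def steer(theta_deg, N, d=0.5):
        return np.exp(1j * 2 * np.pi * d * np.arange(N) * np.sin(np.deg2rad(theta_deg)))

    rng = np.random.default_rng(1)
    N, K, Ps, Pj, sigma2 = 10, 40, 1.0, 100.0, 1.0
    a0 = a_s = steer(0.0, N)
    a_j = steer(30.0, N)

    s = np.sqrt(Ps / 2) * (rng.standard_normal(K) + 1j * rng.standard_normal(K))
    j = np.sqrt(Pj / 2) * (rng.standard_normal(K) + 1j * rng.standard_normal(K))
    n = np.sqrt(sigma2 / 2) * (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K)))
    Y_in = np.outer(a_j, j) + n                 # interference + noise only (MVDR training data)
    Y = np.outer(a_s, s) + Y_in                 # full data (MPDR)

    R_hat = Y @ Y.conj().T / K
    C_hat = Y_in @ Y_in.conj().T / K

    def smi(M_hat, a0):
        v = np.linalg.solve(M_hat, a0)
        return v / (a0.conj() @ v)

    C = Pj * np.outer(a_j, a_j.conj()) + sigma2 * np.eye(N)      # true interference + noise covariance
    def sinr_db(w):
        return 10 * np.log10(np.real(Ps * np.abs(w.conj() @ a_s) ** 2 / (w.conj() @ C @ w)))

    print(sinr_db(smi(C_hat, a0)), sinr_db(smi(R_hat, a0)))      # MVDR-SMI typically closer to the optimum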

Influence of a finite number of snapshots

• The sample beamformers w^smi will differ from their ensemble counterparts since R̂ = R + ∆R and Ĉ = C + ∆C.
• The weight vectors w^smi are random, and so are their corresponding signal to noise ratios

  SINR(w_MPDR^smi) = P_s |a_0^H R̂^{−1} a_s|² / (a_0^H R̂^{−1} C R̂^{−1} a_0)
  SINR(w_MVDR^smi) = P_s |a_0^H Ĉ^{−1} a_s|² / (a_0^H Ĉ^{−1} C Ĉ^{−1} a_0)

• An important issue is the speed of convergence, i.e., how large should K be for E{SINR(w_MPDR^smi)} or E{SINR(w_MVDR^smi)} to be "close" to SINR_opt?

SINR loss with finite number of snapshots (MVDR)

• When a_0 = a_s, the SINR loss of the MVDR beamformer can be represented as

  ρ_MVDR = SINR(w_MVDR^smi) / SINR(w_opt)  =_d  [1 + χ²_{2(N−1)}(0) / χ²_{2(K−N+2)}(0)]^{−1}

and thus follows a beta distribution, i.e.,

  p_MVDR(ρ) = Γ(K + 1) / [Γ(K − N + 2) Γ(N − 1)] · ρ^{K−N+1} (1 − ρ)^{N−2}

• The expected value is E{ρ_MVDR} = (K − N + 2)/(K + 1), so that SINR(w_MVDR^smi) is (on average) within 3 dB of the optimal SINR for K_MVDR = 2N − 3.
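A Monte Carlo sketch (not from the slides) checking the average loss quoted above; it uses the white-noise case C = I, for which the SMI-MVDR loss has the same distribution, and an arbitrary trial count.

    import numpy as np

    rng = np.random.default_rng(2)
    N = 10
    K = 2 * N - 3                               # K = 2N - 3 should give about -3 dB average loss
    trials = 2000
    a_s = np.ones(N, complex)
    C = np.eye(N)
    sinr_opt = np.real(a_s.conj() @ np.linalg.solve(C, a_s))    # P_s = 1

    losses = []
    for _ in range(trials):
        Yin = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
        C_hat = Yin @ Yin.conj().T / K
        w = np.linalg.solve(C_hat, a_s)
        w /= a_s.conj() @ w
        sinr = np.real(np.abs(w.conj() @ a_s) ** 2 / (w.conj() @ C @ w))
        losses.append(sinr / sinr_opt)

    print(np.mean(losses), (K - N + 2) / (K + 1))   # both close to 0.5, i.e. about -3 dB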

SINR loss with finite number of snapshots (MVDR)

[Figure: distribution of the MVDR SINR loss ρ_MVDR (x-axis in dB, from −30 to 0)]

SINR loss with finite number of snapshots (MPDR)

• As for the MPDR scenario, one has

  ρ_MPDR = SINR(w_MPDR^smi) / SINR(w_opt)  =_d  [1 + (1 + SINR_opt) χ²_{2(N−1)}(0) / χ²_{2(K−N+2)}(0)]^{−1}

• The distribution of ρ_MPDR is related to that of ρ_MVDR as follows:

  p_MPDR(ρ) = p_MVDR(ρ) × (1 + SINR_opt)^{K−N+2} / (1 + ρ SINR_opt)^{K+1}

• The average number of snapshots needed to achieve the optimal SINR within 3 dB is about

  K_MPDR ≃ (N − 1) [1 + SINR_opt]

where SINR_opt ≃ N P_s/σ². In general, K_MPDR ≫ K_MVDR.

Beampatterns with finite number of snapshots

[Figure: CBF, MPDR and MVDR beampatterns estimated from K = 20 snapshots with N = 10 sensors (dB versus angle of arrival)]

Distribution of SINR loss

[Figure: distribution of the SINR loss (x-axis in dB, from −20 to 0)]

SINR versus number of snapshots

[Figure: SINR (dB) versus the number of snapshots K for the optimum, MVDR, MPDR and CBF beamformers]

How to make MPDR more robust?

Observations
• Estimation of the covariance matrices leads to a significant SINR loss (especially for the MPDR beamformer) due to
  • the interference being less well eliminated,
  • a sidelobe level increase, which results in a lower white noise gain.
• In case of uncalibrated arrays, steering vector errors are all the more emphasized that the white noise gain is low (or ‖w‖² large).

How to make MPDR more robust? White noise array gain and SINR

[Figure: SINR versus K (left) and white noise gain versus K (right) for the MVDR, MPDR and CBF beamformers]

Observation: similarity between the two curves.

A possible remedy
Enforce a minimal white noise array gain, or equivalently restrain ‖w‖², in order to make the MPDR beamformer more robust.

Diagonal loading

Principle
One tries to solve

  min_w w^H R̂ w  subject to  w^H a_0 = 1  and  ‖w‖² = A_WN^{−1}  (≥ N^{−1})

Solution
The Lagrangian is given by (with λ ∈ C and µ ∈ R)

  L(w, λ, µ) = w^H R̂ w + λ (w^H a_0 − 1) + λ^* (a_0^H w − 1) + µ (w^H w − A_WN^{−1})
             = [w + λ (R̂ + µI)^{−1} a_0]^H (R̂ + µI) [w + λ (R̂ + µI)^{−1} a_0]
               − λ − λ^* − µ A_WN^{−1} − |λ|² a_0^H (R̂ + µI)^{−1} a_0

Diagonal loading

Solution
The solution thus takes the form w_MPDR-DL = −λ (R̂ + µI)^{−1} a_0. Since w_MPDR-DL^H a_0 = 1, it follows that

  w_MPDR-DL = (R̂ + µI)^{−1} a_0 / (a_0^H (R̂ + µI)^{−1} a_0)

and µ is selected such that ‖w_MPDR-DL‖^{−2} = A_WN.
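A sketch of a diagonally loaded MPDR weight (not from the slides), with a fixed loading level µ a few dB above the noise floor, an arbitrary choice rather than one computed from a target A_WN; the noise-only sample covariance is only a placeholder.

    import numpy as np

    def mpdr_dl(R_hat, a0, mu):
        """Diagonally loaded MPDR: w = (R_hat + mu*I)^{-1} a0 / (a0^H (R_hat + mu*I)^{-1} a0)."""
        v = np.linalg.solve(R_hat + mu * np.eye(len(a0)), a0)
        return v / (a0.conj() @ v)

    rng = np.random.default_rng(3)
    N, K, sigma2 = 10, 20, 1.0
    Y = np.sqrt(sigma2 / 2) * (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K)))
    R_hat = Y @ Y.conj().T / K
    a0 = np.ones(N, complex)

    w = mpdr_dl(R_hat, a0, mu=10 ** (5 / 10) * sigma2)      # loading 5 dB above the noise power
    print(1.0 / np.real(w.conj() @ w))                      # white noise array gain ||w||^{-2} <= N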

Diagonal loading: adaptivity versus robustness

[Diagram: white noise array gain versus the diagonal loading level µ, going from the MPDR value at µ = 0 up to the CBF value N as µ increases]

Choice of loading level

Many different possibilities have been proposed to set the loading level:
• set A_WN (slightly below N) and compute µ from ‖w_MPDR-DL‖^{−2} = A_WN.
• set µ directly, generally a few decibels above the white noise level (see the discussion on the next slides about beampatterns and eigenvalues).
• set µ using the theory of ridge regression, which enables one to compute µ from data.
• use the fact that diagonal loading is the solution of max_{P,a} ‖R̂ − P a a^H‖² for ‖a − a_0‖² ≤ ε, and compute µ from ε.
• set A_WN and compute directly the diagonally loaded beamformer in GSC form without necessarily computing µ.
• ...

An interpretation of diagonal loading and the choice of µ

• The array beampattern with the true covariance matrix is given by

  g(θ) = (α/σ²) { a_0^H a(θ) − Σ_{n=1}^{J} [λ_n/(λ_n + σ²)] (a_0^H u_n)(u_n^H a(θ)) }

• The array beampattern with an estimated covariance matrix becomes

  g^smi(θ) = (α/λ̂_min) { a_0^H a(θ) − Σ_{n=1}^{N} [λ̂_n/(λ̂_n + λ̂_min)] (a_0^H û_n)(û_n^H a(θ)) }

• The degradation is due to the fact that the noise eigenvalues are not equal: λ̂_{J+1} ≠ λ̂_{J+2} ≠ ··· ≠ λ̂_N = λ̂_min.
• Replacing R̂ by R̂ + µI enables one to equalize these eigenvalues, provided that µ ≫ σ² and µ < λ_J.

Diagonal loading: SINR versus number of snapshots

[Figure: SINR (dB) versus K for MPDR without loading and with diagonal loading of 5 dB and 10 dB, compared with the optimum]

Diagonal loading: beampatterns

[Figure: beampatterns of the MPDR beamformer with and without diagonal loading (5 dB and 10 dB), N = 10, K = 20]

Influence of the loading level on SINR and WNAG

[Figure: SINR (left) and white noise gain (right) versus the diagonal loading level relative to σ², for K = 20 and K = 200]

Linearly constrained beamforming

• To mitigate pointing errors, one can resort to multiple constraints, i.e., solve the problem

  min_w w^H C w  subject to  Z^H w = d

whose solution is w = C^{−1} Z (Z^H C^{−1} Z)^{−1} d.
• One can use unit gain constraints around the presumed DOA or a smoothness constraint:

  Z = [a(θ_0) a(θ_0 + δ_1) ··· a(θ_0 + δ_L)] ,   d = [1 1 ··· 1]^T
  Z = [a(θ_0) ∂a(θ)/∂θ|_{θ_0} ··· ∂^L a(θ)/∂θ^L|_{θ_0}] ,   d = [1 0 ··· 0]^T
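A sketch of the multi-constraint (LCMV) solution above, with unit-gain constraints at the presumed DOA and at two nearby offsets; the constraint set and the single-interference covariance are arbitrary choices, not taken from the slides.

    import numpy as np

    def steer(theta_deg, N, d=0.5):
        return np.exp(1j * 2 * np.pi * d * np.arange(N) * np.sin(np.deg2rad(theta_deg)))

    N = 10
    C = 100.0 * np.outer(steer(40.0, N), steer(40.0, N).conj()) + np.eye(N)   # interference + noise

    # Unit gain at the presumed DOA and at +/- 2 degrees around it
    Z = np.column_stack([steer(t, N) for t in (0.0, -2.0, 2.0)])
    d = np.ones(3, complex)

    Ci_Z = np.linalg.solve(C, Z)
    w_lcmv = Ci_Z @ np.linalg.solve(Z.conj().T @ Ci_Z, d)     # w = C^{-1} Z (Z^H C^{-1} Z)^{-1} d

    print(np.round(np.abs(w_lcmv.conj() @ Z), 6))             # ~[1, 1, 1]: constraints satisfied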

Partially adaptive beamforming

Principle
Perform beamforming in a lower-dimensional subspace.

Structures
• Direct form: the data y(k) (N × 1) is reduced to ỹ(k) = T^H y(k) ((R+1) × 1) and the output is w̃^H ỹ(k).
• GSC form (T = [w_CBF  B U]): the auxiliary channels z(k) = B^H y(k) are reduced to z̃(k) = U^H z(k) (R × 1), and the output is d(k) − w̃_a^H z̃(k).

[Block diagrams of the direct and GSC forms]

Motivation for partially adaptive beamforming

The optimal beamformer when C = Σ_{j=1}^{J} P_j a_j a_j^H + σ²I
• With this low-rank plus scaled identity form, one has

  C = Σ_{n=1}^{J} (λ_n + σ²) u_n u_n^H + σ² Σ_{n=J+1}^{N} u_n u_n^H

where R{u_1, ..., u_J} = R{a_1, ..., a_J}.
• The optimal beamformer w_opt ∝ C^{−1} a_s can then be rewritten as

  w_opt ∝ w_CBF − Σ_{n=1}^{J} [λ_n/(λ_n + σ²)] (u_n^H w_CBF) u_n

Motivation for partially adaptive beamforming

Interpretation of w_opt when C = Σ_{j=1}^{J} P_j a_j a_j^H + σ²I
• The optimal beamformer amounts to subtracting from the CBF a linear combination of the J principal eigenvectors of C:

[Block diagram: y(k) goes through w_CBF and, in parallel, through the J eigenbeams u_1, ..., u_J weighted by [λ_n/(λ_n + σ²)] u_n^H w_CBF; the weighted eigenbeam outputs are subtracted from the CBF output to form w_opt^H y(k)]

• The optimal beamformer is a partially adaptive beamformer.
• The "eigenbeams" are steered towards the interference in order to suppress it from the output of the conventional beamformer.

Beampatterns (CBF and eigenvectors)

[Figure: four panels with the beampatterns of the CBF, the 1st eigenbeam, the 2nd eigenbeam and the MVDR beamformer (dB versus angle of arrival)]

Beamforming Partially adaptive beamforming 76/122 Beampatterns (CBF and eigenvectors)

CBF 1st eigen−beam 0 0

−20 −20

−40 −40

−60 −60 −90 −60 −30 0 30 60 90 −90 −60 −30 0 30 60 90

2nd eigen−beam MVDR 0 0

−20 −20

−40 −40

−60 −60 −90 −60 −30 0 30 60 90 −90 −60 −30 0 30 60 90

Expression of the partially adaptive beamformer

Direct form

[Block diagram: y(k) (N × 1) → T → ỹ(k) ((R+1) × 1) → output w̃^H ỹ(k)]

• Minimization of the output power:

  min_{w̃} w̃^H R_ỹ w̃  subject to  w̃^H ã_0 = 1      (PA-DF)

where R_ỹ = T^H R_y T and ã_0 = T^H a_0.
• The solution is given by

  w̃ = α R_ỹ^{−1} ã_0   ⇒   w_PA-DF = α T R_ỹ^{−1} ã_0

Expression of the partially adaptive beamformer

GSC form

[Block diagram: main channel w_CBF gives d(k); auxiliary channels z(k) = B^H y(k) are reduced to z̃(k) = U^H z(k) (R × 1); the output is d(k) − w̃_a^H z̃(k)]

• Minimization of the output power [z̃(k) = U^H z(k)]:

  min_{w̃_a} E{|d(k) − w̃_a^H z̃(k)|²}      (PA-GSC)

• The solution is given by

  w̃_a = R_z̃^{−1} r_dz̃ = (U^H R_z U)^{−1} U^H r_dz
  w_PA-GSC = w_CBF − B U R_z̃^{−1} r_dz̃


Optimality of the partially adaptive beamformer (a0 = as)

Question: can we possibly have w_PA-DF = w_opt? The answer is yes:

  w_PA-DF = w_opt   ⇔   C^{−1} a_s ∈ R{T}

Comments
• At first glance, a meaningless condition: if C^{−1} a_s were known, we would get w_opt and hence there would be no need for T.
• But if C = Σ_{j=1}^{J} P_j a_j a_j^H + σ²I, then C^{−1} a_s = η_s a_s + Σ_{j=1}^{J} η_j a_j.
• ⇒ if a_s, a_1, ..., a_J ∈ R{T}, then w_PA-DF = w_opt.

GSC form: w_PA-GSC = w_opt ⇔ B^H C^{−1} a_s ∈ R{U}: if B^H a_1, ..., B^H a_J ∈ R{U}, then w_PA-GSC = w_opt.

Analysis of the partially adaptive MVDR

SINR loss for fixed T (R_y = C, a_0 = a_s)
The SINR loss of the partially adaptive beamformer w = T w̃ = T C̃^{−1} ã_0 with fixed T is distributed according to

  ρ_PA-MVDR  =_d  a [1 + χ²_{2R}(0) / χ²_{2(K−R+1)}(0)]^{−1}

where

  a = (ã_0^H C̃^{−1} ã_0) / (a_0^H C^{−1} a_0) = a_0^H T (T^H C T)^{−1} T^H a_0 / (a_0^H C^{−1} a_0)
    = energy of C^{−1/2} a_0 in R{C^{1/2} T} / energy of C^{−1/2} a_0  ≤ 1

Analysis of the partially adaptive MVDR

[Figure: distributions of the SINR loss (dB) for the partially adaptive MVDR]

⇒ partially adaptive beamforming is potentially very effective in low sample support, provided that T is well chosen.

Selection of matrices T and U

Fixed transformations
• For instance using pre-steered beams, i.e.

  T = [a(θ̃_1) a(θ̃_2) ··· a(θ̃_{R+1})]
  U = B^H [a(θ̃_1) a(θ̃_2) ··· a(θ̃_R)]

• In this case, the columns of U can be viewed as beamformers aimed at intercepting the interference.
• They require some prior knowledge about the interference DOAs in order for the interference to pass through the beams.

Value of a with pre-steered beams

[Figure: case of 2 interferences located at θ_1, θ_2; value of a when U = B^H [a(θ̃_1) a(θ̃_2)] and the θ̃_i are drawn randomly around the θ_i]

Selection of matrices T and U

Random transformations
• The idea¹ is to use L matrices U_ℓ drawn from a uniform distribution on the manifold of semi-unitary (N−1) × R matrices, i.e.

  U_ℓ = X_ℓ (X_ℓ^H X_ℓ)^{−1/2} ;   X_ℓ ~ CN(0, I_{N−1}, I_R)

and to average the corresponding weight vectors w̃_ℓ, yielding

  w = w_CBF − B [ (1/L) Σ_{ℓ=1}^{L} U_ℓ (U_ℓ^H R_z U_ℓ)^{−1} U_ℓ^H r_dz ]
    = w_CBF − B [ (1/L) Σ_{ℓ=1}^{L} X_ℓ (X_ℓ^H R_z X_ℓ)^{−1} X_ℓ^H r_dz ]

¹ T. Marzetta, G. Tucci, S. Simon, "A random matrix-theoretic approach to handling singular covariance matrices", IEEE Transactions on Information Theory, September 2011.

Marzetta's method based on random U

[Block diagram: the auxiliary channels z(k) are projected onto L random matrices U_1, ..., U_L; each branch forms w̃_ℓ^H z̃_ℓ(k), and the mean value (1/L) Σ_{ℓ=1}^{L} w̃_ℓ^H z̃_ℓ(k) is subtracted from the main channel d(k)]

The matrices U_ℓ are drawn from a uniform distribution on the manifold of semi-unitary matrices, or from a Gaussian distribution CN(0, I_{N−1}, I_R).

Selection of matrices T and U

Adaptive transformations
Matrices T or U depend on the snapshots. For example, in GSC form, if

  R_z = Σ_{n=1}^{N−1} λ_n u_n u_n^H ;   λ_1 ≥ λ_2 ≥ ··· ≥ λ_{N−1}

one can choose
• the R principal eigenvectors of R_z (Principal Component), i.e.

  U = [u_1 ··· u_R]   ⇒   w_pc-gsc = w_CBF − B U Λ^{−1} U^H r_dz

where Λ = diag{λ_1, ..., λ_R};
• the R eigenvectors which contribute most to increasing the SINR (Cross Spectral Metric).
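A sketch of the Principal Component GSC above (not from the slides): the R dominant eigenvectors of the sample R_z play the role of U. The two-interferer scenario, powers and rank R are arbitrary illustrative choices.

    import numpy as np

    def steer(theta_deg, N, d=0.5):
        return np.exp(1j * 2 * np.pi * d * np.arange(N) * np.sin(np.deg2rad(theta_deg)))

    rng = np.random.default_rng(4)
    N, K, R_rank = 10, 30, 2
    a0 = steer(0.0, N)
    w_cbf = a0 / (a0.conj() @ a0)

    # Signal-free training data with two interferers (MVDR-type scenario)
    J = np.column_stack([steer(-30.0, N), steer(25.0, N)])
    S = np.sqrt(50.0 / 2) * (rng.standard_normal((2, K)) + 1j * rng.standard_normal((2, K)))
    Yin = J @ S + (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)

    B = np.linalg.svd(a0[:, None], full_matrices=True)[0][:, 1:]    # blocking matrix, B^H a0 = 0
    Z = B.conj().T @ Yin                                            # auxiliary channels z(k)
    D = w_cbf.conj() @ Yin                                          # main channel d(k)

    Rz = Z @ Z.conj().T / K
    rdz = Z @ D.conj() / K                                          # sample r_dz = E{z d*}
    lam, V = np.linalg.eigh(Rz)
    U = V[:, ::-1][:, :R_rank]                                      # R principal eigenvectors

    wa = U @ np.linalg.solve(U.conj().T @ Rz @ U, U.conj().T @ rdz)
    w_pc = w_cbf - B @ wa
    print(np.abs(w_pc.conj() @ J))   # small responses towards the two interferers (unit gain at a0)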

Partially adaptive beamforming: SINR versus K

[Figure: SINR (dB) versus the number of snapshots for the optimum, MPDR, two partially adaptive MPDR beamformers with pre-steered beams at [−20 15] and [−30 20], and PC]

Partially adaptive beamforming: SINR versus K

[Figure: SINR (dB) versus the number of snapshots for the optimum, MVDR, PC and CSM beamformers, with R = J = 2]

Partially adaptive beamforming: SINR versus R

[Figure: SINR (dB) versus the rank R of the transformation for the optimum, MVDR, PC and CSM beamformers, K = 20]

Beamforming: synthesis

• Conventional beamforming: w_CBF = (a_0^H a_0)^{−1} a_0. Optimal in white noise, θ_3dB ≃ 0.9 (N d/λ)^{−1}, sidelobes at −13 dB.
• Adaptive beamforming: w_opt ∝ C^{−1} a_s, w_MVDR ∝ C^{−1} a_0, w_MPDR ∝ R^{−1} a_0
  • all equivalent if R, C are known and a_s = a_0
  • SINR_opt ≥ SINR_MVDR ≥ SINR_MPDR when a_s ≠ a_0
  • SINR_MVDR-SMI ≥ SINR_MPDR-SMI: convergence within about 2N snapshots for MVDR, about N × SINR_opt snapshots for MPDR
• Diagonal loading: helps to mitigate both finite-sample errors and steering vector errors. Especially useful in the MPDR context with a low-power signal of interest.
• Partially adaptive beamforming: enables one to achieve faster convergence by operating in a low-dimensional subspace. Especially effective with strong, low-rank interference.

Contents

1 Introduction

2 Array processing model

3 Beamforming

4 Direction of arrival estimation Problem formulation Non parametric methods (beamforming) Parametric methods for DOA estimation Maximum likelihood estimation Subspace-based methods Covariance fitting Synthesis

The direction of arrival estimation problem

Problem formulation
Given a collection of K snapshots which can possibly be modeled as y(k) = Σ_{p=1}^{P} a(θ_p) s_p(k) + n(k), estimate the directions of arrival (DoA) θ_1, ..., θ_P:

  y(k) = Σ_{p=1}^{P} a(θ_p) s_p(k) + n(k)   →   θ̂_1, ..., θ̂_P ?

Approaches
• Non-parametric approaches, which do not necessarily rely on a model for y(k): similar to Fourier-based methods in the time domain;
• Parametric approaches, where a model is assumed and its properties (algebraic structure, distribution) are exploited.

Beamforming for direction finding purposes

• The idea is to form a beam w(θ) for each angle θ and to evaluate the power E{|y_F(k)|²} = E{|w^H(θ) y(k)|²} = w^H(θ) R w(θ) at the output of the beamformer versus θ:

  P_w(θ) = E{|w^H(θ) y(k)|²}

• Large peaks should provide the directions of arrival:

[Figure: 10 log10 P_w(θ) versus the angle of arrival θ, with peaks at the source directions]

Conventional beamforming for direction finding purposes

Conventional beamformer

The conventional beamformer w_CBF(θ) = a(θ)/N can be used, which yields the output power

  w_CBF^H(θ) R w_CBF(θ) = N^{−2} a^H(θ) R a(θ)

In practice
With K snapshots available, R is estimated as

  R̂ = (1/K) Σ_{k=1}^{K} y(k) y^H(k)

and subsequently the output power as

  P_CBF(θ) = N^{−2} a^H(θ) R̂ a(θ)

CBF and Fourier analysis

• The estimated power at the output of the CBF writes

  P_CBF(θ) = (1/N²) a^H(θ) R̂ a(θ)
           = (1/(K N²)) Σ_{k=1}^{K} |a^H(θ) y(k)|²
           = (1/(K N²)) Σ_{k=1}^{K} | Σ_{n=1}^{N} y_n(k) e^{−i2π(n−1)f} |²

where f = (d/λ) sin θ.
• The inner sum is recognized as the (spatial) Fourier transform of each snapshot.

MPDR beamforming for direction finding purposes

Capon's method
If the MPDR beamformer

  w_MPDR(θ) = R^{−1} a(θ) / (a^H(θ) R^{−1} a(θ))

is used, the output power then writes

  w_MPDR^H(θ) R w_MPDR(θ) = a^H(θ) R^{−1} R R^{−1} a(θ) / [a^H(θ) R^{−1} a(θ)]² = 1 / (a^H(θ) R^{−1} a(θ))

which in practice yields

  P_Capon(θ) = 1 / (a^H(θ) R̂^{−1} a(θ))
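A sketch (not from the slides) evaluating P_CBF(θ) = N^{−2} a^H(θ) R̂ a(θ) and P_Capon(θ) = 1/(a^H(θ) R̂^{−1} a(θ)) on an angle grid from simulated data; the three-source scenario and grid spacing are arbitrary.

    import numpy as np

    def steer(theta_deg, N, d=0.5):
        theta = np.deg2rad(np.atleast_1d(theta_deg))
        return np.exp(1j * 2 * np.pi * d * np.arange(N)[:, None] * np.sin(theta))   # N x n_angles

    rng = np.random.default_rng(5)
    N, K = 10, 200
    doas, powers = [-30.0, 10.0, 20.0], [10.0, 10.0, 10.0]
    A = steer(doas, N)
    S = (rng.standard_normal((3, K)) + 1j * rng.standard_normal((3, K))) * np.sqrt(np.array(powers)[:, None] / 2)
    Y = A @ S + (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
    R_hat = Y @ Y.conj().T / K

    grid = np.linspace(-90, 90, 721)
    Ag = steer(grid, N)
    P_cbf = np.real(np.sum(Ag.conj() * (R_hat @ Ag), axis=0)) / N ** 2
    P_capon = 1.0 / np.real(np.sum(Ag.conj() * np.linalg.solve(R_hat, Ag), axis=0))

    print(grid[np.argmax(P_capon)])   # the global peak lies near one of the true DOAs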

Direction of arrival estimation Non parametric methods (beamforming) 97/122 Comparison CBF-Capon (low resolution scenario)

Comparison CBF−CAPON 10 CBF CAPON θ = [−30◦, 10◦, 20◦]

5 ◦ θ3dB ∼ 5.1

0

−5 dB

−10

−15

−20 −100 −80 −60 −40 −20 0 20 40 60 80 100 Angle of arrival

Direction of arrival estimation Non parametric methods (beamforming) 98/122 Comparison CBF-Capon (high resolution scenario)

Comparison CBF−CAPON 10 CBF CAPON θ = [−30◦, 10◦, 15◦]

5 ◦ θ3dB ∼ 5.1

0

−5 dB

−10

−15

−20 −100 −80 −60 −40 −20 0 20 40 60 80 100 Angle of arrival

Model-based methods

Principle
Based on the model

  y(k) = A(θ) s(k) + n(k)

where

  θ = [θ_1 θ_2 ··· θ_P]^T
  A(θ) = [a(θ_1) a(θ_2) ··· a(θ_P)]
  s(k) = [s_1(k) s_2(k) ··· s_P(k)]^T

and a(θ) stands for the steering vector.

Classes of methods

• Maximum Likelihood methods are based on maximizing the likelihood function, which amounts to finding the unknown parameters which make the observed data the most likely.
• Subspace-based methods rely on the fact that the signal subspace coincides with the subspace spanned by the principal eigenvectors of R. Moreover, the latter is orthogonal to the subspace spanned by the minor eigenvectors. These two algebraic properties are exploited for direction finding.
• Covariance matching relies on a model R(η) for the covariance matrix and looks for the model parameters which minimize the distance between R(η) and the sample covariance matrix R̂.

Maximum Likelihood Estimation

• The MLE consists in finding the parameter vector η which maximizes the likelihood function p(Y; η) of the snapshots Y = [y(1) y(2) ··· y(K)], where η is the model parameter vector.
• Pros: asymptotically efficient.
• Cons: multi-dimensional optimization problem ⇒ (usually) high computational complexity, possible convergence to local maxima.

Stochastic (unconditional) MLE

• Assume that s(k) is Gaussian distributed with E{s(k)} = 0 and a covariance matrix R_s = E{s(k) s^H(k)} which is full rank.
• The distribution of the snapshots is thus given by

  y(k) ~ CN(0, R = A(θ) R_s A^H(θ) + σ²I)

• The likelihood function can be written as

  p(Y; η) = Π_{k=1}^{K} π^{−N} |R|^{−1} e^{−y^H(k) R^{−1} y(k)}

Stochastic (unconditional) MLE

• The ML estimate is obtained as

  η̂ = arg min_{θ, R_s, σ²} [− log p(Y; η)] = arg min_{θ, R_s, σ²} [ log|R| + Tr{R^{−1} R̂} ]

• Closed-form solutions for σ² and R_s can be obtained so that the likelihood function is concentrated, yielding a minimization over the angles only:

  θ̂^sto = arg min_θ log | A(θ) R̂_s(θ) A^H(θ) + σ̂²(θ) I |

Deterministic (conditional) MLE

• The signal waveforms are assumed deterministic, so that

  y(k) ~ CN(A(θ) s(k), σ²I)

• The MLE is now given by

  η̂ = arg min_{θ, s(k), σ²} [ NK log σ² + σ^{−2} Σ_{k=1}^{K} ‖y(k) − A(θ) s(k)‖² ]

• The likelihood function can be concentrated with respect to all s(k) and σ², and finally

  θ̂^det = arg min_θ Tr{ P_A^⊥(θ) R̂ }

• For a single source, θ̂^det = arg max_θ (1/N) a^H(θ) R̂ a(θ) ≡ CBF.

Subspace-based methods

Eigenvalue decomposition of the covariance matrix
If P signals are present, one has

  R = A(θ) R_s A^H(θ) + σ²I      (R_s assumed full-rank; the signal term has rank P and range R{A(θ)})
    = Σ_{p=1}^{P} λ_p u_p u_p^H + Σ_{p=P+1}^{N} 0 · u_p u_p^H + σ²I
    = Σ_{p=1}^{P} (λ_p + σ²) u_p u_p^H + Σ_{p=P+1}^{N} σ² u_p u_p^H
    = U_s Λ_s U_s^H + σ² U_n U_n^H

where U_s = [u_1 ... u_P] ⊥ U_n = [u_{P+1} ... u_N].

where U = u ... u U = u ... u . s 1 P ⊥ n P +1 N     Direction of arrival estimation Subspace-based methods 106/122 Subspace-based methods

Signal and noise subspaces U = A(θ) : the signal subspace is spanned by U and R{ s} R{ } s G# hence Us = A(θ)T for some non-singular matrix T. U is orthogonal to U = A(θ) AH (θ)U = 0. R{ n} R{ s} R{ } ⇒ n H# Eigenvalues of the covariance matrix

Signal subspace

Noise subspace σ2

Subspace-based methods rely on either or . ⇒ G# H# Direction of arrival estimation Subspace-based methods 107/122 MUSIC

• The signal steering vectors are orthogonal to U_n:

  U_n^H a(θ_p) = 0   ⇔   u_n^H a(θ_p) = 0  for n = P+1, ..., N

• One looks for the P largest maxima of

  P_MUSIC(θ) = 1 / (a^H(θ) Û_n Û_n^H a(θ)) = 1 / Σ_{n=P+1}^{N} |a^H(θ) û_n|²

on the rationale that, as K grows large, Û_n → U_n and hence P_MUSIC(θ_p) → ∞.
• Many variants exist around MUSIC, e.g., SSMUSIC (McCloud & Scharf).
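A sketch of the MUSIC pseudospectrum (not from the slides); it assumes P is known and uses an exact two-source covariance built with arbitrary angles and powers, but the same function applies to a sample covariance R̂.

    import numpy as np

    def steer(theta_deg, N, d=0.5):
        theta = np.deg2rad(np.atleast_1d(theta_deg))
        return np.exp(1j * 2 * np.pi * d * np.arange(N)[:, None] * np.sin(theta))   # N x n_angles

    def music_spectrum(R, P, grid_deg):
        """1 / ||Un^H a(theta)||^2 over a grid, with Un the N-P minor eigenvectors of R."""
        N = R.shape[0]
        _, V = np.linalg.eigh(R)                  # eigenvalues in ascending order
        Un = V[:, : N - P]                        # noise subspace
        A = steer(grid_deg, N)
        return 1.0 / np.sum(np.abs(Un.conj().T @ A) ** 2, axis=0)

    N = 10
    A = steer([-10.0, 15.0], N)
    R = A @ np.diag([10.0, 5.0]) @ A.conj().T + np.eye(N)
    grid = np.linspace(-90, 90, 1801)
    P_music = music_spectrum(R, P=2, grid_deg=grid)
    print(grid[np.argmax(P_music)])               # close to one of the true DOAs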

Root-MUSIC

• Let a(z) = [1 z ··· z^{N−1}]^T. For a ULA, one can compute the P roots of

  P_MUSIC(z) = a^T(z^{−1}) Û_n Û_n^H a(z)

closest to the unit circle. The reason is that

  a^T(e^{−i2π(d/λ) sin θ_p}) U_n U_n^H a(e^{i2π(d/λ) sin θ_p}) = a^H(θ_p) U_n U_n^H a(θ_p) = 0

• P_MUSIC(z) = Σ_{n=−(N−1)}^{N−1} p_n z^n has 2(N−1) roots, (N−1) of which lie inside the unit circle, since

  P_MUSIC(1/z^*) = a^T(z^*) Û_n Û_n^H a(1/z^*) = [a^T(z^{−1}) Û_n Û_n^H a(z)]^* = P_MUSIC^*(z)

so that the roots come in pairs (z, 1/z^*).
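A sketch of root-MUSIC for a ULA (not from the slides): the polynomial coefficients are the diagonal sums of Û_n Û_n^H, and the P roots inside and closest to the unit circle give the DOAs. The simulated two-source scenario is an arbitrary illustrative choice.

    import numpy as np

    def root_music(R_hat, P, d_over_lambda=0.5):
        """Root-MUSIC for a ULA: roots of a^T(1/z) Un Un^H a(z) closest to the unit circle."""
        N = R_hat.shape[0]
        _, V = np.linalg.eigh(R_hat)
        Un = V[:, : N - P]
        Q = Un @ Un.conj().T                                   # projector onto the noise subspace
        # coefficient of z^m is the sum of the m-th diagonal of Q, m = -(N-1)..(N-1)
        c = np.array([np.trace(Q, offset=m) for m in range(-(N - 1), N)])
        roots = np.roots(c[::-1])                              # np.roots expects highest power first
        roots = roots[np.abs(roots) < 1]                       # keep roots inside the unit circle
        roots = roots[np.argsort(np.abs(np.abs(roots) - 1))[:P]]   # the P closest to the circle
        return np.rad2deg(np.arcsin(np.angle(roots) / (2 * np.pi * d_over_lambda)))

    rng = np.random.default_rng(6)
    N, K = 10, 200
    n = np.arange(N)[:, None]
    A = np.exp(1j * 2 * np.pi * 0.5 * n * np.sin(np.deg2rad([-10.0, 15.0])))
    S = (rng.standard_normal((2, K)) + 1j * rng.standard_normal((2, K))) * np.sqrt(np.array([10.0, 5.0])[:, None] / 2)
    Y = A @ S + (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
    R_hat = Y @ Y.conj().T / K
    print(np.sort(root_music(R_hat, P=2)))                     # approximately [-10, 15]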

Direction of arrival estimation Subspace-based methods 109/122 Low-resolution scenario

Eigenvalues of the covariance matrix 25 R theory R estimated

20

15

10 dB

5

0

−5 0 2 4 6 8 10 12 14 16 18 20

Direction of arrival estimation Subspace-based methods 110/122 Low-resolution scenario

Comparison CBF−MUSIC 25 CBF MUSIC Roots θ = [−30◦, 10◦, 20◦] root−MUSIC 90 20 1 arbitrary in R(U ) n 120 60 θ ∼ ◦ 3dB 5.1 0.8 15 0.6 150 30 10 0.4 0.2

5 dB 180 0

0

−5 210 330

−10 240 300 270

−15 −100 −80 −60 −40 −20 0 20 40 60 80 100 Angle of arrival

Direction of arrival estimation Subspace-based methods 111/122 High-resolution scenario

Eigenvalues of the covariance matrix 30 R theory R estimated 25

20

15 dB 10

5

0

−5 0 2 4 6 8 10 12 14 16 18 20

Direction of arrival estimation Subspace-based methods 112/122 High-resolution scenario

Comparison CBF−MUSIC 20 CBF MUSIC Roots θ = [−30◦, 10◦, 13◦] root−MUSIC 90 1 15 arbitrary in R(U ) n 120 60 θ ∼ ◦ 3dB 5.1 0.8

10 0.6 150 30 0.4

5 0.2

dB 180 0 0

−5 210 330

−10 240 300 270

−15 −100 −80 −60 −40 −20 0 20 40 60 80 100 Angle of arrival

Min-norm

• Let d = U_n η = [d_0 d_1 ··· d_{N−1}]^T be an arbitrary vector in the noise subspace.
• Since d ⊥ a(θ_p), D(z) = Σ_{n=0}^{N−1} d_n z^{−n} has P of its roots equal to e^{i2π(d/λ) sin θ_p} and hence can serve to estimate θ_p.
• The min-norm method searches for the vector in R{U_n} with minimal norm. To avoid d = 0, one considers

  min_{d ∈ R{U_n}} ‖d‖²  s.t.  d_0 = 1   ⇔   min_η ‖η‖²  s.t.  η^H U_n^H e_1 = 1

where e_1 = [1 0 0 ··· 0]^T. The solution is

  η_⋆ = U_n^H e_1 / (e_1^H U_n U_n^H e_1)   ⇒   d_Min-Norm = U_n U_n^H e_1 / (e_1^H U_n U_n^H e_1)

ESPRIT

• Let us partition A = A(θ) into A_1 (all but the last row of A) and A_2 (all but the first row of A).
• Then, for a ULA, we have

  A_2 = A_1 Φ ;   Φ = diag{ e^{i2π(d/λ) sin θ_p} }_{p=1}^{P}      (1)

• Φ conveys the useful information and can be deduced from (A_1, A_2). The latter are unknown, but U_s = A T ⇒ can we find a similar relation for U_s?

ESPRIT

• Let us partition U_s as A, i.e. into U_s1 (all but the last row) and U_s2 (all but the first row). Then

  U_s = A T   ⇒   U_s1 = A_1 T  and  U_s2 = A_2 T = A_1 Φ T
              ⇒   U_s2 = U_s1 T^{−1} Φ T
              ⇒   U_s2 = U_s1 Ψ

• Ψ and Φ share the same eigenvalues, since Φ u = λ u implies that Ψ T^{−1} u = T^{−1} Φ u = λ T^{−1} u.
• It follows that the eigenvalues of Ψ are { e^{i2π(d/λ) sin θ_p} }_{p=1}^{P}.
• In practice, one solves Û_s2 = Û_s1 Ψ in a least-squares sense and computes the eigenvalues of Ψ̂.
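A sketch of the least-squares ESPRIT step above (not from the slides), applied to an exact two-source covariance with arbitrary angles and powers; with a sample covariance the result would only be approximate.

    import numpy as np

    def esprit(R, P, d_over_lambda=0.5):
        """LS-ESPRIT for a ULA: DOAs from the eigenvalues of Psi solving Us1 Psi = Us2."""
        N = R.shape[0]
        _, V = np.linalg.eigh(R)
        Us = V[:, -P:]                                   # P principal eigenvectors (signal subspace)
        Us1, Us2 = Us[:-1], Us[1:]                       # all but last row / all but first row
        Psi = np.linalg.lstsq(Us1, Us2, rcond=None)[0]   # least-squares solution of Us2 = Us1 Psi
        phases = np.angle(np.linalg.eigvals(Psi))
        return np.rad2deg(np.arcsin(phases / (2 * np.pi * d_over_lambda)))

    N = 10
    n = np.arange(N)[:, None]
    A = np.exp(1j * 2 * np.pi * 0.5 * n * np.sin(np.deg2rad([-10.0, 15.0])))
    R = A @ np.diag([10.0, 5.0]) @ A.conj().T + np.eye(N)
    print(np.sort(esprit(R, P=2)))                       # approximately [-10, 15]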

Subspace Fitting

• Since R{U_s} = R{A(θ)}, there exists a full-rank P × P matrix T such that U_s = A(θ) T.
• The idea is to look for the DOAs which minimize the error between the subspaces spanned by Û_s and A(θ):

  (θ̂, T̂) = arg min_{θ, T} ‖Û_s − A(θ) T‖²_W
          = arg min_{θ, T} Tr{ [Û_s − A(θ) T] W [Û_s − A(θ) T]^H }

Subspace Fitting

• There exists a closed-form solution for T, and finally

  θ̂^SSF = arg min_θ Tr{ P_A^⊥(θ) Û_s W Û_s^H }

• Alternative: use the fact that

  R{U_n} = N{A^H(θ)}   ⇒   U_n^H A(θ) = 0

and estimate the angles as

  θ̂^NSF = arg min_θ ‖Û_n^H A(θ)‖²_W

Covariance fitting

• The covariance matrix is given by R(θ, P, σ) = R_s(θ, P) + Q(σ), i.e.

  r = vec(R) = Ψ(θ) P + Σ σ = [Ψ(θ) Σ] [P; σ] ≜ Φ(θ) α

• The parameters are estimated by minimizing the error between R and its estimate R̂:

  (θ̂, α̂) = arg min [r̂ − Φ(θ) α]^H W^{−1} [r̂ − Φ(θ) α]

• The criterion can be concentrated with respect to α: the minimization is then with respect to θ only.

Covariance fitting

• In the case of independent Gaussian distributed snapshots, W_opt = R^T ⊗ R, and covariance matching estimates are asymptotically (i.e. when K → ∞) equivalent to ML estimates.
• In contrast to MLE, there is no need for assumptions on the pdf, only an assumption on R. The criterion is usually simpler to minimize.
• Covariance matching can be used with a full-rank covariance matrix R_s, while subspace methods require the latter to be rank deficient.

Synthesis

Method | Hypotheses | Algorithm | Performance | Problems
ML | distribution | optimization | optimal | computational cost
COMET | R | optimization | ≃ optimal | computational cost
MUSIC | R | EVD | ≃ optimal | coherent signals

Conclusions

• Array processing, thanks to additional degrees of freedom, enables one to perform spatial filtering of signals.
• Adaptive beamforming, possibly with reduced-rank transformations, enables one to achieve high SINR with a fast rate of convergence in adverse conditions (interference, noise).
• Robustness issues are of utmost importance in practical systems and should be given careful attention.
• Non-parametric direction finding methods are simple and robust but may suffer from a lack of resolution.
• Parametric methods offer high resolution, often at the price of degraded robustness.

References

1 H.L. Van Trees, Optimum Array Processing, John Wiley, 2002
2 D.G. Manolakis, V.K. Ingle and S.M. Kogon, Statistical and Adaptive Signal Processing, ch. 11, McGraw-Hill, 2000
3 Y. Hua, A.B. Gershman and Q. Cheng (Eds.), High-Resolution and Robust Signal Processing, Marcel Dekker, 2004
4 J.R. Guerci, Space-Time Adaptive Processing, Artech House, 2003
5 S.A. Vorobyov, "Adaptive and robust beamforming", in A.M. Zoubir, M. Viberg, R. Chellappa, and S. Theodoridis (Eds.), Academic Press Library in Signal Processing, vol. 3, pp. 503-552, Elsevier, 2014
