EE 225a Digital Signal Processing Supplementary Material

Prof. Edward A. Lee, Department of EECS, University of California, Berkeley

1. Allpass Sequences

A sequence $h_a(n)$ is said to be allpass if it has a DTFT that satisfies

$$|H_a(e^{j\omega})| = 1. \tag{1}$$

Note that this is true iff

$$H_a(e^{j\omega}) H_a^*(e^{j\omega}) = 1 \tag{2}$$

which in turn is true iff

$$h_a(n) * h_a^*(-n) = \delta(n) \tag{3}$$

which in turn is true iff

$$H_a(z) H_a^*(1/z^*) = 1. \tag{4}$$

For rational Z transforms, the latter relation implies that poles (and zeros) of $H_a(z)$ are cancelled by zeros (and poles) of $H_a^*(1/z^*)$. In other words, if $H_a(z)$ has a pole at $z = c$, then $H_a^*(1/z^*)$ must have a zero at the same $z = c$, in order for (4) to be possible. Note further that a zero of $H_a^*(1/z^*)$ at $z = c$ is a zero of $H_a(z)$ at $z = 1/c^*$. Consequently, (4) implies that poles (and zeros) have zeros (and poles) in conjugate-reciprocal locations, as shown below:

[Figure: z-plane plot marking a pole at $c$ and a zero at $1/c^*$.]

Intuitively, since the DTFT is the Z transform evaluated on the unit circle, the effect on the magnitude due to any pole will be canceled by the effect on the magnitude of the corresponding zero. Although allpass implies conjugate-reciprocal pole-zero pairs, the converse is not necessarily true unless the appropriate constant multiplier is selected to get unity magnitude. Consider the Z transform of the form

$$H_a(z) = \frac{z^{-1} - c^*}{1 - c z^{-1}}. \tag{5}$$

This has a pole at $c$ and a zero at $1/c^*$, as shown in the above figure. It is easy to verify that it is allpass, because $|e^{-j\omega} - c^*| = |e^{j\omega} - c| = |1 - c e^{-j\omega}|$ on the unit circle. An allpass Z transform may therefore consist of the product of any number of such terms. Any multiplier on (5) whose magnitude is not unity would prevent this from being allpass, by the definition in (1).

Note further that a “pure delay” term $z^{-M}$ for $M > 0$, which is obviously allpass, has $M$ poles at zero and $M$ zeros at infinity, so it automatically satisfies the allpass property. So does a “pure advance” term $z^M$ for $M > 0$. A Z transform of the form

$$H_a(z) = z^M \frac{z^{-N} + a_1^* z^{-N+1} + \cdots + a_N^*}{1 + a_1 z^{-1} + \cdots + a_N z^{-N}} \tag{6}$$

is a product of a pure delay or advance term and terms of form (5), and hence is allpass. Conversely, in order to have pole-zero pairs at conjugate-reciprocal locations, and unity magnitude on the unit circle, any rational allpass Z transform must have form (6). This can also be written

$$H_a(z) = \frac{A(z)}{z^{N-M} A^*(1/z^*)} \tag{7}$$

where

$$A(z) = 1 + a_1^* z + \cdots + a_N^* z^N. \tag{8}$$

Evaluating this at $z = e^{j\omega}$, it is easy to directly verify that (7) is allpass.

Note that there is no need for the allpass filter to be causal for it to be stable. It can have poles at infinity, with corresponding zeros at zero. It can also have poles anywhere outside the unit circle, as long as there is a corresponding zero at the reciprocal-conjugate location.
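The allpass property of form (6) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `allpass_response`, the coefficient values, and the choice $M = 2$ are all hypothetical, and the coefficients are assumed to keep the denominator nonzero on the unit circle.

```python
import numpy as np

def allpass_response(a, M, w):
    """Evaluate form (6) on the unit circle (illustrative helper, not from the notes).

    a : coefficients [a_1, ..., a_N] of the denominator 1 + a_1 z^-1 + ... + a_N z^-N
    M : integer exponent of the pure delay/advance factor z^M
    w : array of frequencies in radians
    """
    coeffs = np.concatenate(([1.0], np.asarray(a, dtype=complex)))  # 1, a_1, ..., a_N
    N = len(coeffs) - 1
    z = np.exp(1j * w)
    # Denominator: 1 + a_1 z^-1 + ... + a_N z^-N
    den = sum(coeffs[i] * z ** (-i) for i in range(N + 1))
    # Numerator: z^-N + a_1^* z^-(N-1) + ... + a_N^*
    num = sum(np.conj(coeffs[i]) * z ** (-(N - i)) for i in range(N + 1))
    return z ** M * num / den

w = np.linspace(-np.pi, np.pi, 1001)
H = allpass_response([0.5 - 0.3j, 0.2j], M=2, w=w)   # hypothetical coefficients
max_dev = np.max(np.abs(np.abs(H) - 1.0))            # deviation of |H| from unity
```

Here `max_dev` should be at the level of floating-point rounding, whatever the sign of `M`, which illustrates that the delay or advance factor does not affect the magnitude.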

2. Minimum-Phase Sequences

Definition: The sequence $h_m(n)$ with rational Z transform $H_m(z)$ is minimum phase if it:
1. is stable,
2. is causal, and
3. has all zeros inside or on the unit circle.

Be careful with zeros at infinity, as introduced for example by an uncanceled $z^{-1}$ term, which make a sequence non-minimum phase. The sequence is strictly minimum phase if it is minimum phase and also has no zeros on the unit circle. An LTI system is said to be minimum phase if its impulse response is a minimum-phase sequence.

Importance: A strictly minimum-phase system has a stable, causal, and minimum-phase inverse. This is easily seen by observing that the poles of the inverse are zeros of the system, and zeros of the inverse are poles of the system.

Fact: Any stable rational Z transform $H(z)$ can be factored into a cascade of a minimum-phase and an allpass Z transform,

$$H(z) = H_a(z) H_m(z). \tag{9}$$

Proof is by construction: Locate the poles and zeros of $H(z)$. Assign all those inside or on the unit circle to $H_m(z)$. For each zero (or pole) outside the unit circle at location $c$, assign to $H_m(z)$ a zero (or pole) at $1/c^*$. Then assign a pole (or zero), to cancel this zero (or pole), at $1/c^*$ to $H_a(z)$, and a zero (or pole) at $c$ to make the filter allpass. In pictures,

[Figure: pole-zero plots illustrating $H(z) = H_a(z) H_m(z)$.]

Then select the constant multipliers so that $H_a(z)$ is allpass. Notice in the above example that $h(n)$ is not real (because the poles and zeros are not in complex-conjugate pairs), and furthermore it is not causal (because if it is stable, the ROC is annular). Consequently, the allpass filter is also not causal.

Key idea: The minimum-phase filter has an inverse $H_m^{-1}(z)$ that has the inverse magnitude frequency response of $H(z)$. If $H(z)$ has no zeros on the unit circle, then $H_m(z)$ will be strictly minimum phase, so $H_m^{-1}(z)$ will be stable, causal, and minimum phase. Hence, the factorization in (9) allows us to invert any magnitude frequency response, as long as there are no zeros on the unit circle.
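As an illustration of the construction, consider the hypothetical one-zero example $H(z) = 1 - c z^{-1}$ with $|c| > 1$ (this example and the value of `c` are assumptions, not taken from the notes). Reflecting the zero to $1/c^*$ gives $H_m(z) = -c\,(1 - z^{-1}/c^*)$, and the allpass factor is form (5) with its pole at $1/c^*$. A minimal numerical sketch:

```python
import numpy as np

c = 1.5 + 0.5j                      # zero of H(z) outside the unit circle (assumed value)
w = np.linspace(-np.pi, np.pi, 512)
zinv = np.exp(-1j * w)              # z^-1 evaluated on the unit circle

H = 1 - c * zinv                               # H(z) = 1 - c z^-1
Hm = -c * (1 - zinv / np.conj(c))              # zero reflected to 1/c*, constant gain -c
Ha = (zinv - 1 / c) / (1 - zinv / np.conj(c))  # form (5) with pole at 1/c*, zero at c

product_error = np.max(np.abs(Hm * Ha - H))              # checks H = Ha * Hm
allpass_error = np.max(np.abs(np.abs(Ha) - 1))           # checks |Ha| = 1
magnitude_error = np.max(np.abs(np.abs(Hm) - np.abs(H))) # checks |Hm| = |H|
```

All three errors should be at rounding level. The gain $-c$ is what the text calls selecting the constant multiplier: without it, the product would differ from $H(z)$ by a constant.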

3. Real-Valued Discrete-Time Fourier Transforms

The DTFT $H(e^{j\omega})$ of a sequence $h(n)$ is real iff

$$H(e^{j\omega}) = H^*(e^{j\omega}) \tag{10}$$

which is true iff $h(n)$ is conjugate-symmetric,

$$h(n) = h^*(-n). \tag{11}$$

Taking Z transforms on both sides, we see that $H(e^{j\omega})$ is real-valued iff

$$H(z) = H^*(1/z^*). \tag{12}$$

Since $H(e^{j\omega})$ is real, the phase modulo $\pi$ is zero, so $h(n)$ is sometimes called a zero-phase sequence. Note, however, that this terminology ignores the ambiguity of multiples of $\pi$.

Equation (12) implies that if there is a zero at $z = c$, then there is also a zero at $z = 1/c^*$. Similarly, if there is a pole at $z = d$, then there is also a pole at $z = 1/d^*$, as shown in the following diagram:

[Figure: z-plane plot marking zeros at $c$ and $1/c^*$ and poles at $d$ and $1/d^*$.]

Note that this is not the same type of symmetry encountered for allpass sequences, which had pole-zero pairs at conjugate-reciprocal locations.

These symmetry relationships also apply to poles and zeros at zero and infinity. A factor $z^M$ for $M \ne 0$ in a Z transform introduces $M$ zeros (or poles) at $z = 0$ without matching zeros (or poles) at $z = \infty$, so a zero-phase Z transform cannot have such a term. Correspondingly, except for the trivial case $H(z) = 1$, a zero-phase sequence is neither causal nor anticausal. Any poles at zero must be matched by an equal number of poles at infinity.

Although a real-valued DTFT implies pole-pole and zero-zero pairs at conjugate-reciprocal locations, note that the converse is not true unless the multiplicative constant is carefully selected. Consider the following example,

$$H(z) = (1 - c z^{-1})(1 - c^* z). \tag{13}$$

This has a zero at $z = c$ and a zero at $z = 1/c^*$. It also has a pole at zero (due to the first factor) and a pole at infinity (due to the second factor). The general form of this example can be written

$$H(z) = A(z) A^*(1/z^*). \tag{14}$$

Evaluating this on the unit circle we see that

$$H(e^{j\omega}) = A(e^{j\omega}) A^*(e^{j\omega}) = |A(e^{j\omega})|^2, \tag{15}$$

which is not only real, but also non-negative. A related example,

$$H(z) = -(1 - c z^{-1})(1 - c^* z) \tag{16}$$

is real, but not non-negative on the unit circle. A product or ratio of some number of terms of forms (13) and (16) will be real on the unit circle, but is not necessarily non-negative.

4. Linear-Phase Sequences

Suppose

$$F(e^{j\omega}) = e^{-j\omega\tau} H(e^{j\omega}) \tag{17}$$

where $H(e^{j\omega})$ is real. Then

$$\angle F(e^{j\omega}) = -\omega\tau \ \text{modulo}\ \pi, \tag{18}$$

so $f(n)$ is called a linear-phase sequence. Again, the terminology ignores the ambiguity of multiples of $\pi$. If $\tau = N$ is an integer, then

$$F(z) = z^{-N} H(z) \tag{19}$$

so that

$$f(n) = h(n - N). \tag{20}$$

This might take a two-sided finite sequence $h(n)$ and make a causal version.

5. Positive Semi-Definite Sequences

The sequence $h(n)$ with DTFT $H(e^{j\omega})$ is said to be positive semi-definite (or non-negative definite) if $H(e^{j\omega})$ is real and

$$H(e^{j\omega}) \ge 0. \tag{21}$$

As with any zero-phase sequence, a pole (or zero) at $c$ implies a pole (or zero) at $1/c^*$. However, now we also have the stronger result that zeros or poles on the unit circle must be double.

Theorem: If the sequence $h(n)$ has a real-valued, non-negative DTFT $H(e^{j\omega})$ and rational Z transform $H(z)$, then zeros or poles on the unit circle must be double.

Since the proof is rather involved, it is worth giving an intuitive argument first. A single pole or zero on the unit circle causes a phase change of $\pi$ when the frequency response moves across it. Consequently, if the frequency response is positive on one side of the lone zero or pole, it will be negative on the other side. Showing this formally is a bit more painful.

Also note that it follows from this theorem that any positive semi-definite $H(z)$ can be factored as in (14). The theorem, as stated, justifies that the poles and zeros can be separated as in (14). Furthermore, if we assign all the poles and zeros that are inside or on the unit circle to the first factor in (14), we see that we can write the factorization as

$$H(z) = H_m(z) H_m^*(1/z^*) \tag{22}$$

where $H_m(z)$ is minimum phase. In the course of proving the above theorem, we will clear out any doubts about the constant multipliers in the above factorization.

Proof: We have already shown that a real DTFT implies that for every pole or zero at $c$ there must be a corresponding pole or zero at the reciprocal conjugate location. This is not sufficient, however, to show that the zeros on the unit circle are double. To do this, factor the Z transform as follows

$$H(z) = H_s(z) H_u(z) \tag{23}$$

where $H_s(z)$ contains all the poles and zeros that are not on the unit circle, and is written in the form

$$H_s(z) = \frac{\prod_{i=1}^{N} (1 - c_i z^{-1})(1 - c_i^* z)}{\prod_{i=1}^{M} (1 - d_i z^{-1})(1 - d_i^* z)}. \tag{24}$$

It is easy to verify that this is real-valued and non-negative on the unit circle. For simplicity, we now assume $H(z)$ has no poles on the unit circle (i.e. it is stable). This assumption is not necessary, but it simplifies the proof. In this case, the remaining factor can be written

$$H_u(z) = C z^L \prod_{i=1}^{K} (1 - a_i z^{-1}) \tag{25}$$

where $z^L$ accounts for all the poles and zeros at zero and infinity not accounted for in (24),

$a_i = e^{j\theta_i}$ accounts for all the zeros on the unit circle, and $C$ is some constant. Noting that each term $(1 - a_i z^{-1})$ accounts for not just a zero at $z = a_i$, but also a pole at $z = 0$, we might observe that these poles need to be balanced with poles at infinity for the DTFT to be real. Since a factor $z$ gives us a pole at infinity and a zero at zero, we could conclude that it is necessary that $L = K/2$. Since $L$ must be an integer, we could conclude immediately that $K$ must be even, so the number of zeros on the unit circle must be even. This is true for any zero-phase sequence, regardless of whether it is positive semi-definite.

The above conclusion can be reinforced by observing that $H(e^{j\omega})$ is real-valued iff (12) is satisfied, or

$$H(z) = H_s(z) H_u(z) = H_s^*(1/z^*) H_u^*(1/z^*) = H^*(1/z^*). \tag{26}$$

It is clear from (24) that

$$H_s(z) = H_s^*(1/z^*) \tag{27}$$

so we need only ensure that

$$H_u(z) = H_u^*(1/z^*). \tag{28}$$

From (25), this requires that

$$H_u(z) = C z^L \prod_{i=1}^{K} (1 - a_i z^{-1}) = C^* z^{-L} \prod_{i=1}^{K} (1 - a_i^* z) = H_u^*(1/z^*). \tag{29}$$

The polynomial on the left has powers of $z$ ranging from $L$ to $L - K$. The one on the right has powers of $z$ ranging from $-L$ to $-L + K$. The only way these two ranges could be the same is if $K = 2L$, which means $K$ is even, as expected.

Showing that the zeros must be double zeros is a bit more involved. First observe that it is necessary

that $H_u(z)$ evaluated on the unit circle be real. Write this as

$$H_u(e^{j\omega}) = C e^{j\omega L} \prod_{i=1}^{2L} \left(1 - e^{j\theta_i} e^{-j\omega}\right). \tag{30}$$

Expanding each term in the product,

$$H_u(e^{j\omega}) = C e^{j\omega L} \prod_{i=1}^{2L} e^{j(\theta_i - \omega)/2} \left(e^{-j(\theta_i - \omega)/2} - e^{j(\theta_i - \omega)/2}\right) \tag{31}$$

which, since each bracketed difference equals $-2j \sin((\theta_i - \omega)/2)$ and $(-2j)^{2L} = 4^L (-1)^L$, leads to

$$H_u(e^{j\omega}) = 4^L C e^{jS/2} (-1)^L \prod_{i=1}^{2L} \sin((\theta_i - \omega)/2) \tag{32}$$

where

$$S = \sum_{i=1}^{2L} \theta_i. \tag{33}$$

For this to be real, it is necessary and sufficient that

$$C = A e^{-jS/2}, \tag{34}$$

for any real-valued constant $A$. For it to be real and non-negative, it is necessary and sufficient that

$$C = A e^{-jS/2} (-1)^L \tag{35}$$

with $A \ge 0$, and that the zeros are double, so that we can write (32) as

$$H_u(e^{j\omega}) = 4^L C e^{jS/2} (-1)^L \prod_{i=1}^{L} \left[\sin((\theta_i - \omega)/2)\right]^2 \tag{36}$$

where we have numbered the zeros so that $\theta_i = \theta_{2L-i+1}$. It should be clear that these conditions are sufficient. To show they are necessary, observe that if we have a zero that is not a double zero, then there will be some region $(\theta_i - \epsilon, \theta_i + \epsilon)$ that crosses only one zero. The term $\sin((\theta_i - \omega)/2)$ in (32) will change signs in this region, so $H_u(e^{j\omega})$ cannot be non-negative on both sides of $\theta_i$. Q.E.D.


6. Filtering Random Processes

For WSS random process $x(n)$ and LTI system $H(z)$,

[Block diagram: $x(n) \to H(z) \to y(n)$.]

Define the cross correlations

$$r_{xy}(m) = E[x(n) y^*(n - m)] \tag{37}$$

and observe that

$$r_{xy}(m) = \sum_k h^*(k) E[x(n) x^*(n - m - k)] = \sum_k h^*(k) r_x(m + k) \tag{38}$$

which we can write

$$r_{xy}(m) = h^*(-m) * r_x(m). \tag{39}$$

Since the ZT of $h^*(-n)$ is $H^*(1/z^*)$, the cross power spectrum follows from this

$$R_{xy}(z) = H^*(1/z^*) R_x(z). \tag{40}$$

The cross correlation is different in the other order,

$$r_{yx}(m) = E[y(n) x^*(n - m)] \tag{41}$$

or

$$r_{yx}(m) = \sum_k h(k) E[x(n - k) x^*(n - m)] = \sum_k h(k) r_x(m - k) \tag{42}$$

or

$$r_{yx}(m) = h(m) * r_x(m). \tag{43}$$

The cross power spectrum in this direction is

$$R_{yx}(z) = H(z) R_x(z). \tag{44}$$

It is easy to show that $y(n)$ is WSS, and to compute its power spectrum:

$$r_y(m) = E\left[\left(\sum_k h(k) x(n - k)\right) \left(\sum_i h(i) x(n - m - i)\right)^*\right] \tag{45}$$

$$= \sum_k h(k) \sum_i h^*(i) r_x(m + i - k) = \sum_k h(k) r_{xy}(m - k) \tag{46}$$

so

$$r_y(m) = h(m) * h^*(-m) * r_x(m). \tag{47}$$

So the output power spectrum is

$$R_y(z) = H(z) H^*(1/z^*) R_x(z). \tag{48}$$
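Relations (47) and (48) can be cross-checked numerically for a short FIR filter. The sketch below is illustrative only: the impulse response `h`, the input autocorrelation `rx`, and the helper `dtft` are hypothetical choices, not taken from the notes. It builds $r_y(m)$ by the double convolution in (47) and compares its DTFT with $|H(e^{j\omega})|^2 R_x(e^{j\omega})$.

```python
import numpy as np

h = np.array([1.0, 0.5 - 0.2j, 0.25j])   # hypothetical h(0), h(1), h(2)
rx = np.array([0.3, 1.0, 0.3])           # hypothetical r_x(-1), r_x(0), r_x(1)
L = len(h) - 1

# Equation (47): r_y(m) = h(m) * h^*(-m) * r_x(m).
# conj(h[::-1]) holds the samples of h^*(-m) for m = -L, ..., 0.
ry = np.convolve(np.convolve(h, np.conj(h[::-1])), rx)
ry_lags = np.arange(-(L + 1), L + 2)     # lags covered by ry

def dtft(seq, lags, w):
    """DTFT of a finite sequence whose samples sit at the given integer lags."""
    return np.exp(-1j * np.outer(w, lags)) @ seq

w = np.linspace(-np.pi, np.pi, 257)
Ry = dtft(ry, ry_lags, w)
H = dtft(h, np.arange(L + 1), w)
Rx = dtft(rx, np.array([-1, 0, 1]), w)

# Equation (48) on the unit circle, where H^*(1/z^*) equals conj(H(e^{jw})).
max_error = np.max(np.abs(Ry - H * np.conj(H) * Rx))
```

The only delicate part is the lag bookkeeping: `np.convolve` returns samples without lag labels, so the starting lag of each convolution result has to be tracked by hand.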

7. Innovations Process & Spectral Factorization

Definition:

[Block diagram: $x(n) \to H(z) \to v(n)$.]

Assumptions: $x(n)$ is WSS and $R_x(z)$ has no poles or zeros on the unit circle.
“Innovations process” $v(n)$: white, $R_v(z) = \sigma_v^2$.
“Whitening filter” $H(z)$: causal, monic, and strictly minimum phase.

Define the “inverse filter”:

[Block diagram: $v(n) \to G(z) \to x(n)$.]

where $G(z) = H^{-1}(z)$. Since $v(n)$ is white, samples are uncorrelated, and hence in some sense each one brings new information about the random process $x(n)$. When such a representation exists, where $G(z)$ is stable and causal, then $x(n)$ is called a linear process. The power spectrum of $x(n)$ can be written

$$R_x(z) = \sigma_v^2 G(z) G^*(1/z^*). \tag{49}$$

This is called the “canonical spectral factorization” of $R_x(z)$.

Knowing the whitening filter and the power of its output implies knowing the power spectrum, because it implies knowing $G(z) = H^{-1}(z)$, and

$$R_x(e^{j\omega}) = \sigma_v^2 |G(e^{j\omega})|^2. \tag{50}$$

The “standard normalization” results from constraining $G(z)$ and $H(z)$ to be monic, i.e.

$$g(0) = 1 \ \text{and}\ h(0) = 1. \tag{51}$$

Since these are causal,

$$G(\infty) = 1 \ \text{and}\ H(\infty) = 1. \tag{52}$$

This defines the value $\sigma_v^2$ unambiguously. For rational polynomial forms,

$$H(z) = \frac{\prod_i (1 - b_i z^{-1})}{\prod_i (1 - a_i z^{-1})}. \tag{53}$$

The key result: Given any rational power spectrum without poles or zeros on the unit circle, there exists a canonical spectral factorization, and hence a monic, minimum-phase whitening filter, and a monic, minimum-phase synthesis filter. This is a form of Wold’s theorem.

The proof is by construction. $G(z)$ gets the poles and zeros inside the unit circle, leaving the ones outside for $G^*(1/z^*)$. E.g.:

[Figure: pole-zero plot showing the poles and zeros inside the unit circle assigned to $G(z)$ and those outside assigned to $G^*(1/z^*)$.]

$R_x(z)$ is positive semi-definite, so zeros on the unit circle are double, and create no difficulties.
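The construction can be sketched in code for the special case of a moving-average spectrum, where $R_x(z)$ has only zeros (plus poles at zero and infinity). This is a minimal illustration under stated assumptions: the function name `spectral_factor_ma` is hypothetical, the input is the autocorrelation $r_x(-q), \ldots, r_x(q)$, and no zero lies on the unit circle. The zeros of $z^q R_x(z)$ are found with `np.roots`, the ones inside the unit circle go to $G(z)$, and $\sigma_v^2$ follows from $r_x(0) = \sigma_v^2 \sum_k |g(k)|^2$.

```python
import numpy as np

def spectral_factor_ma(r):
    """Canonical spectral factorization of an MA(q) power spectrum (illustrative sketch).

    r : autocorrelation samples [r_x(-q), ..., r_x(0), ..., r_x(q)]
    Returns (g, sigma_v2), where g = [1, g_1, ..., g_q] are the coefficients of the
    monic minimum-phase G(z) = 1 + g_1 z^-1 + ... + g_q z^-q.
    Assumes R_x(z) has no zeros on the unit circle.
    """
    r = np.asarray(r, dtype=complex)
    q = (len(r) - 1) // 2
    zeros = np.roots(r)                    # zeros of z^q R_x(z), highest power first
    inside = zeros[np.abs(zeros) < 1]      # zeros assigned to G(z)
    g = np.poly(inside)                    # monic polynomial with those zeros
    sigma_v2 = (r[q] / np.sum(np.abs(g) ** 2)).real   # from r_x(0) = sigma_v2 * sum |g|^2
    return g, sigma_v2

# x(n) = u(n) + 2 u(n-1) with unit-variance white u(n): r_x = [2, 5, 2].
g, sigma_v2 = spectral_factor_ma([2.0, 5.0, 2.0])
```

For this input the generating filter $1 + 2z^{-1}$ is not minimum phase, and the factorization instead returns $G(z) = 1 + 0.5 z^{-1}$ with $\sigma_v^2 = 4$, which produces the same power spectrum.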

8. Power of the Innovations Process

Fact: the power of the innovations process equals the geometric mean of the power spectrum:

$$\sigma_v^2 = \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln[R_x(\omega)]\, d\omega \right\}. \tag{54}$$

This can be recognized as a geometric mean by considering the limiting case of

$$\sqrt[N]{\prod_{n=1}^{N} x_n} = \exp\left( \frac{1}{N} \sum_{n=1}^{N} \ln x_n \right), \tag{55}$$

where the sum becomes an integral. From spectral factorization we can write

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} \ln[R_x(\omega)]\, d\omega = \ln(\sigma_v^2) + \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln|G(e^{j\omega})|^2\, d\omega. \tag{56}$$

We need to show that the second term is zero, given that $G(z)$ is monic and minimum phase. Recall that the complex logarithm of a complex number $a$ is

$$\ln(a) = \ln|a| + j(\angle a + 2k\pi) \tag{57}$$

for any integer $k$. The principal value uses $k = 0$. This expression for the complex logarithm is easy to understand from the form

$$a = |a| e^{j(\angle a + 2k\pi)}. \tag{58}$$

We do not always need or want the principal value. It is easy to see that we can always pick appropriate values of each of the logarithms (not necessarily the principal values) such that

$$\ln(ab) = \ln(a) + \ln(b). \tag{59}$$

Hence, whenever we write such a relation, we insist that compatible values of the multivalued logarithms be used. Consider

$$\ln\left[G(z) G^*(1/z^*)\right] = \ln[G(z)] + \ln\left[G^*(1/z^*)\right]. \tag{60}$$

Note that $G(z)$ has all poles and zeros inside or on the unit circle. Hence, it is analytic and nonzero outside the unit circle. This makes $\ln[G(z)]$ analytic outside the unit circle, which implies that there exists a causal sequence $c(k)$ such that

$$\ln[G(z)] = \sum_{k=0}^{\infty} c(k) z^{-k}. \tag{61}$$

This sequence is called the complex cepstrum of $g(n)$. Although $c(k)$ is not necessarily stable (it is unstable if $G(z)$ has zeros on the unit circle), it is nonetheless bounded (proven below). Hence,

$$c(0) = \ln[G(\infty)]. \tag{62}$$

Since $G(z)$ is monic and causal, $G(\infty) = 1$ so $c(0) = 0$. Furthermore, $c(0)$ is the inverse DTFT of $\ln[G(e^{j\omega})]$ evaluated at $n = 0$, so

$$c(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln[G(e^{j\omega})]\, d\omega = 0. \tag{63}$$

Note that the imaginary part of this integral is always zero for a real random process $x(n)$ as long as we choose values of the multi-valued complex logarithm that are conjugate symmetric about $\omega = 0$. We require such a choice.

Use a similar argument for the integral of $\ln G^*(1/z^*)$, observing that $G^*(1/z^*)$ is anticausal and monic, to get

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} \ln[G^*(e^{j\omega})]\, d\omega = 0. \tag{64}$$

Combining these two zero-valued integrals,

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} \ln|G(e^{j\omega})|^2\, d\omega = 0. \tag{65}$$

Hence,

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} \ln[R_x(\omega)]\, d\omega = \ln(\sigma_v^2) \tag{66}$$

and the desired result follows.

The only loose end is to prove that the complex cepstrum $c(n)$ is bounded even when it is not stable (i.e. when $G(z)$ has zeros on the unit circle). We will show this only for rational polynomial power spectra, for which $G(z)$ can be written as

$$G(z) = \frac{\prod_{i=1}^{N} (1 - a_i z^{-1})}{\prod_{i=1}^{M} (1 - b_i z^{-1})}, \tag{67}$$

where $|a_i| \le 1$ and $|b_i| < 1$. For this form,

$$\ln(G(z)) = \sum_{i=1}^{N} \ln(1 - a_i z^{-1}) - \sum_{i=1}^{M} \ln(1 - b_i z^{-1}) \tag{68}$$

where again we must use compatible values for the complex logarithm.

Consider each term $\ln(1 - a_i z^{-1})$. It can be written as a power series expansion

$$\ln(1 - a_i z^{-1}) = -\sum_{n=1}^{\infty} \frac{(a_i z^{-1})^n}{n}, \tag{69}$$

valid for $|a_i z^{-1}| < 1$, or $|z| > |a_i|$. From this expression we can recognize the inverse Z transform,

$$ZT^{-1}\left[\ln(1 - a_i z^{-1})\right] = \begin{cases} 0; & n \le 0 \\ -\dfrac{a_i^n}{n}; & n > 0 \end{cases} \tag{70}$$

Note that although this is not absolutely summable if $|a_i| = 1$ (and hence is not stable), it is bounded for all $n$. Furthermore,

$$ZT^{-1}\left[\ln(G(z))\right] = c(n) = \begin{cases} 0; & n \le 0 \\ -\sum\limits_{i=1}^{N} \dfrac{a_i^n}{n} + \sum\limits_{i=1}^{M} \dfrac{b_i^n}{n}; & n > 0 \end{cases} \tag{71}$$

is bounded for all $n$. A similar argument can be applied to show that the anticausal sequence

$$ZT^{-1}\left[\ln\left(G^*(1/z^*)\right)\right] \tag{72}$$

is also bounded.
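The geometric-mean formula (54) is easy to test numerically. The sketch below assumes a hypothetical first-order example, $G(z) = 1 + 0.5 z^{-1}$ with $\sigma_v^2 = 4$ (values chosen for illustration, not from the notes), builds $R_x$ from (50), and approximates the integral in (54) by an average over a uniform frequency grid.

```python
import numpy as np

sigma_v2 = 4.0                           # assumed innovations power
g = np.array([1.0, 0.5])                 # monic, minimum-phase G(z) = 1 + 0.5 z^-1
N = 4096
w = 2 * np.pi * np.arange(N) / N - np.pi # uniform grid over one period
G = g[0] + g[1] * np.exp(-1j * w)
Rx = sigma_v2 * np.abs(G) ** 2           # equation (50)

# Riemann-sum version of (54): exp of the average of ln R_x over one period.
estimate = np.exp(np.mean(np.log(Rx)))

# The arithmetic mean of R_x is r_x(0), which is never smaller than sigma_v2.
arithmetic_mean = np.mean(Rx)
```

With these values `estimate` should recover 4 to within rounding, while `arithmetic_mean` is $r_x(0) = 4 \cdot 1.25 = 5$. The gap between the two reflects how far the spectrum is from flat.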


9. Wiener Filters

[Block diagram: the signal $s(n)$ plus the noise $w(n)$ gives $x(n)$, which passes through the LTI filter $f(n)$ (our design) to give $y(n)$. The error $e(n)$ is the difference between the desired signal $d(n)$ and $y(n)$.]

Objective is to minimize the power in $e(n)$ (MSE criterion).

Example applications:
1. $d(n) = s(n)$ (try to minimize the noise)
2. $d(n) = s(n + m)$ (try to predict $s(n)$)
3. $d(n) = x(n + m)$ (try to predict $x(n)$)

Assumption:
• $x(n)$ and $d(n)$ are jointly WSS ($E[x(n) d^*(n - m)]$ is independent of $n$).

Constraint:
• $f(n)$ is usually, but not always, constrained to be causal.

10. Wiener-Hopf Equations

Minimize the mean-squared error (MSE):

$$\varepsilon = E\left[|d(n) - y(n)|^2\right]. \tag{73}$$

Expanding this, with $y(n) = \sum_k f(k) x(n - k)$,

$$\varepsilon = E[d(n) d^*(n)] \tag{74}$$

$$- \sum_k f^*(k) E[d(n) x^*(n - k)] - \sum_k f(k) E[d^*(n) x(n - k)] \tag{75}$$

$$+ \sum_k \sum_m f(m) f^*(k) E[x(n - m) x^*(n - k)]. \tag{76}$$

Hence

$$\varepsilon = r_d(0) - 2\,\mathrm{Re}\left\{ \sum_k f^*(k) r_{dx}(k) \right\} + \sum_k \sum_m f(m) f^*(k) r_x(k - m). \tag{77}$$

This is a real-valued quadratic function of the possibly complex variables $f(i)$ for all $i$. This suggests that by taking the partial derivative of $\varepsilon$ with respect to each $f(i)$, and setting this to zero, we can find the values of $f(i)$ that minimize $\varepsilon$. However, this derivative in general does not exist. In particular, $df^*/df$ does not exist anywhere. However, note that $\varepsilon$ is a real-valued quadratic function of the real and imaginary parts $f_R(i)$ and $f_I(i)$ of $f(i)$ for all $i$. Hence, we can take the partial derivatives with respect to each of these real-valued variables and set those to zero. The values of $f(i)$ that make the partial derivatives zero yield a minimum $\varepsilon$ if the function is unimodal. We assume for now that it is.

Instead of setting the partial derivatives to zero, we can equivalently set the following complex gradients (see footnote 1) to zero,

$$\nabla_{f(i)} \varepsilon = \frac{1}{2} \left( \frac{\partial}{\partial f_R(i)} - j \frac{\partial}{\partial f_I(i)} \right) \varepsilon = 0 \tag{78}$$

and

$$\nabla_{f^*(i)} \varepsilon = \frac{1}{2} \left( \frac{\partial}{\partial f_R(i)} + j \frac{\partial}{\partial f_I(i)} \right) \varepsilon = 0. \tag{79}$$

Footnote 1. Note that the complex gradient has other definitions, such as $\nabla_f \varepsilon = \left( \frac{\partial}{\partial f_R} + j \frac{\partial}{\partial f_I} \right) \varepsilon$. This would serve equally well for our optimization problem and is used in [S. M. Kay, Modern Spectral Estimation, Prentice-Hall, 1988] and [S. Haykin, Adaptive Filter Theory, 2nd Ed., Prentice-Hall, 1991]. Our definition is used in [C. W. Therrien, Discrete Random Signals and Statistical Signal Processing, Prentice-Hall, 1992] and [D. H. Brandwood, “A Complex Gradient Operator and its Application in Adaptive Array Theory,” IEE Proceedings, 130(1), pp. 11-16, February 1983]. The key advantage of our definition is that if the function being differentiated is an analytic function of $f$, then $\partial\varepsilon/\partial f = \nabla_f \varepsilon$.

Equivalence follows from the observation that

$$\nabla_{f^*(i)} \varepsilon + \nabla_{f(i)} \varepsilon = \frac{\partial \varepsilon}{\partial f_R(i)} \tag{80}$$

and

$$\nabla_{f^*(i)} \varepsilon - \nabla_{f(i)} \varepsilon = j \frac{\partial \varepsilon}{\partial f_I(i)}. \tag{81}$$

With our definition of the complex gradient, it is easy to verify that

$$\nabla_f f^* = 0, \quad \nabla_f f = 1, \quad \text{and} \quad \nabla_f f f^* = f^* \tag{82}$$

and

$$\nabla_{f^*} f^* = 1, \quad \nabla_{f^*} f = 0, \quad \text{and} \quad \nabla_{f^*} f f^* = f. \tag{83}$$

Using these we get

$$\nabla_{f^*(i)} \varepsilon = -r_{dx}(i) + \sum_k f(k) r_x(i - k). \tag{84}$$

It is easy to show that

$$\nabla_{f(i)} \varepsilon = \left( \nabla_{f^*(i)} \varepsilon \right)^* \tag{85}$$

(using the conjugate symmetry of $r_x(m)$), so it is sufficient to set only one of the two gradients to zero (either one). Setting $\nabla_{f^*(i)} \varepsilon = 0$ we get the Wiener-Hopf Equations:

$$\sum_k f(k) r_x(i - k) = r_{dx}(i), \ \text{for all } i. \tag{86}$$

A solution to these equations will minimize $\varepsilon$. We can easily modify these equations to constrain the filter $f(n)$ to be causal by just modifying the limits on the summation,

$$\sum_{k=0}^{\infty} f(k) r_x(i - k) = r_{dx}(i), \ \text{for } i \ge 0. \tag{87}$$
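A direct way to experiment with (87) is to truncate the filter to $M$ taps, which turns the equations into an $M \times M$ linear system. The following is a sketch under that truncation assumption (the notes themselves treat the infinite-length case). The function `fir_wiener` and the autocorrelation $r_x(m) = 0.5^{|m|}$ are hypothetical choices made for illustration.

```python
import numpy as np

def fir_wiener(rx, rdx, M):
    """Solve sum_{k=0}^{M-1} f(k) r_x(i-k) = r_dx(i), i = 0..M-1 (illustrative sketch).

    rx, rdx : callables returning the correlation at an integer lag
    M       : number of filter taps (a truncation of the causal filter in (87))
    """
    R = np.array([[rx(i - k) for k in range(M)] for i in range(M)], dtype=complex)
    r = np.array([rdx(i) for i in range(M)], dtype=complex)
    return np.linalg.solve(R, r)

rx = lambda m: 0.5 ** abs(m)          # hypothetical autocorrelation of x(n)

# Case d(n) = x(n): r_dx(i) = r_x(i), and the solution is a unit impulse.
f_identity = fir_wiener(rx, rx, M=3)

# Case d(n) = x(n+1), one-step prediction: r_dx(i) = r_x(i+1).
f_predict = fir_wiener(rx, lambda m: rx(m + 1), M=3)
```

For this particular autocorrelation the predictor comes out as a single tap of value 0.5, because $r_x(i + 1) = 0.5\, r_x(i)$ for $i \ge 0$. `np.linalg.solve` requires the correlation matrix to be nonsingular, which holds here.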

11. Some Easy Examples:

1. Suppose $d(n) = x(n)$. Then $r_{dx}(i) = r_x(i)$. The Wiener-Hopf equations are satisfied by

$$f(n) = \delta(n). \tag{88}$$

Interpretation:

[Block diagram: $x(n)$ passes through the filter $\delta(n)$, and the output is compared with $d(n)$ to form $e(n)$.]

2. Suppose $d(n) = s(n) = x(n) - w(n)$, where $w(n)$ is white and independent of $x(n)$ (a very strange situation). Again $r_{dx}(i) = r_x(i)$, so the solution is the same. Interpretation:

[Block diagram: the signal $s(n)$ plus the noise $w(n)$ gives $x(n)$, which passes through the LTI filter $f(n)$ to give $y(n)$. The output is compared with $d(n)$ to form $e(n)$.]

Because $x(n)$ is independent of the noise $w(n)$, there is nothing the filter can do to remove it. So the solution is the same as before. But a more realistic scenario would be for $w(n)$ to be independent of $s(n)$.

3. Suppose $x(n)$ is white, i.e. $r_x(k) = \sigma^2 \delta(k)$. Then the Wiener-Hopf equations are satisfied by

$$f(n) = \frac{r_{dx}(n)}{\sigma^2} u(n), \tag{89}$$

where $u(n)$ is the unit step function (so this filter is causal).

Knowing the joint statistics of $x(n)$ and $d(n)$, we can find the MMSE filter that estimates $d(n)$ from $x(n)$. The solution is easy when $x(n)$ is white. For other situations, we can use a whitening filter to construct a general solution.

12. Non-Causal Wiener Filter Solution

Equation (86), which has no causality constraint on the Wiener filter, can be written

$$f(i) * r_x(i) = r_{dx}(i). \tag{90}$$

Taking Z transforms and solving for $F(z)$ we get

$$F(z) = \frac{R_{dx}(z)}{R_x(z)}. \tag{91}$$

This is sometimes called the unrealizable Wiener filter, but we prefer to call it the non-causal Wiener filter.
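As a concrete instance of the non-causal solution, consider a scenario that is assumed here purely for illustration: $x(n) = s(n) + w(n)$ with $d(n) = s(n)$, where $s(n)$ is a first-order AR process driven by white noise of power $\sigma_u^2$ and $w(n)$ is white with power $\sigma_w^2$, uncorrelated with $s(n)$. Then $R_{dx} = R_s$ and $R_x = R_s + \sigma_w^2$, and the filter can be evaluated on the unit circle. All parameter values below are hypothetical.

```python
import numpy as np

a, sigma_u2, sigma_w2 = 0.8, 1.0, 0.5     # hypothetical AR coefficient and noise powers
w = np.linspace(-np.pi, np.pi, 513)

# Power spectrum of the AR(1) signal s(n): sigma_u2 / |1 - a e^{-jw}|^2.
Rs = sigma_u2 / np.abs(1 - a * np.exp(-1j * w)) ** 2

Rdx = Rs                # d(n) = s(n) and w(n) uncorrelated with s(n)
Rx = Rs + sigma_w2      # signal and noise spectra add
F = Rdx / Rx            # non-causal Wiener filter on the unit circle

F_dc = F[len(w) // 2]   # value at w = 0, where the signal is strongest
```

Under these assumptions the frequency response is real and lies strictly between 0 and 1: it passes frequencies where the signal dominates and attenuates those where the noise dominates. Because the response is real-valued, the impulse response is two-sided, which is why this filter is not causal.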

13. Causal Wiener Filtering, Rational Power Spectra

The causality constraint in (87) complicates things considerably. Conceptually break the filter into two parts:

[Block diagram: the signal $s(n)$ plus the noise $w(n)$ gives $x(n)$. The filter $f(n)$ is split into the cascade of $h(n)$, whose output is $v(n)$, and $f'(n)$, whose output is $y(n)$. The output is compared with $d(n)$ to form $e(n)$.]

where $h(n)$ is the whitening filter for $x(n)$, $v(n)$ is the innovations process, and $f'(n)$ is the Wiener filter designed for the white input. I.e.

$$f'(n) = \frac{r_{dv}(n)}{\sigma_v^2} u(n). \tag{92}$$

Notation:

$$[R_{dv}(z)]_+ = \sum_{k=0}^{\infty} r_{dv}(k) z^{-k}. \tag{93}$$

(I.e., the Z transform of the nonnegative side only). With this

$$F'(z) = \frac{1}{\sigma_v^2} [R_{dv}(z)]_+. \tag{94}$$

A more convenient form would express $F'(z)$ as a function of $R_{dx}(z)$, which is assumed to be known.

14. Derivation of the General Solution

Note that since

$$v(n) = \sum_{k=0}^{\infty} h(k) x(n - k), \tag{95}$$

we have

$$r_{dv}(m) = E[d(i) v^*(i - m)] \tag{96}$$

$$= \sum_{k=0}^{\infty} h^*(k) E[d(i) x^*(i - m - k)] \tag{97}$$

$$= \sum_{k=0}^{\infty} h^*(k) r_{dx}(m + k). \tag{98}$$

This relation can be written

$$r_{dv}(-m) = h^*(m) * r_{dx}(-m). \tag{99}$$

Taking Z transforms on both sides,

$$R_{dv}(1/z) = H^*(z^*) R_{dx}(1/z) \tag{100}$$

or

$$R_{dv}(z) = H^*(1/z^*) R_{dx}(z). \tag{101}$$

Hence, the overall Wiener filter solution is

$$F(z) = H(z) F'(z) = \frac{1}{\sigma_v^2} H(z) \left[ H^*(1/z^*) R_{dx}(z) \right]_+. \tag{102}$$

15. A More Involved Example

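Assume in the following that $u(n)$ is white noise, or $R_u(z) = \sigma_u^2$:

[Block diagram: $u(n)$ drives the filter $1/(1 - a z^{-1})$ to produce $s(n)$. The signal $s(n)$ plus the noise $w(n)$ gives $x(n)$, which passes through $f(n)$ to give $y(n)$. The output is compared with $d(n)$ to form $e(n)$.]

Assume further that $w(n)$ is white noise ($R_w(z) = \sigma_w^2$), uncorrelated with $s(n) = d(n)$. In words, we are given noisy observations of a first-order AR process, and need to design a Wiener filter to remove the noise as much as possible.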


16. Linear Equalization

Scenario:

[Block diagram: $d(n) \to C(z) \to x(n) \to F(z) \to y(n)$, with $e(n)$ formed by comparing $y(n)$ with $d(n)$.]

The “channel” $C(z)$ is assumed to be causal and stable. The “data” $d(n)$ is assumed to be white with power spectrum $\sigma_d^2$. Our task is to design the Wiener filter $F(z)$ to minimize the power in $e(n)$. The general solution is

$$F(z) = \frac{1}{\sigma_v^2} H(z) \left[ H^*(1/z^*) R_{dx}(z) \right]_+ \tag{103}$$

where $\sigma_v^2$ is the power of the innovations process for $x(n)$, and $H(z)$ is the whitening filter. To get a more specific result, observe that we can factor $C(z)$ into an allpass and a minimum-phase component,

$$C(z) = C_m(z) C_a(z), \tag{104}$$

where by definition we scale these so that

$$|C_a(e^{j\omega})| = 1. \tag{105}$$

The whitening filter (which by definition is monic) is

$$H(z) = K C_m^{-1}(z), \tag{106}$$

where $K$ is the constant needed to make it monic. The innovations process is generated by the system:

[Block diagram: $d(n) \to C(z) \to x(n) \to H(z) \to v(n)$.]

Note that since $C(z) H(z) = C_a(z) K$, where $C_a(z)$ is allpass, then

$$\sigma_v^2 = \sigma_d^2 |K|^2. \tag{107}$$

Moreover

$$R_{dx}(z) = C^*(1/z^*) R_d(z) = \sigma_d^2\, C_a^*(1/z^*)\, C_m^*(1/z^*). \tag{108}$$

Putting this all together, plugging (106), (107), and (108) into (103) we get

$$F(z) = C_m^{-1}(z) \left[ C_a^*(1/z^*) \right]_+. \tag{109}$$

If $C(z)$ happens to be minimum phase, i.e. $C(z) = C_m(z)$ and $C_a(z) = 1$, then intuitively we expect $F(z) = C^{-1}(z)$, which is exactly what we get. If it is not minimum phase, however, then we have the term $\left[ C_a^*(1/z^*) \right]_+$. Note that the inverse Z transform of $C_a^*(1/z^*)$ is $c_a^*(-n)$. But since $C_a(z)$ is causal, the non-negative time part of this consists of exactly one sample, $c_a^*(0)\delta(n)$. Hence

$$\left[ C_a^*(1/z^*) \right]_+ = c_a^*(0) \tag{110}$$

and we can write the final solution as

$$F(z) = c_a^*(0)\, C_m^{-1}(z). \tag{111}$$

As a specific example, suppose

$$C(z) = \frac{1 - c z^{-1}}{1 - p z^{-1}} \tag{112}$$

where $|p| < 1$, so the system is stable, but $|c| > 1$, so the system is not minimum phase. In this case,

$$C_m(z) = \frac{-c^* + z^{-1}}{1 - p z^{-1}} \tag{113}$$

and

$$C_a(z) = \frac{1 - c z^{-1}}{-c^* + z^{-1}}. \tag{114}$$

To verify this, observe that the product equals $C(z)$, that $C_a(z)$ has unit magnitude on the unit circle, and that $C_m(z)$ is minimum phase. To find $c_a^*(0)$ we can use the initial value theorem, evaluating $C_a^*(1/z^*)$ at $z = 0$ (because it is anti-causal). This is

$$\left. C_a^*(1/z^*) \right|_{z=0} = \left. \frac{1 - c^* z}{-c + z} \right|_{z=0} = -\frac{1}{c}. \tag{115}$$

Plugging this into (111) we get

$$F(z) = -\frac{1}{c} \cdot \frac{1 - p z^{-1}}{-c^* + z^{-1}} = \frac{1}{|c|^2} \cdot \frac{1 - p z^{-1}}{1 - (c^*)^{-1} z^{-1}}. \tag{116}$$

We see that instead of inverting the zero outside the unit circle, we invert its conjugate reciprocal. Furthermore, the closer $c$ gets to the unit circle, the closer the solution gets to the solution when $C(z)$ is minimum phase. Moreover, the further $c$ gets from the unit circle, the closer the Wiener filter gets to zero. This latter result is intuitive because the filter is getting further from minimum phase, so the inverse of the minimum-phase part is less useful.
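The specific example can be verified numerically on the unit circle. The sketch below uses hypothetical values of $c$ and $p$ (chosen only to satisfy $|c| > 1$ and $|p| < 1$). It checks that the two factors multiply back to $C(z)$, that the allpass factor has unit magnitude, and that the solution $c_a^*(0)\, C_m^{-1}(z)$ agrees with a closed form whose only pole sits at the conjugate reciprocal $1/c^*$.

```python
import numpy as np

c, p = 2.0 + 1.0j, 0.5                  # hypothetical zero (outside) and pole (inside)
w = np.linspace(-np.pi, np.pi, 512)
zi = np.exp(-1j * w)                    # z^-1 on the unit circle

C = (1 - c * zi) / (1 - p * zi)                 # channel
Cm = (-np.conj(c) + zi) / (1 - p * zi)          # minimum-phase factor
Ca = (1 - c * zi) / (-np.conj(c) + zi)          # allpass factor

ca0_conj = -1 / c                               # c_a^*(0), from evaluating at z = 0
F_from_factors = ca0_conj / Cm                  # c_a^*(0) * Cm^{-1}(z)

# Closed form: gain 1/|c|^2, zero at p, pole at the conjugate reciprocal 1/c*.
F_closed = (1 / abs(c) ** 2) * (1 - p * zi) / (1 - zi / np.conj(c))

factor_error = np.max(np.abs(Cm * Ca - C))
allpass_error = np.max(np.abs(np.abs(Ca) - 1))
solution_error = np.max(np.abs(F_from_factors - F_closed))
```

All three errors should be at rounding level. The gain $1/|c|^2$ also makes the last remark in the text concrete: as $|c|$ grows, the filter shrinks toward zero.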

17. Normal Equations

Let f be a vector of dimension M, representing the taps of an FIR Wiener filter (i.e. the first M values of the causal impulse response). For this filter to be optimal, the taps must satisfy the normal equations, * * Rxf = rdx . (117) Assuming that these equations result from minimizing a unimodal quadratic function, it follows that there must be at least one solution for f . This conclusion can be reinforced algebraically by * proving that rdx is always in the range space of Rx . We write this * rdx Î Â()Rx , (118)

where Â()Rx is simply the set of vectors that can be written as Rxp for some p.

Consider a vector q in the nullspace of Rx , meaning that Rxq = 0 , where 0 is the zero vector. This is written

PROF. EDWARD A. LEE, DEPARTMENT OF EECS, UNIVERSITY OF CALIFORNIA, BERKELEY EE 225A — DIGITAL SIGNAL PROCESSING — SUPPLEMENTARY MATERIAL 27

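$$\mathbf{q} \in \mathcal{N}(\mathbf{R}_x). \tag{119}$$

For this vector,

$$\mathbf{q}^H \mathbf{R}_x \mathbf{q} = 0. \tag{120}$$

Letting $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-M+1)]^T$, we can write

$$\mathbf{R}_x = E[\mathbf{x}(n)\mathbf{x}^H(n)]. \tag{121}$$

Hence

$$\mathbf{q}^H E[\mathbf{x}(n)\mathbf{x}^H(n)]\,\mathbf{q} = 0 \tag{122}$$

or

$$E[\mathbf{q}^H \mathbf{x}(n)\mathbf{x}^H(n)\mathbf{q}] = 0. \tag{123}$$

Noticing that this is the product of two complex scalars that are complex conjugates of one another, we can write it as

$$E\big[|\mathbf{q}^H \mathbf{x}(n)|^2\big] = 0. \tag{124}$$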

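From this we infer that $\mathbf{q}^H \mathbf{x}(n) = 0$ with probability one. In words, with probability one, an outcome of the random vector $\mathbf{x}(n)$ will be orthogonal to any vector $\mathbf{q}$ in the nullspace of $\mathbf{R}_x$. Now, note that

$$\mathbf{r}_{dx}^* = E[d^*(n)\,\mathbf{x}(n)] \tag{125}$$

so that

$$\mathbf{q}^H \mathbf{r}_{dx}^* = E[d^*(n)\,\mathbf{q}^H \mathbf{x}(n)]. \tag{126}$$

But since $\mathbf{q}^H \mathbf{x}(n) = 0$ with probability one, this inner product is zero, so $\mathbf{r}_{dx}^*$ is orthogonal to any vector $\mathbf{q}$ in the nullspace of $\mathbf{R}_x$. The following two facts from linear algebra let us infer directly from this that $\mathbf{r}_{dx}^*$ is in the range space of $\mathbf{R}_x$. These facts use the property that $\mathbf{R}_x = \mathbf{R}_x^H$:

$$\mathcal{R}(\mathbf{R}_x) \perp \mathcal{N}(\mathbf{R}_x) \quad \text{and} \quad \mathcal{R}(\mathbf{R}_x) \oplus \mathcal{N}(\mathbf{R}_x) = \mathbb{C}^M. \tag{127}$$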
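A small numerical experiment can make the argument concrete. The sketch below is an illustration under assumed conditions, not part of the derivation: it takes $M = 2$ and a hypothetical input that is a single complex sinusoid with random amplitude and phase, so that the autocorrelation matrix is singular. The names `OMEGA`, `V`, `draw_outcome`, and the model for `d` are all made up for the example, and expectations are replaced by sample averages.

```python
import cmath
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

OMEGA = 0.7  # hypothetical sinusoid frequency, chosen only for illustration
V = [1.0 + 0.0j, cmath.exp(-1j * OMEGA)]  # every outcome x(n) is a multiple of V


def draw_outcome():
    """Return one outcome (x, d) with x = [x(n), x(n-1)]^T.

    x(n) is a complex sinusoid with random amplitude and phase, so
    x(n-1) = exp(-j*OMEGA) * x(n). d(n) is a scaled copy of x(n) plus noise.
    """
    x0 = random.gauss(0.0, 1.0) * cmath.exp(1j * random.uniform(0.0, 2.0 * math.pi))
    x = [x0 * V[0], x0 * V[1]]
    d = 0.5 * x0 + complex(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0))
    return x, d


K = 2000  # number of outcomes averaged
R = [[0j, 0j], [0j, 0j]]  # sample estimate of R_x = E[x x^H]
r_conj = [0j, 0j]          # sample estimate of r_dx^* = E[d^* x]
for _ in range(K):
    x, d = draw_outcome()
    for i in range(2):
        r_conj[i] += d.conjugate() * x[i] / K
        for j in range(2):
            R[i][j] += x[i] * x[j].conjugate() / K

# q is in the nullspace of R because q^H V = 0.
q = [1.0 + 0.0j, -cmath.exp(-1j * OMEGA)]
R_q = [sum(R[i][j] * q[j] for j in range(2)) for i in range(2)]
nullspace_residual = max(abs(value) for value in R_q)

# Inner product q^H r_dx^*, which the argument says should vanish.
inner_product = sum(q[i].conjugate() * r_conj[i] for i in range(2))

# R is singular, yet R g = r_dx^* still has a solution; one of them is a
# multiple of V (here g plays the role of f^* in the normal equations).
scale = r_conj[0] / (R[0][0] * 2.0)  # 2.0 is V^H V
g = [scale * V[0], scale * V[1]]
R_g = [sum(R[i][j] * g[j] for j in range(2)) for i in range(2)]
solution_residual = max(abs(R_g[i] - r_conj[i]) for i in range(2))

print(nullspace_residual, abs(inner_product), solution_residual)
```

All three printed values should be at rounding level: `q` is annihilated by the estimated matrix, the cross-correlation vector is orthogonal to `q`, and the normal equations have a solution even though the matrix cannot be inverted.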


18. Geometric Series

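$$\sum_{m=0}^{n} a^m = \frac{1 - a^{n+1}}{1 - a} \quad \text{for any } n > 0 \text{ and } a \neq 1. \tag{128}$$

To prove, just multiply both sides by $1 - a$. Note that, taking the limit as $n \to \infty$,

$$\sum_{m=0}^{\infty} a^m = \frac{1}{1 - a}, \quad \text{for any } |a| < 1. \tag{129}$$

1. Dirac Delta Functions

The Dirac delta function $\delta(t)$ is defined by the following two relations:

$$\delta(t) = 0 \quad \text{for all } t \neq 0 \tag{130}$$

and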

$$\int_{-\infty}^{\infty} \delta(t)\,dt = 1. \tag{131}$$

From this definition, we can get the following identity

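$$|a|\,\delta(at) = \delta(t) \tag{132}$$

valid for any nonzero real $a$. To prove this, just observe that

$$|a|\,\delta(at) = 0 \quad \text{for all } t \neq 0 \tag{133}$$

and

$$\int_{-\infty}^{\infty} |a|\,\delta(at)\,dt = 1. \tag{134}$$

The latter is obtained by a change of variables, where $at$ gets replaced by $\tau$ and hence $a\,dt$ gets replaced by $d\tau$; for negative $a$ the limits of integration swap, which the absolute value accounts for.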
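The scaling property can be illustrated numerically by standing in a narrow unit-area pulse for the delta function. The sketch below is only an approximation under assumed choices: the Gaussian pulse, its width `sigma`, the integration grid, and the values of `a` are all hypothetical. It is written with $|a|$ so that negative $a$ is covered.

```python
import math


def narrow_gaussian(t, sigma=0.05):
    """Unit-area Gaussian pulse; a stand-in for delta(t) as sigma shrinks."""
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(func, lower=-5.0, upper=5.0, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


scale_factors = [0.5, 2.0, -3.0]  # hypothetical values of a, one of them negative

# Area under |a| * pulse(a t); it should be close to 1 for every nonzero a.
areas_with_abs = [integrate(lambda t, a=a: abs(a) * narrow_gaussian(a * t))
                  for a in scale_factors]

# Without the absolute value the area follows the sign of a.
areas_without_abs = [integrate(lambda t, a=a: a * narrow_gaussian(a * t))
                     for a in scale_factors]

print(areas_with_abs)
print(areas_without_abs)
```

The first list should be close to one in every entry, while the second should change sign for the negative scale factor, which is why the absolute value matters when $a < 0$.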


19. Working with Impulse Trains

The following identity relates an infinite sum of complex exponentials with an infinite Dirac delta train:

$$\sum_n e^{-j\omega n} = 2\pi \sum_k \delta(\omega - 2\pi k). \tag{135}$$

It can be verified by observing that the left hand side is the DTFT of the signal $x(n) = 1$, from the definition of the DTFT. So applying the inverse DTFT formula to the right hand side, we should obtain $x(n) = 1$. Doing this,

$$x(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} 2\pi \sum_k \delta(\omega - 2\pi k)\, e^{j\omega n}\, d\omega. \tag{136}$$

Note that the limits of the integral enclose exactly one of the impulses in the train (the one at $\omega = 0$). So applying the sifting rule, we get $x(n) = 1$. Another form of the same identity relates a delta train in the time domain with a sum of exponentials in the frequency domain:

$$\sum_k \delta(t - kT) = \frac{1}{T} \sum_m e^{j\frac{2\pi}{T}mt} \tag{137}$$

We can verify this by changing variables in the previous identity. But for extra reinforcement, let us verify it independently by letting

$$x(t) = \sum_k \delta(t - kT) \tag{138}$$

and viewing the sum of exponentials as a Fourier series expansion for the periodic function $x(t)$, which has period $T$. Find the Fourier series coefficients using the formula

$$X(m) = \frac{1}{T} \int_{-T/2}^{T/2} x(t)\, e^{-j\frac{2\pi}{T}mt}\, dt = \frac{1}{T} \int_{-T/2}^{T/2} \sum_k \delta(t - kT)\, e^{-j\frac{2\pi}{T}mt}\, dt. \tag{139}$$

Again, the integral encloses just one of the delta functions, so we apply the sifting rule. A more formal way to simplify the above expression is to exchange the integral and summation and change variables, letting $\tau = t - kT$, getting

$$X(m) = \frac{1}{T} \sum_k \int_{-kT - T/2}^{-kT + T/2} \delta(\tau)\, e^{-j\frac{2\pi}{T}m(\tau + kT)}\, d\tau. \tag{140}$$

We can recognize this as a sum of integrals with adjoining limits and simplify to

$$X(m) = \frac{1}{T}\, e^{-j\frac{2\pi}{T}mkT} = \frac{1}{T}. \tag{141}$$

Since the Fourier series coefficients are constant, the Fourier series representation of $x(t)$ is an infinite sum of exponentials with equal weight. From this representation, it is easy to see that the Fourier transform of $x(t)$, which can itself be written as a sum of exponentials, can also be expressed as a sum of Dirac delta functions,

$$\mathrm{FT}\!\left[\sum_k \delta(t - kT)\right] = \sum_m e^{-j\omega mT} = \frac{2\pi}{T} \sum_m \delta\!\left(\omega - \frac{2\pi}{T}m\right) \tag{142}$$

The first of these expressions comes from the Fourier integral and the sifting rule of Dirac delta functions. The second follows from the first identity (after some manipulation), or can be derived from knowledge that

$$\mathrm{FT}\!\left[e^{j\frac{2\pi}{T}mt}\right] = 2\pi\,\delta\!\left(\omega - \frac{2\pi}{T}m\right) \tag{143}$$

(easily verified using the inverse Fourier integral) and the fact that $x(t)$ is a linear combination of such terms. Substituting $N$ in place of $T$ we get the related identity

$$\sum_m e^{-j\omega mN} = \frac{2\pi}{N} \sum_m \delta\!\left(\omega - \frac{2\pi m}{N}\right) \tag{144}$$

The first of these two forms,

$$X(e^{j\omega}) = \sum_m e^{-j\omega mN}, \tag{145}$$

is easily recognized as the DTFT of


$$x(n) = \sum_m \delta(n - mN) \tag{146}$$

where $\delta(n)$ is now the Kronecker delta function. (To verify this, just note that the DTFT of each component $\delta(n - mN)$ is $e^{-j\omega mN}$, from the DTFT definition.) If we apply the inverse DTFT formula to the second of the two forms for $X(e^{j\omega})$, after some non-trivial labor, we get the new identity for Kronecker delta trains

$$\sum_m \delta(n - mN) = \frac{1}{N} \sum_{k=0}^{N-1} e^{j\frac{2\pi}{N}kn} \tag{147}$$

Note that unlike the corresponding identity for Dirac delta trains, the sum of exponentials is finite.
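The identity for Kronecker delta trains can be checked directly for a small period. This is a minimal sketch; the period `N = 5`, the range of `n` tested, and the function names are arbitrary choices made for the example.

```python
import cmath
import math


def exponential_sum(n, period):
    """(1/N) * sum over k = 0..N-1 of exp(j 2 pi k n / N)."""
    total = sum(cmath.exp(2j * math.pi * k * n / period) for k in range(period))
    return total / period


def kronecker_train(n, period):
    """Sum over m of delta(n - m N): 1 when n is a multiple of N, else 0."""
    return 1.0 if n % period == 0 else 0.0


N = 5  # hypothetical period
max_error = max(abs(exponential_sum(n, N) - kronecker_train(n, N))
                for n in range(-12, 13))
print(max_error)
```

The printed error should be at rounding level, for negative as well as positive $n$.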

To verify this more easily than applying the inverse DTFT formula, note that the finite sum of exponentials is exactly the inverse DFT of $X(k) = 1$, as is easily seen by comparing the above expression to the inverse DFT formula. Hence, we can just verify that $\mathrm{DFT}[x(n)] = 1$, where we use the DFT of length $N$. (Equivalently, we could view the above sum of exponentials as a discrete Fourier series (DFS) and just verify the DFS coefficients.)

$$X(k) = \mathrm{DFT}[x(n)] = \sum_{n=0}^{N-1} \sum_m \delta(n - mN)\, e^{-j\frac{2\pi}{N}nk} \tag{148}$$

Exchanging summations, and changing variables using $r = n - mN$ we get

$$X(k) = \sum_m \sum_{r=-mN}^{-mN+N-1} \delta(r)\, e^{-j\frac{2\pi}{N}(r + mN)k} \tag{149}$$

The double summation is actually a sum of sums with adjoining limits, so

$$X(k) = \sum_r \delta(r)\, e^{-j\frac{2\pi}{N}(r + mN)k} = e^{-j\frac{2\pi}{N}mNk} = 1. \tag{150}$$

Thus our identity is verified. A side benefit of this approach is that we have verified that


$$\mathrm{DFT}\!\left[\sum_m \delta(n - mN)\right] = 1 \tag{151}$$

for a DFT of order N. We can show similarly that

$$\mathrm{DFT}[1] = N \sum_k \delta(m - kN), \tag{152}$$

again for a DFT of order N.
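Both DFT pairs can be confirmed with a direct evaluation of the DFT sum. The sketch below uses a plain $O(N^2)$ implementation written for this example (the helper `dft` and the length `N = 8` are illustrative choices, not anything prescribed by the notes).

```python
import cmath
import math


def dft(x):
    """Length-N DFT computed directly from its definition."""
    size = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * k / size) for n in range(size))
            for k in range(size)]


N = 8  # hypothetical DFT order

# One period (n = 0..N-1) of the Kronecker delta train with period N.
impulse_period = [1.0] + [0.0] * (N - 1)
# One period of the constant signal 1.
ones = [1.0] * N

dft_of_impulse = dft(impulse_period)  # expected: 1 at every index
dft_of_ones = dft(ones)               # expected: N at index 0, 0 elsewhere

impulse_error = max(abs(value - 1.0) for value in dft_of_impulse)
ones_error = max(abs(value - (N if m == 0 else 0.0))
                 for m, value in enumerate(dft_of_ones))
print(impulse_error, ones_error)
```

Within one period of the DFT index, the delta train on the right of (152) contributes only its term at index zero, which is why the second result is compared with $N$ at index 0 and zero elsewhere.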

We can now use identity (147) to find the DTFT of

$$x(n) = \sum_m \delta(n - mN). \tag{153}$$

Applying the DTFT definition,

$$X(e^{j\omega}) = \sum_n \frac{1}{N} \left( \sum_{k=0}^{N-1} e^{j\frac{2\pi}{N}kn} \right) e^{-j\omega n}. \tag{154}$$

Exchanging the order of summations and combining the exponentials,

$$X(e^{j\omega}) = \frac{1}{N} \sum_{k=0}^{N-1} \sum_n e^{-j\left(\omega - \frac{2\pi}{N}k\right)n}. \tag{155}$$

We can now use identity (135) to represent the inside summation as a delta train,

$$X(e^{j\omega}) = \frac{2\pi}{N} \sum_{k=0}^{N-1} \sum_m \delta\!\left(\omega - 2\pi\left(m + \frac{k}{N}\right)\right). \tag{156}$$

Exchanging summations again, we can recognize this as a sum of sums with adjoining limits, getting the simpler expression

$$\mathrm{DTFT}\!\left[\sum_m \delta(n - mN)\right] = \frac{2\pi}{N} \sum_k \delta\!\left(\omega - 2\pi\frac{k}{N}\right) \tag{157}$$

As a picture:

[Figure (158): an impulse train along the $\omega$ axis, with tick marks at $0$, $2\pi/N$, and $2\pi$.]

The DTFT of a discrete-time impulse train is an impulse train in frequency, just as the FT of a continuous impulse train is an impulse train in frequency.
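Since an infinite impulse train cannot be evaluated numerically, one way to see this result is to truncate the time-domain train and watch what the DTFT sum does. The sketch below is an illustration under assumed values: the spacing `N = 4`, the truncation `K = 20`, and the helper names are hypothetical. It checks that the truncated sum peaks at multiples of $2\pi/N$ and that the area around each peak is $2\pi/N$, the impulse weight in (157).

```python
import cmath
import math


def truncated_dtft(omega, period, terms):
    """Sum of exp(-j*omega*m*period) for m = -terms..terms."""
    return sum(cmath.exp(-1j * omega * m * period)
               for m in range(-terms, terms + 1))


def area_around(center, period, terms, points=1000):
    """Midpoint-rule integral over a window of width 2*pi/period around center."""
    width = 2.0 * math.pi / period
    step = width / points
    start = center - width / 2.0
    total = sum(truncated_dtft(start + (i + 0.5) * step, period, terms)
                for i in range(points))
    return total * step


N = 4   # hypothetical impulse spacing in time
K = 20  # hypothetical truncation: keep impulses m = -K..K

impulse_locations = [2.0 * math.pi * k / N for k in range(N)]

# Height of the truncated sum at each predicted impulse location: 2K + 1.
peak_heights = [abs(truncated_dtft(w, N, K)) for w in impulse_locations]

# Area in a window of width 2*pi/N centered on each location: 2*pi/N.
areas = [area_around(w, N, K).real for w in impulse_locations]

# Halfway between two impulse locations the sum stays small.
halfway_height = abs(truncated_dtft(math.pi / N, N, K))

print(peak_heights, areas, halfway_height)
```

As `K` grows the peaks get taller and narrower while the area under each stays at $2\pi/N$, which is the behavior expected of a sequence of functions approaching the impulse train.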
