Chapter

Multiple Time Series

Introduction

We now consider the situation where we have a number of time series and wish to explore the relations between them. We first look at the relation between cross-correlation and multivariate autoregressive models, and then at the cross-spectral density and coherence.

Cross-correlation

Given two time series, x_t and y_t, we can delay x_t by T samples and then calculate the cross-covariance between the pair of signals. That is

\sigma_{xy}(T) = \frac{1}{N-1} \sum_{t=1}^{N} (x_{t-T} - \mu_x)(y_t - \mu_y)

where \mu_x and \mu_y are the means of each time series and there are N samples in each. The function \sigma_{xy}(T) is the cross-covariance function. The cross-correlation is a normalised version

r_{xy}(T) = \frac{\sigma_{xy}(T)}{\sqrt{\sigma_x^2 \sigma_y^2}}

where we note that \sigma_x^2 and \sigma_y^2 are the variances of each signal.

Note that

r_{xy}(0) = \frac{\sigma_{xy}}{\sigma_x \sigma_y}

which is the correlation between the two variables. Therefore, unlike the autocorrelation, r_{xy}(0) is not generally equal to 1. The figure below shows two time series and their cross-correlation.
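As a concrete illustration, the estimator above can be sketched in a few lines of Python. This is a minimal sketch: the function name and the toy signals are illustrative, not from the course, and only samples where the delayed and undelayed series overlap are summed.

```python
import math

def cross_correlation(x, y, lag):
    """Estimate r_xy(lag): delay x by `lag` samples, then correlate with y."""
    N = len(x)
    mx = sum(x) / N
    my = sum(y) / N
    # cross-covariance: sum only over t where both x[t-lag] and y[t] exist
    s = sum((x[t - lag] - mx) * (y[t] - my)
            for t in range(N) if 0 <= t - lag < N)
    cov = s / (N - 1)
    vx = sum((v - mx) ** 2 for v in x) / (N - 1)
    vy = sum((v - my) ** 2 for v in y) / (N - 1)
    return cov / math.sqrt(vx * vy)

# y is a noiseless copy of x shifted right by 3 samples, so the
# cross-correlation peaks at lag 3 and differs from its value at lag -3
x = [0.0, 1.0, 0.0, -1.0, 0.5, 2.0, -0.5, 0.0, 1.5, -1.0]
y = [0.0] * 3 + x[:-3]
r_pos = cross_correlation(x, y, 3)
r_neg = cross_correlation(x, y, -3)
```

The gap between `r_pos` and `r_neg` is a numerical preview of the asymmetry result derived in the next subsection.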

Signal Processing Course, W.D. Penny, April

Cross-correlation is asymmetric

First we recap as to why the autocorrelation is a symmetric function. The autocovariance, for a zero-mean signal, is given by

\sigma_{xx}(T) = \frac{1}{N-1} \sum_{t=1}^{N} x_{t-T} x_t

This can be written in the shorthand notation

\sigma_{xx}(T) = \langle x_{t-T} x_t \rangle

where the angled brackets denote the average value, or expectation. Now, for negative lags,

\sigma_{xx}(-T) = \langle x_{t+T} x_t \rangle

Subtracting T from the time index (this will make no difference to the expectation) gives

\sigma_{xx}(-T) = \langle x_t x_{t-T} \rangle

which is identical to \sigma_{xx}(T), as the ordering of variables makes no difference to the expected value. Hence the autocorrelation is a symmetric function.

The cross-correlation is a normalised cross-covariance which, assuming zero-mean signals, is given by

\sigma_{xy}(T) = \langle x_{t-T} y_t \rangle

and for negative lags

\sigma_{xy}(-T) = \langle x_{t+T} y_t \rangle

Subtracting T from the time index now gives

\sigma_{xy}(-T) = \langle x_t y_{t-T} \rangle

which is different to \sigma_{xy}(T). To see this more clearly we can subtract T once more from the time index, to give

\sigma_{xy}(-T) = \langle x_{t-T} y_{t-2T} \rangle

Hence the cross-covariance, and therefore the cross-correlation, is an asymmetric function. To summarise: moving signal A right (forward in time) and multiplying with signal B is not the same as moving signal A left and multiplying with signal B, unless signal A equals signal B.

Windowing

When calculating cross-correlations there are fewer data points at larger lags than at shorter lags, so the resulting estimates are commensurately less accurate. To take account of this, the estimates at long lags can be smoothed using various window operators (see the earlier lecture).

Figure: Signals x_t (top) and y_t (bottom).

Figure: Cross-correlation function r_{xy}(T) for the data in the previous figure. A lag of T denotes the top series, x_t, lagging the bottom series, y_t. Notice the big positive correlation; can you see from the signals why this should occur?


Time-Delay Estimation

If we suspect that one signal is a (possibly noisy) time-delayed version of another signal, then the peak in the cross-correlation will identify the delay. For example, the figure above suggests that the top signal lags the bottom one by a fixed number of samples; dividing that lag by the sample rate (in Hz) gives the delay in seconds.
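The peak-picking procedure can be sketched as follows. This is a hedged sketch: `delay_estimate`, the toy signals, the seed, and the 100 Hz sample rate mentioned in the comment are all illustrative choices, not values from the course.

```python
import random

def delay_estimate(x, y, max_lag):
    """Return the lag T maximising the raw cross-covariance <x_{t-T} y_t>."""
    N = len(x)
    def cov(T):
        pairs = [(x[t - T], y[t]) for t in range(N) if 0 <= t - T < N]
        return sum(a * b for a, b in pairs) / len(pairs)
    return max(range(-max_lag, max_lag + 1), key=cov)

# y is a copy of x delayed by 5 samples, so the peak should sit at lag 5;
# at a hypothetical sample rate of 100 Hz this would be a 0.05 s delay
random.seed(1)
x = [random.uniform(-1, 1) for _ in range(200)]
y = [0.0] * 5 + x[:-5]
T = delay_estimate(x, y, 20)
```

A white-noise reference signal is used here deliberately: for periodic signals the cross-correlation has a peak every period, and the delay is only identifiable up to that ambiguity.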

Multivariate Autoregressive models

A multivariate autoregressive (MAR) model is a linear predictor used for modelling multiple time series. An MAR(p) model predicts the next vector value in a d-dimensional time series, x(t) (a row vector), as a linear combination of the p previous vector values of the time series

x(t) = \sum_{k=1}^{p} x(t-k) a(k) + e(t)

where each a(k) is a d-by-d matrix of AR coefficients and e(t) is an IID Gaussian noise vector with zero mean and covariance C. There are a total of n_p = p \times d \times d AR coefficients, and the noise covariance matrix has d \times d elements. If we write the lagged vectors as a single augmented row vector

\tilde{x}(t) = [x(t-1), x(t-2), \dots, x(t-p)]

and the AR coefficients as a single matrix

A = [a(1), a(2), \dots, a(p)]^T

then we can write the MAR model as

x(t) = \tilde{x}(t) A + e(t)

The above equation shows the model at a single time point t. The equation for the model over all time steps can be written in terms of the embedding matrix, M, whose t-th row is \tilde{x}(t+p), the error matrix, E, having rows e(t+p), and the target matrix, X, having rows x(t+p). This gives

X = M A + E

which is now in the standard form of a multivariate linear regression problem. The AR coefficients can therefore be calculated from

\hat{A} = (M^T M)^{-1} M^T X

and the AR predictions are then given by

\hat{x}(t) = \tilde{x}(t) \hat{A}


The prediction errors are

e(t) = x(t) - \hat{x}(t)

and the noise covariance matrix is estimated as

C = \frac{1}{N - n_p} \sum_t e(t)^T e(t)

The denominator, N - n_p, arises because n_p degrees of freedom have been used up to calculate the AR coefficients (and we want the estimates of covariance to be unbiased).
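The least-squares estimator above can be sketched in pure Python. This is a minimal sketch with hypothetical names and a synthetic test process: it solves the normal equations (M^T M) A = M^T X by Gaussian elimination, whereas a production implementation would use QR or SVD for numerical stability.

```python
import random

def mar_fit(X, p):
    """Least-squares fit of an MAR(p) model x(t) = sum_k x(t-k) a(k) + e(t).

    X is a list of d-dimensional rows.  Returns the stacked (p*d x d)
    coefficient matrix A, so predictions are x_hat(t) = x_tilde(t) A.
    """
    d, N, q = len(X[0]), len(X), len(X[0]) * p
    # embedding matrix: row for time t is [x(t-1), ..., x(t-p)] concatenated
    M = [sum((X[t - k] for k in range(1, p + 1)), []) for t in range(p, N)]
    Y = X[p:]
    G = [[sum(r[i] * r[j] for r in M) for j in range(q)] for i in range(q)]   # M^T M
    H = [[sum(M[t][i] * Y[t][j] for t in range(len(M))) for j in range(d)]
         for i in range(q)]                                                   # M^T X
    for c in range(q):                       # forward elimination with pivoting
        piv = max(range(c, q), key=lambda r: abs(G[r][c]))
        G[c], G[piv], H[c], H[piv] = G[piv], G[c], H[piv], H[c]
        for r in range(c + 1, q):
            f = G[r][c] / G[c][c]
            G[r] = [g - f * gc for g, gc in zip(G[r], G[c])]
            H[r] = [h - f * hc for h, hc in zip(H[r], H[c])]
    A = [[0.0] * d for _ in range(q)]
    for r in reversed(range(q)):             # back substitution
        for c in range(d):
            s = H[r][c] - sum(G[r][k] * A[k][c] for k in range(r + 1, q))
            A[r][c] = s / G[r][r]
    return A

# recover the coefficients of a known, stable MAR(1) process (d = 2)
random.seed(0)
a = [[0.9, 0.0], [0.2, 0.5]]
X = [[0.0, 0.0]]
for _ in range(1999):
    prev = X[-1]
    X.append([sum(prev[i] * a[i][j] for i in range(2)) + random.gauss(0, 0.1)
              for j in range(2)])
A = mar_fit(X, 1)    # A should be close to a
```

With 2000 samples the sampling error on each coefficient is around 0.01, so the recovered matrix matches the generating one closely.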

Model order selection

Given that an MAR model can be expressed as a multivariate linear regression problem, all the usual model order selection criteria can be employed, such as stepwise forwards and backwards selection. Other criteria also exist. Neumaier and Schneider, and Lutkepohl, investigate a number of methods including the Final Prediction Error

FPE(p) = \log \hat{\sigma}^2 + \log \frac{N + n_p}{N - n_p}

where

\hat{\sigma}^2 = \left[ \det\left( \frac{(N - n_p)\, C}{N} \right) \right]^{1/d}

but they prefer the Minimum Description Length (MDL) criterion^1

MDL(p) = \log \hat{\sigma}^2 + \frac{n_p}{N} \log N

Example

Given two time series and an MAR(3) model, for example, the MAR predictions are

\hat{x}(t) = \tilde{x}(t) \hat{A}

[\hat{x}_1(t), \hat{x}_2(t)] = [x_1(t-1), x_2(t-1), x_1(t-2), x_2(t-2), x_1(t-3), x_2(t-3)]
\begin{bmatrix}
a_{11}(1) & a_{12}(1) \\
a_{21}(1) & a_{22}(1) \\
a_{11}(2) & a_{12}(2) \\
a_{21}(2) & a_{22}(2) \\
a_{11}(3) & a_{12}(3) \\
a_{21}(3) & a_{22}(3)
\end{bmatrix}

^1 The MDL criterion is identical to the negative value of the Bayesian Information Criterion (BIC), ie. MDL(p) = -BIC(p), and Neumaier and Schneider refer to this measure as BIC.

Figure: Signals x_1(t) (top) and x_2(t) (bottom) and predictions from an MAR model.

Applying an MAR(3) model to our data set gave estimates for the AR coefficients, a(1), a(2), a(3), and the noise covariance, C, computed from the estimator equations above.

Cross Spectral Density

Just as the Power Spectral Density (PSD) is the Fourier transform of the autocovariance function, we may define the Cross Spectral Density (CSD) as the Fourier transform of the cross-covariance function

P_{12}(w) = \sum_{n=-\infty}^{\infty} \sigma_{x_1 x_2}(n) \exp(-iwn)


Note that if x_1 = x_2, the CSD reduces to the PSD. Now, the cross-covariance of a pair of signals is given by

\sigma_{x_1 x_2}(n) = \sum_{l=-\infty}^{\infty} x_1(l) x_2(l-n)

Substituting this into the earlier expression gives

P_{12}(w) = \sum_{n=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} x_1(l) x_2(l-n) \exp(-iwn)

By noting that

\exp(-iwn) = \exp(-iwl)\exp(iwk)

where k = l - n, we can see that the CSD splits into the product of two sums

P_{12}(w) = X_1(w) X_2(-w)

where

X_1(w) = \sum_{l=-\infty}^{\infty} x_1(l) \exp(-iwl)

X_2(-w) = \sum_{k=-\infty}^{\infty} x_2(k) \exp(iwk)
For real signals X_2(-w) = X_2^*(w), where * denotes the complex conjugate. Hence, the cross spectral density is given by

P_{12}(w) = X_1(w) X_2^*(w)

This means that the CSD can be evaluated in one of two ways: (i) by first estimating the cross-covariance and then Fourier transforming, or (ii) by taking the Fourier transforms of each signal and multiplying them (after taking the conjugate of one of them). A number of algorithms exist which enhance the spectral estimation ability of each method. These algorithms are basically extensions of the algorithms for PSD estimation; for example, for type (i) methods we can perform Blackman-Tukey windowing of the cross-covariance function, and for type (ii) methods we can employ Welch's algorithm for averaging modified periodograms before multiplying the transforms. See Carter for more details.
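The equivalence of the two routes can be checked numerically. The sketch below uses the plain DFT, so the cross-covariance is the circular one; the function names and toy data are illustrative.

```python
import cmath

def dft(x):
    """Discrete Fourier transform of a real sequence (naive O(N^2) form)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def circular_cross_cov(x1, x2):
    """sigma(n) = sum_l x1(l) x2((l - n) mod N), the circular cross-covariance."""
    N = len(x1)
    return [sum(x1[l] * x2[(l - n) % N] for l in range(N)) for n in range(N)]

x1 = [0.5, -1.2, 0.3, 2.0, -0.7, 1.1, 0.0, -0.4]
x2 = [1.0, 0.2, -0.9, 0.4, 1.5, -1.1, 0.6, 0.3]

# route (i): Fourier transform of the cross-covariance
csd_i = dft(circular_cross_cov(x1, x2))
# route (ii): product of transforms, one of them conjugated
X1, X2 = dft(x1), dft(x2)
csd_ii = [a * b.conjugate() for a, b in zip(X1, X2)]
```

The two lists agree to machine precision, which is exactly the identity P_{12}(w) = X_1(w) X_2^*(w) in its finite, circular form.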

The CSD is complex

The CSD is complex because the cross-covariance is asymmetric (the PSD is real because the autocovariance is symmetric; in this special case the Fourier transform reduces to a cosine transform).


More than two time series

The characteristics of a multivariate time series may be summarised by the power spectral density matrix (Marple). For d time series

P(f) = \begin{bmatrix}
P_{11}(f) & P_{12}(f) & \cdots & P_{1d}(f) \\
P_{12}(f) & P_{22}(f) & \cdots & P_{2d}(f) \\
\vdots & \vdots & \ddots & \vdots \\
P_{1d}(f) & P_{2d}(f) & \cdots & P_{dd}(f)
\end{bmatrix}

where the diagonal elements contain the spectra of individual channels and the off-diagonal elements contain the cross-spectra. The matrix is complex and Hermitian, since the off-diagonal elements are complex numbers with P_{ji}(f) = P_{ij}^*(f).

Coherence and Phase

The complex coherence function is given by (Marple)

r_{ij}(f) = \frac{P_{ij}(f)}{\sqrt{P_{ii}(f)}\sqrt{P_{jj}(f)}}

The coherence, or mean squared coherence (MSC), between two channels is given by

r_{ij}^2(f) = |r_{ij}(f)|^2

The phase spectrum between two channels is given by

\theta_{ij}(f) = \tan^{-1}\left[\frac{\mathrm{Im}\, r_{ij}(f)}{\mathrm{Re}\, r_{ij}(f)}\right]

The MSC measures the linear correlation between two time series at each frequency and is directly analogous to the squared correlation coefficient in linear regression. As such, the MSC is intimately related to linear filtering, where one signal is viewed as a filtered version of the other. This can be interpreted as a linear regression at each frequency; the optimal regression coefficient, or linear filter, is given by

H(f) = \frac{P_{xy}(f)}{P_{xx}(f)}

This is analogous to the expression for the regression coefficient a = \sigma_{xy} / \sigma_{xx} (see the first lecture). The MSC is related to the optimal filter as follows

r_{xy}^2(f) = |H(f)|^2 \frac{P_{xx}(f)}{P_{yy}(f)}

which is analogous to the equivalent expression in linear regression, r^2 = a^2 \sigma_{xx} / \sigma_{yy}.


At a given frequency, if the phase of one signal is fixed relative to the other, then the signals can have a high coherence at that frequency. This holds even if one signal is entirely out of phase with the other (note that this is different from adding up signals which are out of phase, where the signals cancel out; here we are talking about the coherence between the signals).

At a given frequency, if the phase of one signal changes relative to the other, then the signals will not be coherent at that frequency. The time over which the phase relationship is constant is known as the coherence time.

Welch's method for estimating coherence

Algorithms based on Welch's method, such as the cohere function in the matlab system identification toolbox, are widely used. The signal is split up into a number of segments, N, each of length T, and the segments may be overlapping. The complex coherence estimate is then given as

r_{ij}(f) = \frac{\sum_{n=1}^{N} X_i^n(f) \left[X_j^n(f)\right]^*}{\sqrt{\sum_{n=1}^{N} |X_i^n(f)|^2} \sqrt{\sum_{n=1}^{N} |X_j^n(f)|^2}}

where n sums over the data segments. This equation is of exactly the same form as that for estimating correlation coefficients (see the earlier chapter). Note that if we have only N = 1 data segment then the estimate of coherence will be 1 regardless of what the true value is; this would be like performing regression with a single data point. We therefore need a number of segments.

Note that this only applies to Welch-type algorithms which compute the CSD from a product of Fourier transforms. We can trade off good spectral resolution (requiring large T) against low-variance estimates of coherence (requiring large N and therefore small T). To an extent, by increasing the overlap between segments (and therefore the amount of computation, ie. the number of FFTs computed) we can have the best of both worlds.
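A bare-bones version of this estimator can be sketched as follows. This sketch uses non-overlapping, unwindowed segments and a naive DFT; the full method would add a window and overlap, and every name, signal and parameter here is illustrative.

```python
import cmath
import math
import random

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def coherence(x, y, seg_len):
    """MSC estimate: average cross- and auto-spectra over segments, then ratio."""
    nseg = len(x) // seg_len
    Pxy = [0j] * seg_len
    Pxx = [0.0] * seg_len
    Pyy = [0.0] * seg_len
    for n in range(nseg):
        Xs = dft(x[n * seg_len:(n + 1) * seg_len])
        Ys = dft(y[n * seg_len:(n + 1) * seg_len])
        for k in range(seg_len):
            Pxy[k] += Xs[k] * Ys[k].conjugate()
            Pxx[k] += abs(Xs[k]) ** 2
            Pyy[k] += abs(Ys[k]) ** 2
    return [abs(Pxy[k]) ** 2 / (Pxx[k] * Pyy[k]) for k in range(seg_len)]

# a common sine (period 8 samples) buried in independent noise of two levels
random.seed(0)
s = [math.sin(2 * math.pi * t / 8) for t in range(512)]
x = [v + random.gauss(0, 0.2) for v in s]
y = [v + random.gauss(0, 1.0) for v in s]

c8 = coherence(x, y, 64)    # 8 segments: a peak near bin 8, low elsewhere
c1 = coherence(x, y, 512)   # a single segment: MSC is identically 1
```

The `c1` result demonstrates the N = 1 degeneracy discussed above: with one segment the numerator and denominator are the same quantity, so the estimate is 1 at every frequency whatever the true coherence.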

MAR models

Just as the PSD can be calculated from AR coefficients, so the PSDs and CSDs can be calculated from MAR coefficients. First we compute

A(f) = I - \sum_{k=1}^{p} a_k \exp(-i k 2\pi f T)

where I is the identity matrix, f is the frequency of interest and T is the sampling period. A(f) will be complex. This is analogous to the denominator in the equivalent AR expression, 1 - \sum_{k=1}^{p} a_k \exp(-i k 2\pi f t). Then we calculate the PSD matrix as follows (Marple)

P_{MAR}(f) = T\, A(f)^{-1} C \left[A(f)^{-1}\right]^H

Figure: Coherence estimates from (a) Welch's periodogram method and (b) a Multivariate Autoregressive model.

where C is the residual covariance matrix and H denotes the Hermitian transpose. This is formed by taking the complex conjugate of each matrix element and then applying the usual transpose operator. Just as A^{-T} denotes the transpose of the inverse, so A^{-H} denotes the Hermitian transpose of the inverse. Once the PSD matrix has been calculated, we can calculate the coherences of interest using the coherence equation above.

Example

To illustrate the estimation of coherence we generated two signals: the first, x, being a sine wave with additive Gaussian noise, and the second, y, being equal to the first but with more additive noise of the same standard deviation. Five seconds of data were generated. We then calculated the coherence using (a) Welch's modified periodogram method, with overlapping segments and smoothing via a Hanning window, and (b) an MAR model. Ideally we should see a coherence near to 1 at the frequency of the sine wave and zero elsewhere. However, the coherence is highly non-zero at other frequencies. This is because, due to the noise component of the signal, there is power (and some cross-power) at all frequencies. As coherence is a ratio of cross-power to power, it will have a high variance unless the number of data samples is large.

You should therefore be careful when interpreting coherence values. Preferably, you should perform a significance test, either based on an assumption of Gaussian signals or using a Monte-Carlo method. See also the text by Bloomfield.


Partial Coherence

There is a direct analogy to partial correlation. Given a target signal y and other signals x_1, x_2, \dots, x_m, we can calculate the error at a given frequency after including k = 1..m variables, E_m(f). The partial coherence is

k_m(f) = \frac{E_{m-1}(f) - E_m(f)}{E_{m-1}(f)}

See Carter for more details.