Detection of Signals by Information Theoretic Criteria

IEEETRANSACTIONS ONACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-33,NO. 2, APRIL 1985 387 Detection of Signals by Information Theoretic Criteria MAT1 WAX AND THOMAS KAILATH, FELLOW, IEEE Abstract-A new approach is presented to the problem of detecting are discussed in Sections IV and V, respectively. Simulation the number of signals in a multichannel time-series, basedon the appli- results that illustrate the performance of the new method for cation of the information theoretic criteria for model selection introduced by Akaike (AIC) and by Schwartz and Rissanen (MDL). Unlike sensor array processing aredescribed in Section VI. Fre- the conventional hypothesis testing based approach, the new approach quency domain extensions and some concluding remarks are does not require any subjective threshold settings; thenumber of signals presented in SectionsVI1 and VIII, respectively. is obtained merely by minimizing the AIC or the MDL criteria. Siula- tion results that illustrate the performance of the new method for the detection of the number of signalsreceived by a sensor array are 11. FORMULATIONOF THE PROBLEM presented. The observation vector in certain important problems in signal processing such as sensor array processing [15] , [20] , [5] , I. INTRODUCTION [6] , [lo] , [ 121 , [22] , [25] , harmonic retrieval [15] , [12] , pole retrieval from the natural response [ll], [24], and re- N many problems in signal processing, the vector of observa- trieval of overlapping echos from radar backscatter [7] , de- tions can be modeled as a superposition of a finite number I noted by the p X l vector x(t), is successfully described by the of signals embedded in an additive noise. This is the case, for following model example, in sensor array processing, in harmonic retrieval, in retrieving the poles of a system from the natural response, and 4 in retrieving overlapping echoesfrom radar backscatter. A key x(t) = A(q) Sj(t) i-n(t) issue in theseproblems is thedetection of thenumber of i= 1 signals. where One approach to this problem is based on the observation si( e) = scalar complex waveform referredto as the ithsignal that the number of signals can be determined from the eigen- A(@i)=a p X 1 complex vector, parameterized by an un- values of the covariancematrix of the observationvector. known parameter vector aiassociated with the ithsignal Bartlett [4] and Lawley [14]developed a procedure, based n( -)=a p X 1 complex vector referred to as the additive on a nFsted sequence of hypothesis tests, to implement this ap- noise. proach. For each hypothesis, the likelihood ratio statistic is We assume that the q (q <p) signals sl ( e), * ,sq( .) are computed andcompared to athreshold; the hypothesis ac- complex (analytic), stationary, and ergodic Gaussian random cepted is the first one for which the threshold is crossed. The processes, with zero mean and positive definite covariancema- problem with this method is the subjective judgment required trix. The noise vector n( -)is assumed to be complex, station- for deciding on thethreshold levels. ary, and ergodic Gaussian vector process, independent of the In this paper we present a new approach to the problem that signals, with zero mean and covariance matrix given by a21, is based on the application of the information theoretic cri- where a2 is an unknown scalar constant and Z is the identity teria for model selection introduced by Akaike (AIC) and by matrix. Schwartzand Rissanen (MDL). Theadvantage of thisap- A crucial problem associated with the, model described in proach is that no subjective judgment is required in the deci- (1) is that of determining the number of signals 4 from a finite sion process; the number of signals is determined as the value set ofobservations x(tl), ,x(tN). for which the AIC or the MDL criteria is minimized. A promising approach to this problem is based on the struc- The paper is organized as follows. After the statement and ture of the covariance matrix of the observation vector x( .). formulation of the problemin Section 11, theinformation To introduce this approach,we first rewrite (1) as theoretic criteria for model selection are introduced in Section 111. The application of these criteria to the problem of detect- x(t) = As(t) + n(t) (2a) ing the number of signals and the consistency of these criteria where A is the p X 4 matrix Manuscript received May 19, 1983; revised June 1, 1984. This work A = [A(%) * * ‘N@q>I (2b) was supported in part by the Air Force Office of Scientific Research, Air Force Systems Command, under Contract AF 49-620-79-C-0058, and s(t) is the 4 X 1 vector the U.S. ArmyResearch Office, underContract DAAG29-79-C-0215, and by the Joint Services Program at Stanford University under Con- ST@) = [s, (t) * ’ * sq(t)] . tract DAAG29-81-K-0057. The authors are with the Information Systems Laboratory, Stanford Because the noise is zero mean and independent of the sig- University, Stanford, CA 94305. nals, it followsthat the covariance matrix of x( *) is given by 0096-3518/85/0400-0387$01.OO 0 1985 IEEE 388 IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-33, NO. 2, APRIL 1985 R = \k t aZI (34 used to encode the observed data, Rissanen proposed to select the model that yields the minimum code length. It turns out where that in the large-sample limit, both Schwartz’s and Rissanen’s \k = ASAi (3b) approaches yield the same criterion,given by with t denoting the conjugate transpose, and S denoting the MDL= -lOgf(XIS) t iklog N. (7) covariance matrix ofthe signals, i.e., S = E[s(.) s)~] s( . Note that apart froma factor of 2, thefirst term is identical to Assuming that the matrix A is of full column rank, i.e., the the corresponding one in the AIC, while the second term has vectors A(Qi) (i = 1, * , q) are linearly independent, and that an extra factor of3 log N. the covariance matrix of the signals S is nonsingular, it follows that the rank of Ik is q, or equivalently, the p - 4 smallest IV. ESTIMATINGTHE NUMBEROF SIGNALS eigenvalues of \E are equal to zero. Denoting the eigenvalues To applythe information theoretic criteria to detect the of R by AI ha . A, it follows, therefore, that the small- > .> number of signals, or equivalently, to determine the rank of est p - q eigenvalues of R are all equal to , i.e., d the matrix 9,we must first describe the family of competing - = Xq+1 =Aq+2 -... h, = 02. (4) models, or density functions that we are considering. Regard- ing the observations x(tl), * * * ,x(tN) as identical and statisti- The number of signals q can hence be determined from the caily independent complex Gaussian random vectors of zero muZtiplicity of the smallest eigenvalue of R. The problem is mean, the family of models is necessarily describedby the co- that the covariance matrix R is unknown in practice. When variance matrix of x( e). Since our model’s covariance matrix estimated from a finite sample size, the resulting eigenvalues is given by (3), it seems natural to consider the following fam- are all different with probability one, thus making it difficult ily of covariance matrices to determine the number of signals merely by “observing” the eigenvalues. A more sophisticated approach to the problem, R(k) = \kc4 + 021 (8) developed by Bartlett [4] and Lawley [14] , is based on a se- wheredenotes a semipositive matrix of rank k, and u de- quence of hypothesis tests. The problems associated with this notes an unknown scalar. Note that k E (0, l, - * . , p - l} approach is the subjective judgment needed in the selection of ranges over the set of all possiblenumber of signals. the threshold levels for the different tests. Using the well-known spectral representation theorem from In this paper we take a different approach. We pose the de- linear algebra, we can express R(k)as tection problem as a model selection problem and then apply the information theoretic criteria for model selection introduced by Akaike (AIC) and by Schwartz and Rissanen(MDL). 111. INFORMATION THEORETICCRITERIA where XI,. * , Xk and v,,* , Vk are the eigenvalues and The information theoretic criteria for model selection, intro- eigenvectors, respectively, of R@). Denoting by dk) the duced by Akaike [l] , [2], Schwartz [21], and Rissanen [17] parameter vector ofthe model,it follows that address the following general problem. Given a set of N observations X = {x(l), * * ,x(N)} and a family of models, that is, a parameterized family of probability densities f(XlO),select With this parameterizationwe now proceed to the derivation the model thatbest fits the data. of the information theoretic criteria for the detection prob- Akaike’s proposal was to select the model which gives the lem. Since the observations are regarded as statistically inde- minimum AIC, defined by pendent complex Gaussian random vectors with zero mean, AIC = -2 log f(Xl6) t 2k (6) their joint probability densityis given by where 6 is the maximum likelihood estimate of the parameter vector 0, and k. is tlie number of free adjusted parameters in 0. The first term is the well-known log-likelihood oftlie maximum likelihood estimator of the parameters of the model. The second term is a bias correction term, inserted so as to make the AIC an unbiasedestimate of the meanKulback- Taking the logarithm and omitting terms that do not depend Liebler distance betwey the modeled density and the f(Xl0) on the parameter vector dk),we find that the log-likelihood estimated densityf(Xl0). function L(@) is given by Inspired by Akaike’s pioneering work, Schwartz and Rissanen approached the.

Detection of Signals by Information Theoretic Criteria

Moving Average Filters

Random Signals

STATISTICAL FOURIER ANALYSIS: CLARIFICATIONS and INTERPRETATIONS by DSG Pollock

Use of the Kurtosis Statistic in the Frequency Domain As an Aid In

2D Fourier, Scale, and Cross-Correlation

20. the Fourier Transform in Optics, II Parseval’S Theorem

Introduction to Frequency Domain Processing

Time Domain and Frequency Domain Signal Representation

The Wavelet Tutorial Second Edition Part I by Robi Polikar

Image Analysis

Signals and Sampling

Cross-Correlation