
Exact Distribution and High-dimensional Asymptotics for Improperness Test of Complex Signals

Florent Chatelain, Nicolas Le Bihan

To cite this version:

Florent Chatelain, Nicolas Le Bihan. Exact Distribution and High-dimensional Asymptotics for Improperness Test of Complex Signals. ICASSP 2019 - IEEE International Conference on Acoustics, Speech and Signal Processing, May 2019, Brighton, United Kingdom. pp. 8509-8513, 10.1109/ICASSP.2019.8683518. hal-02148428

HAL Id: hal-02148428 https://hal.archives-ouvertes.fr/hal-02148428 Submitted on 5 Jun 2019

EXACT DISTRIBUTION AND HIGH-DIMENSIONAL ASYMPTOTICS FOR IMPROPERNESS TEST OF COMPLEX SIGNALS

Florent Chatelain and Nicolas Le Bihan

Univ. Grenoble Alpes, CNRS, Grenoble INP†, GIPSA-Lab, 38000 Grenoble, France. †: Institute of Engineering Univ. Grenoble Alpes

ABSTRACT

Improperness testing for complex-valued vectors and processes has been of interest lately due to the potential applications of complex-valued time series analysis in several research areas. This paper provides an exact characterization of the distribution of the GLRT (Generalized Likelihood Ratio Test) statistics for Gaussian complex-valued signals under the null hypothesis of properness. This distribution is a special case of the Wilks's lambda distribution, as are the distributions of the GLRT statistics in multivariate analysis of variance (MANOVA) procedures. In the high dimensional setting, i.e. when the size of the vectors grows at the same rate as the number of samples, a closed form expression is obtained for the asymptotic distribution of the GLRT statistics. This is, to our knowledge, the first exact characterization for the GLRT-based improperness testing.

Index Terms: Complex signals, improperness, GLRT, high-dimensional statistics, Wilks's Lambda distribution

1. INTRODUCTION

Complex-valued time series and vectors have attracted attention lately for their ability to model signals from a broad range of applications including digital communications [1], seismology [2], oceanography [3] or gravitational waves [4], to name just a few. Among the specific features of complex datasets, the notion of properness (or second order circularity [5]), in the Gaussian case, is of major importance as it relates invariance of the probability density to the correlation coefficients of complex vectors. The consequences of properness/circularity on the statistics of complex random vectors and signals were studied at length in [6, 7]. Testing for improperness of complex vectors/signals was investigated by several authors [8, 9] in the signal processing community. However, it had been considered a long time before by statisticians [10]. In the signal processing literature, authors have mainly used the augmented complex representation (a vector of twice the size, made of the concatenation of the complex vector and its conjugate), while an equivalent real-valued representation using real and imaginary parts was preferably used by statisticians. In the sequel, we will make use of this latter representation to study the Generalized Likelihood Ratio Test (GLRT).

In the scalar case, the notion of proper/improper random variable can be phrased as follows: a complex random variable is called proper if it is uncorrelated with its complex conjugate. Note that properness is a less general notion than circularity, which qualifies a complex random variable whose pdf is invariant by rotation in the complex plane. In the Gaussian setting, where the distribution of the variable depends only on second-order statistics, these two properties become equivalent. In terms of real and imaginary parts of the complex variable, properness means that both real and imaginary parts have the same variance and that their cross-covariance vanishes. The concept of properness extends to complex-valued vectors in a straightforward manner (see [6, 7]).

We consider $N$-dimensional complex-valued centered random vectors $\mathbf{z} = \mathbf{u} + i\mathbf{v}$, i.e. $\mathbf{u}$ and $\mathbf{v}$ are $N$-dimensional real vectors with zero mean. In short, we have $\mathbf{z} \in \mathbb{C}^N$, $\mathbf{u} \in \mathbb{R}^N$ and $\mathbf{v} \in \mathbb{R}^N$. The real vector representation of $\mathbf{z} \in \mathbb{C}^N$ consists in using $\mathbf{x} = [\mathbf{u}^T, \mathbf{v}^T]^T \in \mathbb{R}^{2N}$. The second order statistics of $\mathbf{z} \in \mathbb{C}^N$ are thus contained in the real-valued covariance matrix $\mathbf{C} \in \mathbb{R}^{2N \times 2N}$ of $\mathbf{x}$ given by:

$$\mathbf{C} = \begin{pmatrix} \mathbf{C}_{uu} & \mathbf{C}_{uv} \\ \mathbf{C}_{vu} & \mathbf{C}_{vv} \end{pmatrix} \tag{1}$$

where $\mathbf{C}_{ab} \in \mathbb{R}^{N \times N}$ denotes the real-valued (cross)covariance matrix between real vectors $\mathbf{a}$ and $\mathbf{b}$, with $\mathbf{C}_{ba} = \mathbf{C}_{ab}^T$.

A complex-valued Gaussian vector is called proper iff the following two conditions hold:

$$\mathbf{C}_{uu} = \mathbf{C}_{vv} \quad \text{and} \quad \mathbf{C}_{uv} = -\mathbf{C}_{uv}^T \tag{2}$$

If these conditions are not fulfilled, then $\mathbf{z}$ is called improper.

Testing for improperness of a complex Gaussian vector was studied in the seminal work of Andersson and Perlman [10], where the authors used the $2N$-dimensional real-valued representation of complex $N$-dimensional vectors. They derived the maximal invariant statistics for complex vectors, characterized the joint distribution of those statistics, and used them for improperness testing. In [8, 1], the authors proposed a GLRT based on the augmented complex representation, and highlighted its connection with canonical correlation coefficients. Results showing the equivalence of the maximal invariant statistics approach (derived from the $2N$-dimensional real-valued representation) with the canonical correlation coefficients derived from the complex structure were obtained in [9], together with a numerical study of the Bartlett asymptotic distribution of the GLRT (in the large sample size case with small or fixed dimension of the complex vector $\mathbf{z}$).

The original contribution of the presented work is twofold. It consists 1) in the identification of the exact distribution of the GLRT statistics under the null hypothesis of properness, which reduces to a special case of Wilks's lambda distribution, and 2) in the derivation of the asymptotic distribution for the GLRT statistics in the high dimensional case (vector and sample sizes growing at the same rate).

2. TESTING FOR IMPROPERNESS

2.1. Testing problem

In several applications, it is common to model the signal/vector of interest, denoted $\mathbf{z}$, as being improper and corrupted by proper noise. Consequently, statistical tests have been proposed to investigate the properness/improperness of a signal given a sample, from which one will decide:

$$\begin{cases} H_0 : \mathbf{z} \text{ is proper} & \text{if condition (2) holds} \\ H_1 : \mathbf{z} \text{ is improper} & \text{otherwise} \end{cases} \tag{3}$$

2.2. Invariant parameters

Let $\mathcal{G}$ be the set of nonsingular matrices $\mathbf{G} \in \mathbb{R}^{2N \times 2N}$ s.t.

$$\mathbf{G} = \begin{pmatrix} \mathbf{G}_1 & -\mathbf{G}_2 \\ \mathbf{G}_2 & \mathbf{G}_1 \end{pmatrix},$$

where $\mathbf{G}_1, \mathbf{G}_2 \in \mathbb{R}^{N \times N}$. Let $\mathcal{S}$ be the set of all $2N \times 2N$ real symmetric positive definite matrices. According to the test formulation (3) and condition (2), the null hypothesis $H_0$ is equivalent to $\mathbf{C} \in \mathcal{T} = \mathcal{S} \cap \mathcal{G}$.

As explained in [10], $\mathcal{G}$ is a group (isomorphic to the group $GL_N(\mathbb{C})$ of nonsingular $N \times N$ complex matrices under the mapping $\mathbf{G} \leftrightarrow \mathbf{G}_1 + i\mathbf{G}_2$). Moreover $\mathcal{G}$ acts transitively on $\mathcal{T}$ under the action $(\mathbf{G}, \mathbf{T}) \in \mathcal{G} \times \mathcal{T} \mapsto \mathbf{G}\mathbf{T}\mathbf{G}^T \in \mathcal{T}$. Thus, a parametric characterization of $H_0$ should be invariant to this group action: the value of the parameters to be tested should be the same for $\mathbf{C}$ and $\mathbf{G}\mathbf{C}\mathbf{G}^T$ for any $\mathbf{G} \in \mathcal{G}$.

We introduce the following decomposition $\mathbf{C} = \dot{\mathbf{C}} + \ddot{\mathbf{C}}$ for any $\mathbf{C} \in \mathcal{S}$, where

$$\dot{\mathbf{C}} = \frac{1}{2} \begin{pmatrix} \mathbf{C}_{uu} + \mathbf{C}_{vv} & \mathbf{C}_{uv} - \mathbf{C}_{vu} \\ \mathbf{C}_{vu} - \mathbf{C}_{uv} & \mathbf{C}_{uu} + \mathbf{C}_{vv} \end{pmatrix} \in \mathcal{G}, \qquad \ddot{\mathbf{C}} = \frac{1}{2} \begin{pmatrix} \mathbf{C}_{uu} - \mathbf{C}_{vv} & \mathbf{C}_{uv} + \mathbf{C}_{vu} \\ \mathbf{C}_{uv} + \mathbf{C}_{vu} & \mathbf{C}_{vv} - \mathbf{C}_{uu} \end{pmatrix}.$$

Lemma 2.1. Any matrix $\mathbf{C} \in \mathcal{S}$ can be written as:

$$\mathbf{C} = \mathbf{G} \begin{pmatrix} \mathbf{I}_N + \mathbf{D}_\lambda & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_N - \mathbf{D}_\lambda \end{pmatrix} \mathbf{G}^T,$$

where $\mathbf{G} \in \mathcal{G}$, $\mathbf{I}_N$ is the $N \times N$ identity matrix and $\mathbf{D}_\lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$ is an $N \times N$ diagonal matrix. The diagonal entries of $\mathbf{D}_\lambda$, denoted as $\lambda_i$ for $1 \le i \le N$, are the non-negative eigenvalues of the following $2N \times 2N$ real symmetric matrix

$$\Gamma(\mathbf{C}) = \dot{\mathbf{C}}^{-\frac{1}{2}} \, \ddot{\mathbf{C}} \, \dot{\mathbf{C}}^{-\frac{1}{2}}.$$

They satisfy the following properties: 1) $\lambda_i$ and $-\lambda_i$, for $1 \le i \le N$, form the set of eigenvalues of the $2N \times 2N$ matrix $\Gamma(\mathbf{C})$, and 2) $\lambda_i \in [0, 1]$ with, by convention, the following ordering $1 \ge \lambda_1 \ge \ldots \ge \lambda_N \ge 0$.

Proof. See [10, lemma 5.1 and 5.2].

Lemma 2.1 shows that any invariant parameterization of the covariance matrix $\mathbf{C}$ for the group action of $\mathcal{G}$ depends only on the $N$ non-negative eigenvalues $1 \ge \lambda_1 \ge \ldots \ge \lambda_N \ge 0$. Thus these eigenvalues are termed maximal invariant parameters [11, Chapter 6]. Moreover, under the null hypothesis $H_0$, it follows that $\lambda_1 = \ldots = \lambda_N = 0$, as $\ddot{\mathbf{C}}$ reduces to the zero matrix according to (2). Within the invariant parameterization, the testing problem in (3) becomes

$$\begin{cases} H_0 : \lambda_i = 0, & \text{for } 1 \le i \le N, \\ H_1 : \lambda_i \ge 0, & \text{for } 1 \le i \le N. \end{cases} \tag{4}$$

Note that the invariance property ensures that the test does not depend on the (common) representation basis of the real and imaginary parts of $\mathbf{z}$, i.e. vectors $\mathbf{u}$ and $\mathbf{v}$.

2.3. Invariant statistics

Consider a sample of size $M$, denoted $\mathcal{X} = \{\mathbf{x}_m\}_{m=1}^M$, where $\mathbf{x}_m = [\mathbf{u}_m^T, \mathbf{v}_m^T]^T$ are $2N$-dimensional i.i.d. Gaussian real vectors with zero mean and covariance matrix $\mathbf{C}$. In the Gaussian framework, a sufficient statistic is given by the $2N \times 2N$ sample covariance matrix

$$\mathbf{S} = \begin{pmatrix} \mathbf{S}_{uu} & \mathbf{S}_{uv} \\ \mathbf{S}_{vu} & \mathbf{S}_{vv} \end{pmatrix}$$

with $\mathbf{S}_{ab} \in \mathbb{R}^{N \times N}$ the real-valued sample (cross)covariance matrix of real vectors $\{\mathbf{a}_m\}_{m=1}^M$ and $\{\mathbf{b}_m\}_{m=1}^M$ such that $\mathbf{S}_{ab} = \frac{1}{M} \sum_{m=1}^M \mathbf{a}_m \mathbf{b}_m^T$. We assume here that $M \ge 2N$, thus $\mathbf{S}$ belongs to the set $\mathcal{S}$ of real symmetric positive definite matrices. According to the previous section, since $H_0$ is invariant under the action of the group $\mathcal{G}$, an invariant test statistic must only depend on the $N$ non-negative eigenvalues $l_i$, $1 \le i \le N$, of $\Gamma(\mathbf{S}) = \dot{\mathbf{S}}^{-\frac{1}{2}} \ddot{\mathbf{S}} \dot{\mathbf{S}}^{-\frac{1}{2}}$. These sample eigenvalues obey $1 \ge l_1 \ge \ldots \ge l_N \ge 0$ according to lemma 2.1, and are an estimate of the true eigenvalues $\lambda_i$ obtained from the population covariance $\mathbf{C}$. Note that the $\lambda_i$ are zero under the null hypothesis $H_0$, and non-negative otherwise. As a consequence, the distribution of the $l_i$ should be stochastically greater under $H_1$ than under $H_0$. Any invariant test can be derived from this property. A key point in deriving a tractable statistical test procedure is to characterize the null distribution of these eigenvalues.

Let $B_N(\frac{1}{2} n_1, \frac{1}{2} n_2)$ denote the $N \times N$-dimensional matrix variate beta distribution with parameters $n_1$ and $n_2$ as defined for instance in [12, definition 3.3.2, p. 110].

Proposition 2.1.1. Under $H_0$, the vector $(r_1, \ldots, r_N)$ of the squared sample eigenvalues $r_n = l_n^2$ is distributed as the eigenvalues of the matrix variate beta distribution $B_N(\frac{1}{2} n_1, \frac{1}{2} n_2)$, with parameters $n_1 = N + 1$ and $n_2 = M - N$. Moreover, the joint pdf of $(r_1, \ldots, r_N)$ is expressed as:

$$p(r_1, \ldots, r_N) \propto \prod_{n=1}^N (1 - r_n)^{(M-2N-1)/2} \prod_{k<n} (r_k - r_n). \tag{5}$$

Proof. [...]

$$p(l_1, \ldots, l_N) \propto \prod_{n=1}^N (2 l_n)(1 - l_n^2)^{(M-2N-1)/2} \prod_{k<n} (l_k^2 - l_n^2).$$

A simple change of variables yields the pdf of $(r_1, \ldots, r_N)$ given in (5). Moreover, according to [12, Theorem 3.3.4, p. 112], (5) is the pdf of the eigenvalues of the matrix variate beta distribution $B_N(\frac{N+1}{2}, \frac{M-N}{2})$, which concludes the proof.

3. GENERALIZED LIKELIHOOD RATIO TEST

3.1. Expression of the GLRT statistic

A very classical procedure to test for improperness is obtained from the Generalized Likelihood Ratio Test (GLRT) statistic defined as:

$$T \propto \frac{\sup_{\mathbf{C} \text{ s.t. } H_0} p(\mathcal{X}; \mathbf{C})}{\sup_{\mathbf{C} \text{ s.t. } H_1} p(\mathcal{X}; \mathbf{C})},$$

where $p(\mathcal{X}; \mathbf{C})$ is the multivariate normal pdf of the sample $\mathcal{X}$ composed of $M$ i.i.d. $2N$-dimensional real Gaussian vectors with zero mean and covariance matrix $\mathbf{C}$. Under $H_1$, $\mathbf{C} \in \mathcal{S}$ is a symmetric positive definite matrix. It is well known that its maximum likelihood (ML) estimate is the sample covariance $\mathbf{S}$. Under $H_0$, it follows that $\mathbf{C} = \dot{\mathbf{C}}$ as $\mathbf{C} \in \mathcal{T}$. Then the ML estimate of $\mathbf{C}$ under $H_0$ reduces to $\dot{\mathbf{S}}$, as shown for instance in [10]. The GLRT statistic is then expressed as:

$$\begin{aligned} T &= |\mathbf{S}| / |\dot{\mathbf{S}}| = |\dot{\mathbf{S}}^{\frac{1}{2}} (\mathbf{I}_{2N} + \Gamma(\mathbf{S})) \dot{\mathbf{S}}^{\frac{1}{2}}| / |\dot{\mathbf{S}}| = |\mathbf{I}_{2N} + \Gamma(\mathbf{S})|, \\ &= \prod_{n=1}^N (1 + l_n)(1 - l_n) = \prod_{n=1}^N (1 - r_n), \end{aligned} \tag{6}$$

where the first equality in the first line is due to the Gaussian pdf expression, the second equality comes from the decomposition $\mathbf{S} = \dot{\mathbf{S}} + \ddot{\mathbf{S}}$ and the definition of $\Gamma(\mathbf{S})$, the first equality in the second line comes from lemma 2.1, and where $r_n = l_n^2$, $1 \le n \le N$, are the squared sample eigenvalues. As explained in the previous section, it is important to note that the GLRT is invariant: the resulting statistic given in (6) only depends on the eigenvalues of $\Gamma(\mathbf{S})$.

3.2. Distribution under the hypothesis H0 of properness

Theorem 3.1. Under $H_0$, the GLRT statistic $T$ follows a Wilks's lambda distribution:

$$T \sim \Lambda(N, M - N, N + 1).$$

Moreover this statistic can be expressed under $H_0$ as

$$T = \prod_{n=1}^N u_n, \tag{7}$$

where the $u_n$ are independent beta-distributed random variables such that $u_n \sim \mathcal{B}\left(\frac{M-N-n+1}{2}, \frac{N+1}{2}\right)$, for $1 \le n \le N$.

Proof. According to Prop. 2.1.1, the $r_n$ in (6) are distributed as the eigenvalues of the matrix variate beta distribution $B_N(\frac{1}{2} n_1, \frac{1}{2} n_2)$ with parameters $n_1 = N + 1$ and $n_2 = M - N$. Using the mirror property of the beta distribution, it follows that the $(1 - r_n)$ are distributed as the eigenvalues of the matrix $\mathbf{U} \sim B_N(\frac{1}{2} n_2, \frac{1}{2} n_1)$. According now to [12, Theorem 3.3.3, p. 110], $\mathbf{U}$ can be decomposed as $\mathbf{U} = \Theta^T \Theta$ where $\Theta$ is upper triangular with diagonal entries $\theta_{nn}$ that are independent and where $u_n \equiv \theta_{nn}^2 \sim \mathcal{B}\left(\frac{n_2 - n + 1}{2}, \frac{n_1}{2}\right)$ for $1 \le n \le N$. Since $T = \prod_{n=1}^N (1 - r_n) = |\mathbf{U}| = \prod_{n=1}^N \theta_{nn}^2$, this concludes the proof.

Equation (7) also gives a more efficient way to sample from the null distribution of $T$, in $O(N)$ independent draws. This does not require generating the $2N \times 2N$ sample covariance matrix $\mathbf{S}$, nor computing the eigenvalues of $\Gamma(\mathbf{S})$.
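As an illustration of Eqs. (6) and (7), the following sketch compares two ways of simulating the statistic T under H0: directly from proper complex Gaussian samples via T = |S|/|Ṡ|, and as a product of independent beta variables. It is a minimal NumPy sketch, not the authors' implementation; the function names `glrt_statistic`, `sample_null_T` and `simulate_proper_T`, the seed and the sizes M = 40, N = 4 are assumptions chosen for the example.

```python
import numpy as np


def glrt_statistic(z):
    """GLRT statistic T = |S| / |S_dot| of Eq. (6) for an (M, N) complex sample z."""
    M, N = z.shape
    x = np.hstack([z.real, z.imag])        # row m is x_m^T = [u_m^T, v_m^T]
    S = x.T @ x / M                        # sample covariance, zero mean assumed
    Suu, Suv = S[:N, :N], S[:N, N:]
    Svu, Svv = S[N:, :N], S[N:, N:]
    # S_dot has the same block structure as C_dot in Section 2.2
    S_dot = 0.5 * np.block([[Suu + Svv, Suv - Svu],
                            [Svu - Suv, Suu + Svv]])
    # Log-determinants avoid overflow/underflow for larger N
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_S_dot = np.linalg.slogdet(S_dot)
    return float(np.exp(logdet_S - logdet_S_dot))


def sample_null_T(M, N, size, rng):
    """Draw `size` realizations of T under H0 as the product of betas in Eq. (7)."""
    n = np.arange(1, N + 1)
    a = (M - N - n + 1) / 2.0              # first beta parameter, one value per n
    b = (N + 1) / 2.0                      # second beta parameter, common to all n
    u = rng.beta(a, b, size=(size, N))     # independent u_n, one row per draw
    return u.prod(axis=1)


def simulate_proper_T(M, N, size, rng):
    """Direct simulation: T computed from proper complex Gaussian samples."""
    out = np.empty(size)
    for k in range(size):
        z = (rng.standard_normal((M, N))
             + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
        out[k] = glrt_statistic(z)
    return out


rng = np.random.default_rng(0)             # arbitrary seed (assumption)
M, N = 40, 4                               # example sizes with M >= 2N (assumption)
t_beta = sample_null_T(M, N, 20000, rng)
t_direct = simulate_proper_T(M, N, 2000, rng)
# Both Monte Carlo estimates of E[-ln T] should agree up to sampling error
print(np.mean(-np.log(t_beta)), np.mean(-np.log(t_direct)))
```

The beta-product route draws only N scalars per realization, whereas the direct route builds and factorizes a 2N × 2N matrix each time, which is the efficiency gain described above.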

3.3. High-dimensional asymptotic distribution under H0

The characterization given in (7) allows us to derive, under the null hypothesis $H_0$, an asymptotic distribution for the GLRT statistic $T$ in the high dimensional (i.e. large $N$) case. This yields a simple tractable closed form approximation of the considered Wilks's lambda distribution when both the dimension $N$ and the sample size $M$ are large.

Theorem 3.2. Let $T' = -\ln T$ where $T$ is the GLRT statistic given in (6). Assume that $M, N \to \infty$ so that the ratio $M/N \to \gamma \in (2, +\infty)$. Under $H_0$, the following asymptotic distribution is obtained for $T'$:

$$\frac{1}{s_M} (T' - m_M) \xrightarrow{d} \mathcal{N}(0, 1)$$

where

$$m_M = M \left( \ln \frac{\gamma}{\gamma - 1} + \frac{\gamma - 2}{\gamma} \ln \frac{\gamma - 2}{\gamma - 1} \right) + \frac{1}{2} \ln \frac{\gamma}{\gamma - 2},$$

$$s_M^2 = 2 \left( \ln \frac{(\gamma - 1)^2}{\gamma(\gamma - 2)} + \frac{1}{M} \, \frac{1}{\gamma - 2} \right).$$

Proof. According to theorem 3.1, $T' = \sum_{n=1}^N \zeta_n$ where the $\zeta_n$ are independent random variables such that $\zeta_n = -\ln u_n$ with $u_n \sim \mathcal{B}\left(\frac{M-N-n+1}{2}, \frac{N+1}{2}\right)$ for $1 \le n \le N$. Write $a_n = \frac{M-N-n+1}{2}$ and $b = \frac{N+1}{2}$ for the two beta parameters. Based on the centered moments of a logarithmically transformed beta-distributed variable as given in [14], it follows that $E[\zeta_n] = \psi(a_n + b) - \psi(a_n)$ where $\psi(\cdot)$ is the digamma function, and $\mathrm{var}[\zeta_n] = \psi_1(a_n) - \psi_1(a_n + b)$ where $\psi_1(\cdot)$ is the trigamma function. Using asymptotic expansions of the digamma and trigamma functions, straightforward computations, omitted here for the sake of brevity, yield that $E[T'] = \sum_{n=1}^N E[\zeta_n] = m_M + O(1/M)$ and $\mathrm{var}(T') = \sum_{n=1}^N \mathrm{var}(\zeta_n) = s_M^2 + O(1/M^2)$.

In order to apply the Lyapunov central limit theorem [15, p. 362] to $T' = \sum_{n=1}^N \zeta_n$, it is sufficient to show that

$$\frac{1}{\mathrm{var}(T')^2} \sum_{n=1}^N E\left[ (\zeta_n - E[\zeta_n])^4 \right] \to 0.$$

The expression of the fourth order centered moment of $\zeta_n$ gives that $E\left[ (\zeta_n - E[\zeta_n])^4 \right] = O\left( 1/(M - n + 2)^2 \right)$ for $1 \le n \le N$. As $\mathrm{var}(T') = s_M^2 + O(1/M^2) = O(1)$, the previous Lyapunov sufficient condition holds, and

$$Z \equiv \frac{1}{\sqrt{\mathrm{var}(T')}} \sum_{n=1}^N (\zeta_n - E[\zeta_n]) \xrightarrow{d} \mathcal{N}(0, 1).$$

By noting finally that $\frac{1}{s_M}(T' - m_M) = Z + O(1/M)$, Slutsky's theorem allows us to conclude the proof.

4. SIMULATION RESULTS

Several simulations have been conducted to assess the accuracy of the asymptotic null distribution given in Thm. 3.2 with respect to 1) the sample size $M$ and 2) the dimension $N$, or equivalently the ratio $\gamma = M/N$. This approximation is also compared with the classical Bartlett one derived for Wilks's lambda distribution [13, p. 94]. This gives, when the dimension $N$ is fixed while $M$ goes to infinity, the same asymptotic distribution as obtained in [9]:

$$-(M - N) \ln T \xrightarrow{d} \chi^2_{N(N+1)}, \tag{8}$$

where $\chi^2_{N(N+1)}$ denotes the chi-squared distribution with $N(N+1)$ degrees of freedom.

[Figure 1: nine probability-probability panels. (a) γ = 10, M = 10; (b) γ = 10, M = 100; (c) γ = 10, M = 1000; (d) γ = 5, M = 10; (e) γ = 5, M = 100; (f) γ = 5, M = 1000; (g) γ = 2.5, M = 10; (h) γ = 2.5, M = 100; (i) γ = 2.5, M = 1000.]

Fig. 1: Comparison of asymptotic approximations for the GLRT statistics $T$: $\Pr(T > q_\alpha)$ under $H_0$ vs the nominal control level $\alpha$ in $[10^{-3}, 5 \times 10^{-1}]$, where $q_\alpha$ is the $1 - \alpha$ quantile either for the log-normal approximation given in theorem 3.2, shown in solid blue line, or the Bartlett approximation (8), shown in dashed red line. (a)-(c) are for $\gamma = M/N = 10$ and $M = 10, 100, 1000$ respectively, (d)-(f) are likewise for $\gamma = M/N = 5$ and (g)-(i) for $\gamma = M/N = 2.5$. The black dash-dotted line represents the $y = x$ values.

Fig. 1 depicts, for different values of $M$ and $\gamma$, a probability-probability plot of the theoretical null distribution of $T$ against each one of these asymptotic approximations. A departure from the $y = x$ line indicates a difference between the theoretical and the asymptotic distributions. This shows that, as expected, in the high-dimensional setting (e.g., $\gamma \le 5$) and/or for large sample sizes (e.g., $M \ge 1000$), the asymptotic distribution that we derived becomes very accurate and much better than the Bartlett one.

5. REFERENCES

[1] P. J. Schreier and L. L. Scharf, Statistical Signal Processing of Complex-Valued Data: The Theory of Improper and Noncircular Signals, Cambridge University Press, 2010.

[2] A. M. Sykulski, S. C. Olhede, and J. M. Lilly, "A widely linear complex autoregressive process of order one," IEEE Transactions on Signal Processing, vol. 64, no. 23, pp. 6200-6210, 2016.

[3] A. M. Sykulski, S. C. Olhede, J. M. Lilly, and J. J. Early, "Frequency-domain modeling of stationary bivariate or complex-valued signals," IEEE Transactions on Signal Processing, vol. 65, no. 12, pp. 3136-3151, 2017.

[4] J. Flamant, P. Chainais, E. Chassande-Mottin, F. Feng, and N. Le Bihan, "Non-parametric characterization of gravitational-wave polarizations," in XXV European Signal Processing Conference (EUSIPCO), Roma, Italy, 2018.

[5] B. Picinbono, "On circularity," IEEE Transactions on Signal Processing, vol. 42, no. 12, pp. 3473-3482, 1994.

[6] P.-O. Amblard, M. Gaeta, and J.-L. Lacoume, "Statistics for complex variables and signals, Part I," Signal Processing, vol. 53, pp. 1-13, 1996.

[7] P.-O. Amblard, M. Gaeta, and J.-L. Lacoume, "Statistics for complex variables and signals, Part II," Signal Processing, vol. 53, pp. 15-25, 1996.

[8] P. J. Schreier, L. L. Scharf, and A. Hanssen, "A generalized likelihood ratio test for impropriety of complex signals," IEEE Signal Processing Letters, vol. 13, no. 7, pp. 433-436, 2006.

[9] A. T. Walden and P. Rubin-Delanchy, "On testing for impropriety of complex-valued Gaussian vectors," IEEE Transactions on Signal Processing, vol. 57, no. 3, pp. 825-834, 2009.

[10] S. A. Andersson and M. D. Perlman, "Two testing problems relating the real and complex multivariate normal distributions," Journal of Multivariate Analysis, vol. 15, pp. 21-51, 1984.

[11] E. L. Lehmann and J. P. Romano, Testing Statistical Hypotheses, Springer Texts in Statistics. Springer, New York, third edition, 2005.

[12] R. J. Muirhead, Aspects of Multivariate Statistical Theory, Wiley-Interscience, 2005.

[13] K. V. Mardia, J. T. Kent, and J. M. Bibby, Multivariate Analysis, Academic Press, London; New York, 1979.

[14] S. Nadarajah and S. Kotz, "The beta exponential distribution," Reliability Engineering & System Safety, vol. 91, no. 6, pp. 689-697, 2006.

[15] P. Billingsley, Probability and Measure, Wiley Series in Probability and Statistics. Wiley, 1995.
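The normal approximation of Theorem 3.2 (Section 3.3) can be checked numerically against draws from the exact null distribution given by Eq. (7). The sketch below is a minimal illustration under stated assumptions, not the simulation code used for Fig. 1: the function names `asymptotic_moments` and `null_minus_log_T`, the sizes M = 1000, N = 200 (so that γ = 5), the seed and the level α = 0.05 are all chosen for the example, and γ is taken as the finite-sample ratio M/N.

```python
import math
from statistics import NormalDist

import numpy as np


def asymptotic_moments(M, N):
    """Centering m_M and scale s_M of Theorem 3.2, with gamma = M / N > 2."""
    g = M / N
    m = (M * (math.log(g / (g - 1))
              + (g - 2) / g * math.log((g - 2) / (g - 1)))
         + 0.5 * math.log(g / (g - 2)))
    s2 = 2.0 * (math.log((g - 1) ** 2 / (g * (g - 2))) + 1.0 / (M * (g - 2)))
    return m, math.sqrt(s2)


def null_minus_log_T(M, N, size, rng):
    """Draw T' = -ln T under H0 using the beta product of Eq. (7)."""
    n = np.arange(1, N + 1)
    u = rng.beta((M - N - n + 1) / 2.0, (N + 1) / 2.0, size=(size, N))
    return -np.log(u).sum(axis=1)


rng = np.random.default_rng(0)             # arbitrary seed (assumption)
M, N = 1000, 200                           # example sizes, gamma = 5 (assumption)
alpha = 0.05                               # nominal level (assumption)

t_prime = null_minus_log_T(M, N, 10000, rng)
m_M, s_M = asymptotic_moments(M, N)
z = (t_prime - m_M) / s_M                  # standardized statistic of Theorem 3.2

# Reject for large T' (small T); threshold from the standard normal quantile
threshold = NormalDist().inv_cdf(1 - alpha)
empirical_level = float(np.mean(z > threshold))
print(z.mean(), z.std(), empirical_level)
```

If the approximation is accurate for the chosen sizes, the standardized values should have mean close to 0, standard deviation close to 1, and an empirical rejection rate close to α. The Bartlett approximation (8) is not included here because its chi-squared quantile would require a library beyond NumPy and the standard library.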