Publication VIII

Jan Eriksson, Esa Ollila, and Visa Koivunen. 2009. for complex random variables revisited. In: Proceedings of the 34th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009). Taipei, Taiwan. 19•24 April 2009, pages 3565•3568.

© 2009 Institute of Electrical and Electronics Engineers (IEEE)

Reprinted, with permission, from IEEE.

This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of Aalto University School of Science and Technology's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs•[email protected].

By choosing to view this document, you agree to all provisions of the copyright laws protecting it. STATISTICS FOR COMPLEX RANDOM VARIABLES REVISITED

Jan Eriksson∗†, Esa Ollila∗, and Visa Koivunen

Helsinki University of Technology, SMARAD CoE, Signal Processing and Acoustics, P.O. Box 3000, FIN-02015 TKK, Finland {jamaer,esollila,visa}@wooster.hut.fi

ABSTRACT 2. COMPLEX FIELD AND FUNCTIONS

Complex random signals play an increasingly important role in ar- Let R denote the set of real numbers. The set of complex numbers, 2 ray, communications, and biomedical signal processing and related denoted by C, is the plane R × R = R equipped with complex fields. However, the mathematical foundations of complex-valued addition and complex multiplication making it the complex field. The signals and tools developed for handling them are scattered in litera- complex conjugate of z =(x, y)=x + jy ∈ C is defined as ∗ ture. There appears to be a need for a concise, unified, and rigorous z  (x, −y)=x − jy. With this notation we can write the real treatment of such topics. In this paper such a treatment is provided. part and the imaginary part of a complex number z as Re(z)  Moreover, we establish connections between seemingly unrelated 1 ∗  j ∗ − x = 2 (z + z ) and Im(z) y = 2 (z z), respectively. The objects such as real differentiability and circularity. In addition, a modulusp of z = x +√jy is defined as the nonnegative real number novel complex-valued extension of Taylor series is presented and a |z|  x2 + y2 = zz∗. measure for circularity is proposed. The complex exponential, denoted by exp(z), is defined as Index Terms— complex random variables, complex moments, the complex number exp(z)  exp(x){cos(y)+j sin(y)}, R-differentiability, circularity, Taylor’s series where exp(x) for real-valued x denotes the usual exponential function. Any nonzero complex number has a polar represen- tation, z = |z| exp(jθ), where θ = arg(z) is called the argu- 1. INTRODUCTION ment of z. The unique argument θ = Arg(z) on the interval −π ≤ θ<πis called the principal argument. The complex loga- As the applications of complex random signals are becoming in- rithm of z =0 , denoted by log(z), is defined as the complex number creasingly more advanced, it is evident that simplistic adaptation of log(z)  log |z| + jArg(z)+j2nπ where n is an arbitrary integer. techniques developed for real-valued signals to the complex-valued The particular value of the logarithm given by log |z| + jArg(z) is case may not be adequate, or may lead to suboptimal results or in- called the principal logarithm and will be denoted by Log(z). tractable calculations. This issue is now widely recognized in the The open disk with center c = a + jb ∈ C and radius r>0 signal processing research community. Unfortunately, the obtained is defined as B(c, r)  {z ∈ C : |z − c| 0 such that B(c, r) ⊂U.Afunction f of the complex valued random signals and related processing tools variable z = x + jy is a rule that assigns to each value z in U We introduce a novel complex-valued extension of the Taylor one and only one complex number w = u + jv  f(z). The real series, we establish relationship between the moments, characteristic and imaginary part of the function f(z) are real-valued functions of functions and cumulants, and finally we propose a novel measure real variables x and y, i.e., u = u(x, y)  u(z):U→R and of circularity. For the recent work in the area of complex valued v = v(x, y)  v(z):U→R. random signals, see, e.g., [1, 2, 3] and references therein. Because C is an R-vector space as well as a C-vector space, The rest of this paper is organized as follows. In Section 2 there are two kinds of linear functions. A function L : C → C is the preliminaries of the complex field are considered, and the linear called R-linear,ifL(az1 +bz2)=aL(z1)+bL(z2) for all z1,z2 ∈ maps are revisited. Linearity considerations lead to two types of dif- C, and scalars a, b ∈ R. If the defining equation is valid also for ferention in the complex domain. This is treated in Section 3, where all complex scalars a, b ∈ C, then L is C-linear. For example, the ∗ it is also derived a generalization of the standard complex Taylor’s complex conjugation z → z is R-linear but not C-linear. series along with important specific cases. In Section 4 complex It is easily seen that a function L is R-linear if and only if ∗ random variables are introduced, and the existence properties of the L(z)=αz + βz , α, β ∈ C, and a function T is C-linear if and fundamental expectation operator are derived. It is shown in Sec- only T (z)=αz. This result can be generalized to multivariate tion 5 how the results form the previous sections lead to derivation mappings Cn → Cp, and the form of R-linear mappings remains and characterization of complex quantities such as the same: it is the sum of two C-linear mappings, the first one acting moments, cumulants, and circularity. Finally, Section 6 concludes on the argument of the function and the latter on the conjugate of the paper. Due to the lack of space we have omitted some of the the argument. Therefore, the well-known estimation technique in proofs, which are available from the authors by a request. signal processing known as widely linear estimation [4] is in fact estimation with respect to R-linear estimators instead of the C-linear ∗ The two first authors have equally contributed to the technical content ones used in standard linear estimation of complex data. Hence, we of the paper. † This work has been supported by the Academy of Finland. prefer the name R-linear estimation for the technique.

978-1-4244-2354-5/09/$25.00 ©2009 IEEE 3565 ICASSP 2009 m+1 3. CR-CALCULUS AND TAYLOR’S THEOREM Theorem 1 (Taylor’s R-theorem) Assume that f ∈ C (U). Then The essential idea of differentiation is linearization of a function around a point. Since there two types of linear mappings in C, f(c + h) − f(c) there are also two types of differentiations. The less restrictive is Xm Xp ∗n p−n p h h ∂ f m (2) obtained by linearization with respect to R-linear functions. A func- = (c)+|h| ε(h) n!(p − n)! ∂zp−n∂z∗n tion f : U→C is said to be R-differentiable at c ∈U, if there exists p=1 n=0 an R-linear function L : C → C, called R-differential, such that and lim|h|→0 ε(h)=0. f(c + h) − f(c)=L(h)+|h|ε(h) and lim ε(h)=0. |h|→0 An infinitenely continuously differentiable function f is real ana- lytic, if the power series in (2) converges. The function f is R-differentiable,ifitisR-differentiable at every ∂f An import special case occurs when ∗ ≡ . This condition is ∈U C C ∂z 0 point c . Now the differential L is -linear if and only if f is - easily seen to be equivalent to Cauchy-Riemann equations. There- C differentiable, i.e., f has the usual complex derivative ( -derivative) fore, f is holomorphic, infinitely C-differentiable, and the power defined as the differential quotient limit. series (2) converges, i.e., f is complex analytic. Taylor’s series takes Functions that are C-differentiable in a domain are called holo- − Pnow the usual form from the complex analysis: f(c + h) f(c)= morphic, and they are the subject of the standard complex analy- ∞ hp ∂pf C p=1 p! ∂zp (c). sis. However, -differentiability is too stringent condition for the R application domain targeted in this paper. Namely, many functions Another special case occurs when f is -differentiable at c = 0, and further independent of Arg(z). Now, since f can be writ- related to random variables and optimization are real-valued, and 2 ten as a function of |z| alone, it follows from the chain rules that only real-valued holomorphic functions are constants. Hence, we p+q ∂ f  ∈ 2m+1 U are mainly interested in developing a useful framework around R- ∂zpz∗q (0) = 0 whenever p = q. Hence, if f C ( ), then differentiation. Eq. (2) reduces to

Now the existence of the (unique) differential L(h) implies that m ∂f X 2p  ∂u ∂v |h| p 2m the complex partial derivatives (cpd) ∂x (c) ∂x (c)+j ∂x (c) and f(h) − f(0) = Δ f(0) + |h| ε(h), (3) ∂f  ∂u ∂v (p!)24p ∂y (c) ∂y (c)+j ∂y (c) exist at c. Furthermore, p=1

∂f ∂f  ∂ ∂f ∂ ∂f L(h)= (c)Re(h)+ (c)Im(h) where the differential polynomial Δ 4 ∂z∗ ∂z =4∂z ∂z∗ = ∂x ∂y ∂2 ∂2 (1) ∂x2 + ∂y2 is known [6] as the Laplace operator playing a crucial ∂f ∂f ∗ = (c)h + (c)h , role in analysis of harmonic (or potential) functions. ∂z ∂z∗ where the mixed cpd’s 4. COMPLEX RANDOM VARIABLES AND THE   EXPECTATION ∂f 1 ∂f ∂f (c)  (c) − j (c) ∂z 2 ∂x ∂y A complex random variable (r.v.) is defined by Z  X + jY , where X and Y are real r.v.’s. The distribution of a complex r.v. Z is and   identified with the joint (real bivariate) distribution of X and Y , ∂f  1 ∂f ∂f ∗ (c) (c)+j (c) ∂z 2 ∂x ∂y FZ (z)=P (Z ≤ z)  P (X ≤ x, Y ≤ y)=F(X,Y )(x, y), are called [5] the R-derivative and the conjugate R-derivative, respectively. The differential calculus based on these operators is where the functions FZ (z) and F(X,Y )(x, y) are the cumulative known as Wirtinger calculus [6, 1], or, as we prefer, the CR-calculus distribution function (cdf) of the complex r.v. Z and the joint ∗ [5]. The variables z and z are treated as though they were indepen- distribution function of the bivariate random vector (X, Y ), re- dent variables, and the usefulness of these operators stem from the spectively. The density function (pdf), if it exists,  fact that they follow formally the same sum, product, and quotient is the nonnegativeR functionR fZ (z) f(X,Y )(x, y) such that y x ∈ R2 rules as the ordinary partial differentiation. The chain rules read as F(X,Y )(x, y)= −∞ −∞ f(X,Y )(x, y)dxdy for all (x, y) . 2 2 ¡ ∂2F follows: ∂ FZ − ∂ FZ (X,Y ) Then 2j ∂z2 (z) ∂z∗2 (z) = ∂x∂y (x, y) exists and equals ∗ ◦ ¡ ¡ fZ (z) almost everywhere. ∂f g ∂f · ∂g ∂f · ∂g ¢ £ (c)= g(c) (c)+ ∗ g(c) (c),  ∂z ∂z ∂z ∂z ∂z ¢The£ mean ¢(or£ the expectation)ofZ, defined as EZ Z ¢ £ ◦ ¡ ¡ ∗ ∂f g ∂f · ∂g ∂f · ∂g EX X ¢+£j EY Y is said to exist if both real expectations EX X ∗ (c)= g(c) ∗ (c)+ ∗ g(c) ∗ (c). ∂z ∂z ∂z ∂z ∂z and EY Y exist, i.e., if the corresponding integrals are defined and finite. It follows from the definition of integration that integrabil- The CR-calculus extends easily to functions with several variables ity¢ and£ absolute integrability are¢ equivalent£ for a real r.v. X, i.e., by the multivariate extension of the differential property (1). The EX X exists if and only if EX |X| exist. Thus, the existence of literature about this multivariate extension is very scarce, however, ¢ £ ¢ £ ¢ £ EZ Z implies the existence of EX |X| and EY |Y | . For a real see [5, 7, 8]. ¢ £ ¢ £ | |≤ | | As with the usual partial derivatives, it is possible to have higher r.v. X it is well-known that EX ¢X £ EX X . The same¢ holds£ | | order R-derivatives. Functions with continuous R-derivatives of or- for complex r.v.’s.¢ Explicitly,£ EZ Z exists if and only if EZ Z m der m in U are denoted by C (U). We are ready to state the main exists, and if EZ f(¢Z) exists£ for¢ a Borel£ measurable function result in this section. f : C → C, then | EZ f(Z) |≤EZ |f(Z)| .

3566 For the characteristic function (cf) we may write Again there are p +1cumulants κn;m of order p, and symmetric ∗ ¢ £ cumulants are redundant in the sense that κn;m = κm;n. Since the  { } ΦZ (z) Φ(X,Y )(x, y)=E(X,Y ) exp j(xX + yY ) existence relations follow from the differentiation, they are the same ¢ £ ¢ £ ∗ j ∗ ∗ as in the case of moments. The symmetric cumulant κn;n is called =EZ exp{jRe(z Z)} =EZ exp{ (z Z + zZ )} , 2 the absolute (or central) cumulant of order 2n. and the second characteristic function (scf) is defined in the neigh- The usefulness of cumulants stems from the fact that the scf is borhood of zero by ΨZ (z)  Log[ΦZ (z)]. additive for independent r.v.’s. That is, if r.v.’s Z1 and Z2 are inde- pendent, then ΨZ1+Z2 (z)=ΨZ1 (z)+ΨZ2 (z). This property is 5. MOMENTS, CUMULANTS, AND CIRCULARITY preserved for the cumulants as the differentiation is a linear opera- tion. A complex r.v. Z has p +1p:th-order moments defined by For the estimation purposes, it is important to know the alge- braic relation between cumulants and moments. This relationship ¢ n ∗m£ αn;m  EZ Z Z , was introduced in [11] (see also [10]) in an ad-hoc manner by di- ∗ rectly substituting Z and Z into a cumulant operator derived for the where n and m are natural numbers such that p = n+m. Symmetric ∗ cumulants of real random vectors. However, the formula holds also moments are redundant in the sense that αm;n = αn;m.IfZ is real for complex r.v.’s, and it can be rigorously proved by extending the all p:th-order moments coincide.¢ £ The q:th-order absolute moment underlying Faa´ di Bruno’s formula [12] to complex R-derivatives.  | |q is defined by βq EZ Z for all rational numbers q. Clearly This also gives the explicit formulas for the relationship between β2n = αn;n. In an analogous fashion, the quantity defined by multivariate complex cumulants and moments. However, we do not ¢ ¢ £ n ¢ £ ∗m£ pursue this issue here. α  EZ {Z − EZ Z } {Z − EZ Z } n;m A complex r.v. Z is circular [13] if for any real α, the r.v.’s ¢ Z = X + jY and exp(jα)Z have the same distribution. In other is known as the p:th-order and β  EZ |Z − ¢ £ £ q words, Z is circular if the joint distribution of X and Y is spherically |q EZ Z as the q:th-order absolute central moment. Again there symmetric. It follows directly from the results in [14] that if a r.v. are p +1different central p:th-order moments. The 2nd-order cen- Z = |Z| exp(jΘ) is circular, then a uniformly on [0, 2π) distributed tral moment β2 is called the variance of Z and denoted by var(Z) r.v. Θ and a positive r.v. |Z| are independent. Furthermore, the cf 2 whereas the 2nd-order central moment α2;0 is the the pseudo- of a circular r.v. depends only on |z| , and therefore, assuming the variance [9] of Z and it is denoted by pvar(Z). It can be shown that moments exist, we can apply the specific case (3) of Taylor’s R-  ≤ if a p:th-order moment exists, then all moments of order p p ex- theorem to the (s)cf of the r.v Z. It follows that all offcentric (i.e., | |≤ ist. Moreover, we have αn;m βn+m. Now there is an important nonabsolute) moments (and cumulants) are identically zero. relationship between moments and cf’s [10] given in the following Since much of the classical signal processing and communica- theorem. tions work in the complex domain has assumed circular r.v.’s, ab- Theorem 2 If a p:th order moment of a complex r.v. Z exists, then solute moments (and cumulants) have dominated the literature until the cf ΦZ is p times continuously differentiable in C, and, moreover, recently. Recent literature, e.g. [3, 2, 1, 15], has focused on a weaker assumption than circularity. Namely, instead of presuming that r.v’s m+n  m+n ¢ £ ∂ ΦZ j n ∗m ∗ are fully circular, one only assumes that the second order offcen- (z)= EZ Z Z exp(jRe(z Z)) ∂zm∂z∗n 2 tric moments do not vanish, i.e., r.v.’s are assumed to improper [9]. ≤ However, since there is no a priori reason to assume that the higher for all natural numbers m, n such that m + n p. Especially, order offcentric moments vanish if the signal is known to be noncir-   m+n m+n cular, also higher order offcentric moments need to be considered for 2 ∂ ΦZ αn;m = m ∗n (0). (4) the optimal performance. For testing circularity, a generalized like- j ∂z ∂z lihood ratio was considered in [16]. Another circularity measure is m+n ∂ ΦZ obtained by calculating the difference between the cf and its circular Conversely, if the partial derivative ∂zm∂z∗n (0) of the order p = approximation (3): m + n exists, then all moments of order ≤ p,ifp is even, and all moments of order

3567 7. REFERENCES m are continuous in U by definition, which also implies that u and v posses m:th-order Taylor series expansion ! [1] H. Li and T. Adalı, “Complex-valued adaptive signal process- Xm Xp p ing using nonlinear functions,” EURASIP Journal on Advances − 1 p p−n n ∂ u u(c + h) u(c)= hx hy p−n n (c) in Signal Processing, vol. 2008, 2008. p=1 p! n=0 n ∂x ∂y m [2] P.J. Schreier, “Bounds on the degree of impropriety of complex + |h| εu(h), ! random vectors,” IEEE Signal Processing Letters, vol. 15, pp. Xm Xp p 1 p p−n n ∂ v 190–193, 2008. v(c + h) − v(c)= hx hy (c) p! n ∂xp−n∂yn [3] E. Ollila, “On the circularity of a complex random variable,” p=1 n=0 m IEEE Signal Processing Letters, vol. 15, pp. 841–844, 2008. + |h| εv(h), p 2 2 [4] B. Picinbono and P. Chevalier, “Widely linear estimation with where εu(h) → 0 and εv(h) → 0 as |h| = hx + hy → 0. complex data,” IEEE Transactions on Signal Processing, vol. Therefore we now have that 43, no. 8, pp. 2030–2033, Aug 1995. f(c + h) − f(c)=[u(c + h) − u(c)] + j[v(c + h) − v(c)] ! [5] K. Kreutz-Delgado, “The complex gradient operator and Xm Xp p the CR-calculus,” Lecture Notes Supplement [online], 2007, 1 p p−n n ∂ (u + jv) = hx hy p−n n (c) http://dsp.ucsd.edu/˜kreutz/. p=1 p! n=0 n ∂x ∂y m [6] R. Remmert, Theory of Complex Functions, Springer, 1991. + |h| [εu(h)+jεv(h)] ! Xm Xp p [7] D.H. Brandwood, “A complex gradient operator and its appli- 1 p p−n ∗n ∂ f | |m cation in adaptive array theory,” IEE Proc.-Pts. F and H, vol. = h h p−n ∗n (c)+ h ε(h), p=1 p! n=0 n ∂z ∂z 130, no. 1, pp. 11–16, Feb 1983.  → | |→ [8] A. van den Bos, “Complex gradient and Hessian,” IEE Pro- where ε εu + jεv 0 as h 0. The last identity follows by a straightforward application of the binomial theorem to the operators¡ ceedings - Vision, Image and Signal Processing, vol. 141, no. ∂ ∂ p ∂ ∗ ∂ p p 6, pp. 380–383, Dec 1994. (hx ∂x +hy ∂y ) and (h ∂z +h ∂z∗ ) . Also observe that n /p!= 1/[n!(p − n)!]. [9] F.D. Neeser and J.L. Massey, “Proper complex random pro- Proof of Theorem 2. For the first part, notice that all absolute mo-  cesses with applications to ,” IEEE Trans- ments of orders p ≤ p exist. Let h = hx + jhy.Now actions on Information Theory, vol. 39, no. 4, pp. 1293–1302, ¢ £ ΦZ (c + hx) − ΦZ (c) exp(jhxX) − 1 ∗ July 1993. =EZ exp(jRe(c Z)) hx hx [10] P.O. Amblard, M. Gaeta, and J.L. Lacoume, “Statistics for and complex variables and signals – Part I: Variables,” Signal Pro- ¢ £ ΦZ (c + jhy) − ΦZ (c) exp(jhyY ) − 1 ∗ cessing, vol. 53, no. 1, pp. 1–13, Aug 1996. =EZ exp(jRe(c Z)) . hy hy [11] J.M. Mendel, “Tutorial on higher-order statistics (spectra) in Since for all c ∈ C, hx ∈ R signal processing and system theory: theoretical results and some applications,” Proceedings of the IEEE, vol. 79, no. 3, exp(jhxx) − 1 ∗ exp(jRe(c Z)) ≤|x|≤|z|, pp. 278–305, Mar 1991. hx [12] W.P. Johnson, “The curious history of Faa´ di Bruno’s formula,” following from the result that | exp(jt) − 1|≤|t| for all t ∈ R, and American Mathematical Monthly, vol. 109, pp. 217–234, Mar similarly for hy ∈ R, it follows from Lebesgue dominated conver- 2002. gence that ¢ £ ∂ΦZ ∗ [13] B. Picinbono, “On circularity,” IEEE Transactions on Signal (c)=j EZ X exp(jRe(c Z)) Processing, vol. 42, no. 12, pp. 3473–3482, Dec 1994. ∂x and ¢ £ [14] K-T Fang, S. Kotz, and K.W. Ng, Symmetric Multivariate and ∂ΦZ ∗ (c)=j EZ Y exp(jRe(c Z)) . Related Distributions, Chapman and Hall, 1990. ∂y [15] J. Eriksson and V. Koivunen, “Complex random vectors Hence, by the definition of the complex partial derivatives   ¢ £ and ICA models: identifiability, uniqueness, and separability,” ∂ΦZ j ∗ ∗ (c)= EZ Z exp(jRe(c Z)) IEEE Transactions on Information Theory, vol. 52, no. 3, pp. ∂z 2 1017–1029, Mar 2006. and   ¢ £ [16] E. Ollila and V. Koivunen, “Generalized complex elliptical ∂ΦZ j ∗ ∗ (c)= EZ Z exp(jRe(c Z)) . distributions,” in Proc. of 2004 Workshop on Sensor Array and ∂z 2 Multichannel Processing, Sitges, Spain, July 2004, pp. 460– The general case follows by induction, and the claim about the mo- 464. ments with the substitution c =0. For the second part of the theorem, the existence of the p:th- order cpd implies the existence of the real partial derivatives of the A. PROOFS order p. Hence, the real moments αk(X) and αk(Y ) exist for k ≤ p if p is even and for k

3568