Research Article Complex-Valued Adaptive Signal Processing Using Nonlinear Functions
Total Page:16
File Type:pdf, Size:1020Kb
Hindawi Publishing Corporation EURASIP Journal on Advances in Signal Processing Volume 2008, Article ID 765615, 9 pages doi:10.1155/2008/765615 Research Article Complex-Valued Adaptive Signal Processing Using Nonlinear Functions Hualiang Li and Tulay¨ Adalı Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250, USA Correspondence should be addressed to Tulay¨ Adalı, [email protected] Received 16 October 2007; Accepted 14 February 2008 Recommended by An´ıbal Figueiras-Vidal We describe a framework based on Wirtinger calculus for adaptive signal processing that enables efficient derivation of algorithms by directly working in the complex domain and taking full advantage of the power of complex-domain nonlinear processing. We establish the basic relationships for optimization in the complex domain and the real-domain equivalences for first- and second- order derivatives by extending the work of Brandwood and van den Bos. Examples in the derivation of first- and second-order update rules are given to demonstrate the versatility of the approach. Copyright © 2008 H. Li and T. Adalı. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. 1. INTRODUCTION back to the complex domain. The approach facilitates the computations but increases the dimensionality of the prob- Most of today’s challenging signal processing applications re- lem and might not be practical for functions that are nonlin- quire techniques that are nonlinear, adaptive, and with on- ear since in this case, the functional form might not be easily line processing capability. Also, there is need for approaches separable to real and imaginary parts. to process complex-valued data as such data arises in a good Another issue that arises in the nonlinear processing number of scenarios, for example, when processing radar of complex-valued data is due to the conflict between the and magnetic resonance data as well as communications data boundedness and differentiability of complex functions. This and when working in a transform domain such as frequency. result is stated by Liouville’s theorem as: aboundedentire Even though complex signals play such an important role, function must be a constant in the complex domain [1]. Hence, many engineering shortcuts have typically been taken in their to use a flexible nonlinear model such as the nonlinear re- treatment preventing full utilization of the power of complex gression model, one cannot identify a complex nonlinear domain processing as well as the information in the real and function (C → C) that is bounded everywhere on the entire imaginary parts of the signal. complex domain. A practical solution to satisfy the bound- The main difficulty arises due to the fact that in the com- edness requirement has been to process the real and imagi- plex domain, analyticity, that is, differentiability in a given nary parts (or the magnitude and phase) separately through open set, as described by the Cauchy-Riemann equations [1] bounded real-valued nonlinearities (see, e.g., [2–6]). The so- imposes a strong structure on the function itself. Thus the lution provides reasonable approximation ability but is an ad analyticity condition is not satisfied for many functions of hoc solution not fully exploiting the efficiency of complex practical interest, most notably for the cost (objective) func- representations, both in terms of parameterization (number tions used as these are typically real valued and hence nonan- of parameters to estimate) and in terms of learning algo- alytic in the complex domain. Definition of pseudogradients rithms to estimate the parameters as we cannot define true are used—and still not through a consistent definition in the gradients when working with these functions. literature—and when having to deal with vector gradients, In this paper, we define a framework that allows tak- transformations CN → R2N are commonly used. These trans- ing full advantage of the power of complex-valued process- formations are isomorphic and allow the use of real-valued ing, in particular when working with nonlinear functions, calculus in the computations, which includes well-defined and eliminates the need for either of the two common engi- gradient and Hessians that can be at the end transformed neering practices we mentioned. The framework we develop 2 EURASIP Journal on Advances in Signal Processing is based on Wirtinger calculus [7] and extends the work of where z = x + jy, is given by the Cauchy-Riemann equations Brandwood [8]andvandenBos[9] to define the basic for- [1]: mulations for derivation of algorithms and their analyses in ∂u ∂v ∂v ∂u the complex domain. We show how the framework also nat- = , =− ,(2) urally admits the use of nonlinear functions that are analytic ∂x ∂y ∂x ∂y rather than the pseudocomplex nonlinear functions defined which summarize the conditions for the derivative to as- using real-valued nonlinearities. Analytic complex nonlinear sume the same value regardless of the direction of approach functions have been shown to provide efficient representa- when Δz → 0.These conditions, when considered carefully, tions in the complex plane [10, 11] and to be universal ap- make it clear that the definition of complex differentiability proximators when used as activation functions in a single- is quite stringent and imposes a strong structure on u(x, y) layer multilayer perceptron (MLP) network [12]. and v(x, y), the real and imaginary parts of the function, The work by Brandwood [8]andvandenBos[9]em- and consequently on f (z). Also, obviously most cost (objec- phasize the importance of working with complex-valued gra- tive) functions do not satisfy the Cauchy-Riemann equations dient and Hessian operators rather than transforming the as these functions are typically f : C → R and thus have problem to the real domain. Both contributions, though not v(x, y) = 0. acknowledged in either of the papers, make use of Wirtinger An elegant approach due to Wirtinger [7] relaxes this calculus [7] that provides an elegant way to bypass the limita- strong requirement for differentiability, and defines a less tion imposed by the strict definition of differentiability in the stringent form for the complex domain. More importantly, complex domain. Wirtinger calculus relaxes the traditional it describes how this new definition can be used for defin- definition of differentiability in the complex domain—which ing complex differential operators that allow computation of we refer to as complex differentiability—by defining a form derivatives in a very straightforward manner in the complex that is much easier to satisfy and includes almost all functions domain, by simply using real differentiation results and pro- of practical interest, including functions that are CN → R. cedures. The attractiveness of the formulation stems from the fact In the development, the commonly used definition of that though the derivatives defined within the framework do differentiability that leads to the Cauchy-Riemann equations not satisfy the Cauchy-Riemann conditions, they obey all the is identified as complex differentiability and functions that rules of calculus, including the chain rule, differentiation of satisfy the condition on a specified open set as complex an- products and quotients. Thus all computations in the deriva- alytic (or complex holomorphic). The more flexible form tion of an algorithm can be carried out as in the real case. We of differentiability is identified as real differentiability,anda provide the connections between the gradient and Hessian function is called real differentiable when u(x, y)andv(x, y) formulations given in [9] described in C2N and R2N to the are differentiable as functions of real-valued variables x and complex CN -dimensional space, and establish the basic rela- y. Then, one can write the two real-variables as x = (z+z∗)/2 tionships for optimization in the complex domain including and y =−j(z − z∗)/2, and use the chain rule to derive first- and second-order Taylor-series expansions. the operators for differentiation given in the theorem below. Three specific examples are given to demonstrate the ap- The key point in the derivation is regarding the two vari- plication of the framework to complex-valued adaptive sig- ables z and z∗ as independent from each other, which is also nal processing, and to show how they enable the use of the the main trick that allows us to make use of the elegance true processing power of the complex domain. The examples of Wirtinger calculus. Hence, we consider a given function include a multilayer perceptron filter design and the deriva- f : C → C as f : R × R → C by writing it as f (z) = f (x, y), tion of the gradient update (backpropagation) rule, indepen- and make use of the underlying R2 structure. The main result dent component analysis using maximum likelihood, and the in this context is stated by Brandwood as follows [8]. derivation of an efficient second-order learning rule, the con- jugate gradient algorithm for the complex domain. Theorem 1. Let f : R × R → C be a function of real variables Next section introduces the main tool, Wirtinger calculus x and y such that g(z, z∗) = f (x, y),wherez = x + jy and for optimization in the complex domain and the key results that g is analytic with respect to z∗ and z independently. Then, given in [8, 9], which we use to establish the main theory (i) the partial derivatives presented in Section 3.InSection 3, we consider both vector and matrix optimization and establish the equivalences for ∂g 1 ∂f ∂f ∂g 1 ∂f ∂f = − j , = + j first- and second-order derivatives for the real and complex ∂z 2 ∂x ∂y ∂z∗ 2 ∂x ∂y case, and provide the fundamental results for CN and CN×M.