A Survey of Complex-Valued Neural Networks

Joshua Bassey, Xiangfang Li, Lijun Qian Center of Excellence in Research and Education for Big Military Data Intelligence (CREDIT) Department of Electrical and Computer Engineering Prairie View A&M University, Texas A&M University System Prairie View, TX 77446, USA Email: [email protected], [email protected], [email protected]

Abstract—Artificial neural networks (ANNs) based machine learning models, and especially deep learning models, have been widely applied in computer vision, signal processing, wireless communications, and many other domains, where complex numbers occur either naturally or by design. However, most of the current implementations of ANNs and machine learning frameworks use real numbers rather than complex numbers. There is growing interest in building ANNs using complex numbers and in exploring the potential advantages of the so-called complex-valued neural networks (CVNNs) over their real-valued counterparts. In this paper, we discuss the recent development of CVNNs by performing a survey of the works on CVNNs in the literature. Specifically, a detailed review of various CVNNs in terms of activation function, learning and optimization, input and output representations, and their applications in tasks such as signal processing and computer vision is provided, followed by a discussion on some pertinent challenges and future research directions.

Index Terms—complex-valued neural networks; complex number; machine learning; deep learning

I. INTRODUCTION

Artificial neural networks (ANNs) are data-driven computing systems inspired by the dynamics and functionality of the human brain. With the advances in machine learning, especially in deep learning, ANN-based deep learning models have gained tremendous usage in many domains and have been tightly fused into our daily lives. Applications such as automatic speech recognition make it possible to have conversations with computers, enable computers to generate speech and musical notes with realistic sounds, and separate a mixture of speech into single audio streams for each speaker [1]. Other examples include object identification and tracking, personalized recommendations, and automating important tasks more efficiently [2].

In many of the practical applications, complex numbers are often used, such as in telecommunications, robotics, bioinformatics, image processing, sonar, radar, and speech recognition. This suggests that ANNs using complex numbers to represent inputs, outputs, and parameters such as weights have potential in these domains. For example, it has been shown that the phase spectra is able to encode fine-scale temporal dependencies [1]. Furthermore, the real and imaginary parts of a complex number have some statistical correlation. By knowing in advance the importance of phase and magnitude to our learning objective, it makes more sense to adopt a complex-valued model, as this offers a more constrained system than a real-valued model [3].

Complex-valued neural networks (CVNN) are ANNs that process information using complex-valued parameters and variables [4]. The main reason for their advocacy lies in the difference between the representation of the arithmetic of complex numbers, especially the multiplication operation. In other words, the multiplication operation, which results in a phase rotation and amplitude modulation, yields an advantageous reduction of the degrees of freedom [5]. The advantage of ANNs is their self-organization and high degree of freedom in learning. By knowing a priori about the amplitude and phase portions in the data, a potentially harmful portion of this freedom can be minimized by using CVNNs.

Recently, CVNNs have received increased interest in the signal processing and machine learning research communities. In this paper, we discuss the recent development of CVNNs by performing a survey of the works on CVNNs in the literature. The contributions of this paper include:
1) A systematic review and categorization of the state-of-the-art CVNNs has been carried out based on their activation functions, learning and optimization methods, input and output representations, and their applications in various tasks such as signal processing and computer vision.
2) Detailed description of the different schools of thought, similarities and differences in approaches, and advantages and limitations of various CVNNs are provided.
3) Further discussions on some pertinent challenges and future research directions are given.
To the best of our knowledge, this is the first work solely dedicated to a comprehensive review of complex-valued neural networks.

The rest of this paper is structured as follows. A background on CVNNs, as well as their use cases, is presented in Section II. Section III discusses CVNNs according to the type of activation functions used. Section IV reviews CVNNs based on their learning and optimization approaches. The CVNNs characterized by their input and output representations are reviewed in Section V. Various applications of CVNNs are presented in Section VI, and challenges and potential research directions are discussed in Section VII. Section VIII contains the concluding remarks.

The symbols and notations used in this review are summarized in Table I.

TABLE I
SYMBOLS AND NOTATIONS

C_kj | multivalued neural network (MVN) learning rate
C | complex domain
R | real domain
d | desired output
e | individual error of network output
e_log | logarithmic error
E | error of network output
f | activation function
i | imaginary unit
Im | imaginary component
j | values of k-valued logic
J | regularization cost function
k | output indices
l | indices of preceding network layer
n | indices for input samples
N | total number of input samples
o | actual output (prediction)
p | dimension of real values
Re | real component
m | indices for output layer of MVN
t | regularization threshold parameter
T | target for MVN
u | real part of activation function
v | imaginary part of activation function
x | real part of weighted sum
y | imaginary part of weighted sum
Y | output of MVN
δ | partial differential
∆ | total differential
∇ | gradient operator
l(e) | mean square loss function
l(e_log) | logarithmic loss function
ε* | global error for MVN
ε | neuron error for MVN
ω | error threshold for MVN
β̂ | regularized weights
λ | regularization parameter
X | all inputs
w | individual weight
W | all network weights
z | weighted sum
|·| | modulo operation
‖·‖ | Euclidean norm
∠ | angle
II. BACKGROUND

A. Historical Notes

The ADALINE machine [6], a one-neuron, one-layer machine, is one of the earliest implementations of a trainable neural network, influenced by the Rosenblatt perceptron [7]. ADALINE used least mean square (LMS) and stochastic gradient descent for deriving optimal weights.

The LMS was first extended to the complex domain in [8], where gradient descent was derived with respect to the real and imaginary parts. Gradient descent was further generalized in [9] using Wirtinger Calculus, such that the gradient was taken with respect to complex variables instead of the real components. Wirtinger Calculus provides a framework for obtaining the gradient with respect to complex-valued functions [10]. The complex-valued representation of the gradient is equivalent to obtaining the gradients of the real and imaginary components in part.

B. Why Complex-Valued Neural Networks

Artificial neural networks (ANNs) based machine learning models, and especially deep learning models, have gained widespread usage in recent years. However, most of the current implementations of ANNs and machine learning frameworks use real numbers rather than complex numbers. There is growing interest in building ANNs using complex numbers and in exploring the potential advantages of the so-called complex-valued neural networks (CVNNs) over their real-valued counterparts. The first question is: why are CVNNs needed?

Although in many analyses involving complex numbers the individual components of the complex number have been treated independently as real numbers, it would be erroneous to apply the same concept to CVNNs by assuming that a CVNN is equivalent to a two-dimensional real-valued neural network. In fact, it has been shown that this is not the case [11], because the operation of complex multiplication limits the degrees of freedom of the CVNN at the synaptic weighting. This suggests that the phase-rotational dynamics strongly underpin the process of learning.

From a biological perspective, the complex-valued representation has been used in a neural network [12]. The output of a neuron was expressed as a function of its firing rate, specified by its amplitude, and the comparative timing of its activity, represented by its phase. Exploiting complex-valued neurons resulted in more versatile representations. With this formulation, input neurons with similar phases add constructively and are termed synchronous, whereas asynchronous neurons with dissimilar phases interfere with each other because they add destructively. This is akin to the behavior of the gating operation applied in deep feedforward neural networks [13], as well as in recurrent neural networks [14]. In the gating mechanism, the controlling gates are typically sigmoid-based activations, and synchronization describes the propagation of inputs with simultaneously high values held by their controlling gates. This property of incorporating phase information may be one of the reasons for the effectiveness of complex-valued representations in recurrent neural networks.

The importance of phase is backed by the biological perspective and also by a signal processing point of view. Several studies have shown that the intelligibility of speech is affected largely by the information contained in the phase portion of the audio signal [1], [15]. Similar results have also been shown for images. For example, it was shown in [16] that by exploiting the information encoded in the phase of an image, one can sufficiently recover most of the information encoded in its magnitude. This is because the phase describes objects in an image in terms of edges, shapes and their orientations.
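The phase-dominance observation of [16] can be illustrated with a short, self-contained experiment. The snippet below is only an illustrative sketch, not taken from the surveyed works: the synthetic image, sizes, and the choice of a constant magnitude are assumptions made purely for the demonstration. It takes the 2-D Fourier transform of an image, discards the magnitude, and inverts the transform; the surviving structure is dominated by the edges of the original.

```python
import numpy as np

def phase_only_reconstruction(img: np.ndarray) -> np.ndarray:
    """Keep only the Fourier phase of an image and invert the transform."""
    spectrum = np.fft.fft2(img)
    phase = np.angle(spectrum)
    # Replace every magnitude with a constant so that only phase information survives.
    phase_only = np.exp(1j * phase)
    return np.real(np.fft.ifft2(phase_only))

# Synthetic "image": a bright rectangle on a dark background (an assumption for the demo).
img = np.zeros((64, 64))
img[20:44, 16:48] = 1.0

recon = phase_only_reconstruction(img)
# The reconstruction is concentrated around the rectangle's edges,
# consistent with phase encoding edges, shapes and orientations.
print(recon.shape, float(recon.min()), float(recon.max()))
```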
From a computational perspective, holographic reduced representations (HRRs) were combined to enhance the storage of data as key-value pairs in [17]. The idea is to mitigate two major limitations of recurrent neural networks: (1) the dependence of the number of memory cells on the size of the recurrent weight matrices, and (2) the lack of memory indexing during writing and reading, which causes the inability to learn to represent common data structures such as arrays. The complex conjugate was used for key retrieval instead of the inverse of the weight matrix in [17]. The authors showed that the use of complex numbers by Holographic Reduced Representations for data retrieval from associative memories is more numerically stable and efficient. They further showed that even conventional networks such as residual networks [18] and Highway networks [13] display a framework similar to that of associative memories. In other words, each residual network uses the identity connection to insert or "aggregate" the computed residual into memory.

Furthermore, orthogonal weight matrices have been shown to mitigate the well-known exploding and vanishing gradient problems associated with recurrent neural networks in the real-valued case. Unitary weight matrices are a generalization of orthogonal weight matrices to the complex plane. Unitary matrices are the core of Unitary RNNs [19], and uncover spectral representations by applying the discrete Fourier transform. Hence, they offer richer representations than orthogonal matrices. The idea behind unitary RNNs was exploited in [20], where a general framework was derived for learning unitary matrices and applied to a real-world speech problem as well as other toy tasks.

As for the theoretical point of view, a complex number can be represented either in vector or matrix form. Consequently, the multiplication of two complex numbers can be represented as a matrix-vector multiplication. However, the use of the matrix representation increases the number of dimensions and parameters of the model. It is also common knowledge in machine learning that the more complicated a model is in terms of parameters, the greater the tendency of the model to overfit. Hence, using real-valued operations to approximate these complex parameters could result in a model with undesirable generalization characteristics. On the contrary, in the complex domain, the matrix representation mimics a rotation matrix. This means that half of the entries of the matrix are fixed once the other half is known. This constraint reduces the degrees of freedom and enhances the generalization capacity of the model.
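To make the rotation-matrix analogy concrete, the product of a complex weight w = a + ib with an input z = x + iy can be written as a constrained 2x2 real matrix acting on the vector (x, y). This is a standard identity (not specific to any one surveyed work) and shows why half of the matrix entries are fixed once the other half is known:

```latex
w z = (a+ib)(x+iy)
\;\longleftrightarrow\;
\begin{pmatrix} a & -b \\ b & a \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= |w|
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix},
\qquad \theta = \angle w .
```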
Based on the above discussions, it is clear that there are two main reasons for the use of complex numbers in neural networks:
1) In many application domains, such as wireless communication or audio processing, where complex numbers occur naturally or by design, there is a correlation between the real and imaginary parts of the complex signal. For instance, the Fourier transform involves a linear transformation, with a direct correspondence between the multiplication of a signal by a scalar in the time domain and multiplying the magnitude of the signal in the frequency domain. In the time domain, the circular rotation of a signal is equivalent to shifting its phase in the frequency domain. This means that during phase change, the real and imaginary components of a complex number are statistically correlated. This assumption is voided when real-valued models are used, especially on the frequency-domain signal.
2) If the relevance of the magnitude and phase to the learning objective is known a priori, then it is more reasonable to use a complex-valued model, because this imposes more constraints on the model than a real-valued formulation would.

III. ACTIVATION FUNCTIONS OF CVNNS

Activation functions introduce non-linearity to the affine transformations in neural networks. This gives the model more expressiveness. Given an input x ∈ C^M and weights W ∈ C^(N×M), where M and N represent the dimensionality of the input and output respectively, the output y ∈ C^N is

y = f(z),  z = Wx,    (1)

where f is a nonlinear activation function applied element-wise. Neural networks have been shown to be universal approximators using sigmoid squashing activation functions that are monotonic as well as bounded [21]–[24].

The majority of the activation functions that have been proposed for CVNNs in the literature are summarized in Table II. Activation functions are generally thought of as being holomorphic or non-holomorphic. In other words, the major concern is whether the activation function is differentiable everywhere, differentiable around certain points, or not differentiable at all. Complex functions that are holomorphic at every point are known as "entire functions". However, in the complex domain, one cannot have a complex activation function that is both bounded and complex-differentiable everywhere. This stems from Liouville's theorem, which states that all bounded entire functions are constant. Hence, it is not possible to have a CVNN that uses squashing activation functions and is entire.

TABLE II
COMPLEX-VALUED ACTIVATION FUNCTIONS

Activation Function | Corresponding Publications
Split-type A | [11], [25]–[50]
Split-type B | [5], [11], [26], [46], [51]–[60]
Fully Complex (ETF) | [51], [61]–[66]
Non-Parametric | [67]
Energy Functions | [68]–[71]
Complex ReLU | [34], [72]–[80]
Nonlinear Phase | [81]–[107]
Linear Activation | [108]
Split Kernel Activation | [67]
Complex Kernel Activation | [67]
Absolute Value Activation | [109]
Hybrid Activation | [110]
Mobius Activation | [110]
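A small numerical check of equation (1) also connects it back to the constrained real-valued view discussed in Section II: the complex product Wx can be computed equivalently with a real block matrix whose entries are tied together. The sketch below is illustrative only; the dimensions and random values are assumptions for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 3                                    # input / output sizes (arbitrary)
W = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
x = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Direct complex evaluation of z = Wx from equation (1).
z = W @ x

# Equivalent real-valued computation: the block matrix [[A, -B], [B, A]]
# acting on [Re(x); Im(x)], where W = A + iB. Half of its entries are
# determined by the other half, which is the constraint discussed above.
A, B = W.real, W.imag
W_block = np.block([[A, -B], [B, A]])
x_stack = np.concatenate([x.real, x.imag])
z_stack = W_block @ x_stack

print(np.allclose(z_stack, np.concatenate([z.real, z.imag])))   # True
```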

One of the first works on complex activation functions was done by Naum Aizenberg [111], [112]. According to these works, a multi-valued neuron (MVN) is a neural element with n inputs and one output lying on the unit circle, and with complex-valued weights. The mapping is described as

f(x_1, ..., x_n) = f(w_0 + w_1 x_1 + ... + w_n x_n),    (2)

where x_1, ..., x_n are the dependent variables of the function, and w_0, w_1, ..., w_n are the weights. All variables are complex, and all outputs of the function are kth roots of unity exp(i2πj/k), j ∈ [0, k−1], where i is the imaginary unit. f is an activation function on the weighted sum defined as

f(z) = exp(i2πj/k),  if 2πj/k ≤ arg(z) < 2π(j+1)/k,    (3)

where z = w_0 + w_1 x_1 + ... + w_n x_n is the weighted sum, arg(z) is the argument of z, and j = 0, 1, ..., k−1 represents the values of the k-valued logic. A geometric interpretation of the MVN activation function is shown in Figure 1. The activation function represented by equation (3) divides the complex plane into k equal sectors and implements a mapping of the entire complex plane onto the unit circle.

Fig. 1. Geometric interpretation of the discrete-valued MVN activation function

The same concept can be extended to continuous-valued inputs. By making k → ∞, the angles of the sectors in Figure 1 tend towards zero. This continuous-valued MVN is obtained by transforming equation (3) to

f(z) = exp(i arg(z)) = e^(i Arg(z)) = z/|z|,    (4)

where z is the weighted sum and |z| is the modulus of z. Equation (3) maps the complex plane to a discrete subset of points on the unit circle, whereas equation (4) maps the complex plane to the entire unit circle.

There is no general consensus on the most appropriate activation function for CVNNs in the literature. The main requirement is to have a nonlinear function that is not susceptible to exploding or vanishing gradients during training. Since the work of Aizenberg, other activation functions have been proposed. For example, holomorphic functions [113] have been proposed and used in so-called "fully complex" networks. The hyperbolic tangent is an example of a fully complex activation function and has been used in [63]. Figure 2 shows its surface plots. It can be observed that singularities in the output can be caused by values on the imaginary axis. In order to avoid explosion of values, inputs have to be properly scaled.

Some researchers suggest that it is not necessary to impose the strict constraint of requiring that the activation function be holomorphic. Those of this school of thought advocate for activations which are differentiable with respect to their imaginary and real parts. These activations are called split activation functions and can either be real-imaginary (Type A) or amplitude-phase (Type B) [114]. A Type A split real-imaginary activation was used in [25], where the real and imaginary parts of the signal were input into a sigmoid function. A Type B split amplitude-phase nonlinear activation function that squashes the magnitude and preserves the phase was proposed in [51] and [5].

While Type B activations preserve the phase and squash the magnitude, activation functions that instead preserve the magnitude and squash the phase have also been proposed. These types of activation functions were termed "phasor networks" when originally introduced [115], where output values extend from the origin to the unit circle [105], [116]–[119]. The multi-threshold logic activation functions used by multi-valued neural networks [120] are also based on a similar idea.

Although most of the early approaches favor the split method based on non-holomorphic functions, gradient-preserving backpropagation can still be performed on fully complex activations. Consequently, elementary transcendental functions (ETF) have been proposed [66], [121], [122] to train the neural network when there are no singularities in the domain of interest. This is because singularities in ETFs are isolated.

Radial basis functions have also been proposed for complex networks [123]–[126], and spline-based activation functions were proposed in works such as [127] and [128]. In a more recent work [67], non-parametric functions were proposed. The non-parametric kernels can be used in the complex domain directly, or separately on the imaginary and real components.

Given its widespread adoption and success in deep real-valued networks, the ReLU activation [129],

ReLU(x) := max(x, 0),    (5)

has been proposed because it mitigates the problem of vanishing gradients encountered with the sigmoid activation. For example, ReLU was applied in [19] after a trainable bias parameter was added to the magnitude of the complex number. In their approach, the phase is preserved while the magnitude is non-linearly transformed. The authors in [130] applied ReLU separately on both the real and imaginary parts of the complex number. The phase is then nonlinearly mapped when the input lies in a quadrant other than the upper-right quadrant of the Argand diagram. However, neither of these activations is holomorphic.

The activation functions discussed in this section, along with some others, are listed in Table II. For example, a cardioid activation was defined in [85] as

f(z) = (1/2)(1 + cos(∠z)) z,    (6)

and this was applied in [85] for magnetic resonance imaging (MRI) fingerprinting. The cardioid activation function is a phase-sensitive complex extension of the ReLU.

There are other activation functions besides those described in this section, including activation functions which are suitable for both real and complex hidden neurons. There is still no agreed consensus on which scenarios warrant the use of holomorphic functions such as ETFs, or the use of non-holomorphic functions that are more closely related to the nonlinear activation functions widely used in current state-of-the-art real-valued deep architectures. In general, there is no group of activation functions that is deemed the best for both real and complex neural networks.
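To make the distinctions between these families concrete, the sketch below implements in NumPy one representative of each activation type discussed above: a split Type A (real-imaginary) sigmoid, a split Type B (amplitude-phase) squashing, the continuous MVN/phasor mapping of equation (4), a split complex ReLU in the style of [130], a magnitude-thresholded ReLU with a trainable bias in the spirit of [19], and the cardioid of equation (6). The specific squashing function chosen for Type B (a hyperbolic tangent on the magnitude), the bias value, the epsilon guards, and the test inputs are illustrative assumptions, not the exact choices of the cited works.

```python
import numpy as np

def split_type_a(z):
    """Type A split activation: sigmoid applied to real and imaginary parts separately."""
    sig = lambda t: 1.0 / (1.0 + np.exp(-t))
    return sig(z.real) + 1j * sig(z.imag)

def split_type_b(z):
    """Type B split activation: squash the magnitude, preserve the phase
    (tanh is one possible squashing choice, assumed here for illustration)."""
    return np.tanh(np.abs(z)) * np.exp(1j * np.angle(z))

def continuous_mvn(z, eps=1e-12):
    """Continuous MVN / phasor activation of equation (4): project onto the unit circle."""
    return z / (np.abs(z) + eps)

def split_crelu(z):
    """Complex ReLU in the style of [130]: ReLU on real and imaginary parts separately."""
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def magnitude_relu(z, b=-0.5, eps=1e-12):
    """ReLU on the biased magnitude with the phase preserved (in the spirit of [19]);
    the bias b would normally be a trainable parameter."""
    return np.maximum(np.abs(z) + b, 0.0) * z / (np.abs(z) + eps)

def cardioid(z):
    """Cardioid activation of equation (6): a phase-sensitive extension of ReLU."""
    return 0.5 * (1.0 + np.cos(np.angle(z))) * z

z = np.array([1 + 1j, -2 + 0.5j, 0.3 - 0.8j])
for f in (split_type_a, split_type_b, continuous_mvn, split_crelu, magnitude_relu, cardioid):
    print(f.__name__, f(z))
```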
TABLE III
LEARNING METHODS FOR COMPLEX-VALUED NEURAL NETWORKS

Error Propagation Method | Corresponding Publications
Split-Real Backpropagation | [5], [6], [11], [25], [30], [36], [37], [39], [43], [45]–[47], [55], [57], [64], [72], [79], [132]–[136]
Fully Complex Backpropagation (CR) | [8], [33]–[35], [51], [61], [63], [65], [66], [121], [137], [138]
MLMVN | [82], [86], [89]–[93], [95], [97]–[99], [101], [103], [104], [107], [111], [112], [139]–[142]
Orthogonal Least Square | [124], [125]
Quaternion-based Backpropagation | [54]
Hebbian Learning | [56]
Complex Barzilai-Borwein Training Method | [137]

IV. OPTIMIZATION AND LEARNING IN CVNNS

Learning in neural networks refers to the process of tuning the weights of the network to optimize learning objectives such as minimizing a loss function. The optimal set of weights are those that allow the neural network to generalize best to out-of-sample data. Given the desired and predicted outputs of a complex neural network, or ground truth and prediction in the context of supervised learning, d ∈ C^N and o ∈ C^N respectively, the error is

e := d − o.    (7)

The complex mean square loss is a non-negative scalar

L(e) = ∑_{k=0}^{N−1} |e_k|²    (8)
     = ∑_{k=0}^{N−1} e_k ē_k.    (9)

It is also a real-valued mapping that tends to zero as the modulus of the complex error reduces.

The log error between the target and the prediction is given by [131]

e_log := ∑_{k=0}^{N−1} (log(o_k) − log(d_k)) \overline{(log(o_k) − log(d_k))},    (10)

where the overbar denotes the complex conjugate. If d = r exp(iφ) and o = r̂ exp(iφ̂) are the polar representations of the desired and actual outputs respectively, the log error function is a monotonically decreasing function. With this representation, the errors in magnitude and phase have an explicit representation in the logarithmic loss function, given as

L(e_log) = (1/2) [ (log(r̂_k/r_k))² + (φ̂_k − φ_k)² ],    (11)

such that e_log → 0 when r̂_k → r_k and φ̂_k → φ_k. The error functions in equations (7) and (10) are suitable for complex-valued regression. For classification, one may map the output of the network to the real domain using a transform that is not necessarily holomorphic.

Table III details the most popular learning methods implemented in the literature. In general, there are two approaches for training CVNNs. The first approach follows the same method used in training real-valued neural networks, where the error is backpropagated from the output layer to the input layer using gradient descent. In the second approach, the error is backpropagated but without gradient descent.

A. Gradient-based Approach

Different approaches to the backpropagation algorithm in the complex domain were independently proposed by various researchers in the early 1990s. For example, a derivation for single-hidden-layer complex networks was given in [132]. In this work, the authors showed that a complex neural network with one hidden layer and sigmoid activation was able to solve the XOR problem. Similarly using sigmoid activation, the authors in [143] derived the complex backpropagation algorithm. Derivations were also given for the Cartesian split activation function in [25] and for non-holomorphic activation functions in [51].

In reference to gradient-based methods, the Wirtinger Calculus [10] was used to derive the complex gradient, Jacobian, and Hessian in [9] and [144]. Wirtinger developed a framework that simplifies the process of obtaining the derivative of complex-valued functions with respect to both holomorphic and non-holomorphic functions. By doing this, the derivatives of the complex-valued functions can be computed completely in the complex domain instead of being computed with respect to the real and imaginary components independently. The Wirtinger approach has not always been favored, but it is beginning to gain more interest with newer approaches like [58], [145]. The process of learning with complex domain backpropagation is similar to the learning process in the real domain.

where δ_j = −∂E/∂u^j − i ∂E/∂v^j, δ_jR = −∂E/∂u^j and δ_jI = −∂E/∂v^j. By combining equations (16) and (18), the gradient of the error function with respect to W_jl is given by

∇_{W_jl} E = ∂E/∂W_jlR + i ∂E/∂W_jlI
= −Xjl((ux + iuy)δjR + (vx + ivy)δjI ) (19) The error calculated after the forward pass is backpropagated Hence, given a positive constant learning rate α, the complex to each neuron in the network, and the weights are adjusted weight Wjl must be changed by a value ∆WjI proportional in the backward pass. If the activation function of a neuron to the negative gradient: is f(z) = u(x, y) + iv(x, y), where z = x + iy, u and ∆W = αX¯ (uj + iuj )δ + (vj + ivj )δ  v are the real and imaginary parts of f, and x and y are jl jl x y jR x y jI (20) the real and imaginary parts of z. The partial derivatives For an output neuron, δjR and δjI in equation (19) are given ux = ∂u/∂x, uy = ∂u/∂y, vx = ∂v/∂x, vy = ∂v/∂y are by initially assumed to exist for all z ∈ C, so that the Cauchy- δE j Riemann equations are satisfied. Given an input pattern, the δjR = = jR = djR − u δuj error is given by δE j δjI = = jI = djI − v (21) 1 X δvj E = e e¯ , e = d − o (12) 2 k k k k k And in compact form, it will be k δj = j = dj − oj (22) where dk and ok are the desired and actual outputs of the kth neuron, respectively. The over-bar denotes the complex The chain rule is used to compute δjR and δjI for the hidden conjugate operation. Given a neuron j in the network, the neuron. Note that k is an index for a neuron receiving input output oj is given by from neuron j. The net input zk to neuron k is

j j X X l l oj = f(zj) = u + iv , zj = xj + iyj = WjlXjl (13) zk = xk + iyk = (u + iv )(WklR + iWklI ) (23) i=1 l l k where the W ’s are the complex weights of neuron j and X where is the index for the neurons that feed into neuron . jl jl δ its complex input. A complex bias (1,0) may be added. The Computing jR using the chain rule yields  k k  following partial derivatives are defined: δE X δE δu δxk δu δyk δjR = − j = − k j + j δu δu δxk δu δyk δu ∂xj ∂yj k = XjlR, = XjlI ,  k k  ∂WjlR ∂WjlR X δE δv δxk δv δyk − k j + j ∂xj ∂yj δv δxk δu δyk δu = −XjlI , = XjlR (14) k ∂WjlI ∂WjlI X k k = δkR(uxWkjR + uyWkjI ) where R and I represent the real and imaginary parts. The k chain rule is used to find the gradient of the error function E X k k + δkI (vxWkjR + vy WkjI ) (24) with respect to Wjl. The gradient of the error function with k respect to the WjlR and WjlI are given by Similarly, δjI is computed as: ∂E ∂E ∂uj ∂x ∂uj ∂y   k k  = j + j δE X δE δu δxk δu δyk j δjI = − j = − k j + j ∂WjlR ∂u ∂xj ∂WjlR ∂yj ∂WjlR δv δu δxk δv δyk δv k ∂E  ∂vj ∂x ∂vj ∂y  j j δE  δvk δx δvk δy  + j + (15) X k k ∂v ∂xj ∂WjlR ∂yj ∂WjlR − k j + j δv δxk δv δyk δv j j k = −δJR(uxXjlR + uyXjlI ) X k k j j = δkR(ux(−WkjI ) + uyWkjR) −δJI (vxXjlI + vyXjlI ) (16) k X k k + δkI (v WkjR + v WkjI ) (25) ∂E ∂E ∂uj ∂x ∂uj ∂y  x y j j k = j + ∂WjlI ∂u ∂xj ∂WjlI ∂yj ∂WjlI The expression for δj is obtained by combining equations (24)  j j  ∂E ∂v ∂xj ∂v ∂yj and (25): + j + (17) ∂v ∂xj ∂WjlI ∂yj ∂WjlI δj = δjR + iδjI = −δ (uj (−X ) + uj (X ) JR x jlI y jlR X ¯ k k k k  j j = Wkj (ux + iuy)δkR + (vx + ivy )δkl (26) −δJI (v (−XjlI ) + v XjlR) (18) x y k learning for the hidden and input layers. However, the output layer learning error back propagation is not normalized. The final correction rule for the kth neuron of the mth (output) layer is
km km Ckm ˜¯ w˜i = wi + kmYi,m−1, 1 = 1, . . . , n (Nm−1 + 1) km km Ckm w˜0 = w0 + km (29) (Nm−1 + 1) For the second till the (m − 1)th layer (kth neuron of the jth layer (j = 2, ··· , m − 1), the correction rule is Fig. 3. Example of MLMVN with one hidden-layer and a single output kj kj Ckj ˜¯ w˜i = wi + kjYi,j−1, 1 = 1, . . . , n (Nj−1 + 1) where δj is computed for neuron j starting in the output layer kj kj Ckj using equation (22), then using equation (26) for the neurons w˜0 = w0 + kj (30) (Nj−1 + 1)|zkj| in the hidden layers. After computing δj for neuron j, equation (20) is used to update its weights. and for the input layer: C B. Non-Gradient-based Approach w˜k1 = wk1 + k1  x¯ , 1 = 1, . . . , n i i (n + 1) k1 i Different from the gradient based approach, the learning process of a neural network based on the multi-valued neuron k1 k1 Ck1 w˜0 = w0 + kj (31) (MVN) is derivative-free and it is based on the error-correction (n + 1)|zkj| learning rule [103]. For a single neuron, weight correction in Given a pre-specified learning precision ω, the condition for the MVN is determined by the neuron’s error, and learning termination of the learning process is is reduced to a simple movement along the unit circle. The N N corresponding activation function of equations (3) and (4) are 1 X X ∗ 2 1 X ( ) (W ) = Es ≤ ω. (32) not differentiable, which implies that there are no gradients. N kms N s=1 k s=1 Considering an example MLMVN with one hidden-layer and a single output as shown in Figure 3. If T is the target, One of the advantages of the non-gradient based approach Y12 is the output, and the following definitions are assumed: is its ease of implementation. In addition, since there are ∗ no derivatives involved, the problems associated with typical •  = T − Y12: global error of network 12 12 12 gradient descent based methods will not apply here, such as • w , w , ··· , w : initial weighting vector of neuron Y12 0 1 n problem with being stuck in a local minima. Furthermore, • Yi1: initial output of neuron Y12 because of the structure and representation of the network, • Z12: weighed sum of neuron Y12 before weight correction it is possible to design a hybrid network architecture where • 12: error of neuron Y12 some nodes have discrete activation functions, whereas some m The weight correction for the second to the th (output) others use continuous activation functions. This may have a layer, and then for the input layer are given by great potential in future applications [139]. kj kj Ckj ˜¯ w˜i = wi + kjYi,j−1, i = 1, . . . , n C. Training and hyperparameters optimization (Nj−1 + 1) kj kj Ckj The adaptation of real-valued activation, weight initializa- w˜0 = w0 + kj (27) tion and batch normalization was analyzed in [130]. More (Nj−1 + 1) recently, building on the work of [9], optimization via second- order methods were introduced by the use of the complex k1 k1 Ck1 w˜ = w + k1x¯i, i = 1, . . . , n Hessian in the complex gradient [144], and linear approaches i i (n + 1) have been proposed to model second-order complex statis- k1 k1 Ck1 w˜0 = w0 + kj (28) tics [146]. The training of complex-valued neural networks (n + 1) using complex learning rate was proposed in [137]. The Ckj is the learning rate for the kth neuron of the jth layer. 
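As an illustration of the derivative-free, error-correction learning described above, the sketch below trains a single discrete MVN with the activation of equation (3) and an update of the form of equation (28), w_i <- w_i + C/(n+1) * ε * x̄_i, where ε = T − Y is the neuron error and C plays the role of the learning rate C_kj. This is a minimal single-neuron sketch under stated assumptions (the toy data, learning rate, stopping rule, and convergence behavior are assumptions for the demo, not a reproduction of the full MLMVN procedure).

```python
import numpy as np

def mvn_activation(z: complex, k: int) -> complex:
    """Discrete MVN activation of equation (3): map z to the k-th root of unity
    whose sector contains arg(z)."""
    sector = int(np.floor((np.angle(z) % (2 * np.pi)) / (2 * np.pi / k)))
    return np.exp(2j * np.pi * sector / k)

def train_single_mvn(X: np.ndarray, T: np.ndarray, k: int, C: complex = 1.0,
                     epochs: int = 100) -> np.ndarray:
    """Derivative-free error-correction learning for one MVN.

    Weight update in the style of equation (28): w <- w + C/(n+1) * eps * conj(x),
    with eps = T - Y the neuron error. Inputs are augmented with a constant 1
    so that w[0] acts as the bias weight w0.
    """
    n = X.shape[1]
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend 1 for w0
    w = np.zeros(n + 1, dtype=complex)
    for _ in range(epochs):
        converged = True
        for x, t in zip(Xa, T):
            y = mvn_activation(np.dot(w, x), k)
            if not np.isclose(y, t):
                w = w + (C / (n + 1)) * (t - y) * np.conj(x)
                converged = False
        if converged:
            break
    return w

# Toy task (assumed for the demo): two inputs on the unit circle, k = 4 sectors.
rng = np.random.default_rng(1)
X = np.exp(1j * rng.uniform(0, 2 * np.pi, size=(8, 2)))
T = np.array([mvn_activation(x.sum(), 4) for x in X])   # a learnable labelling
w = train_single_mvn(X, T, k=4)
print(w)
```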
problem of vanishing gradient in recurrent neural networks However, in applying this learning rule two situations may was solved by using unitary weight matrices in [19] and [20], arise: (1) the absolute value of the weighted sum being thereby improving time and space efficiency by exploiting corrected may jump erratically, or (2) the output of the the properties of orthogonal matrices. However, regarding hidden neuron varies around some constant value. In either regularization, besides the proposed use of noise in [147], of these scenarios, a large number of weight updates can not much work has been done in the literature. This is an be wasted. The workaround is to instead apply a modified interesting open research problem and it is further discussed learning rule [92] which adds a normalization constant to the in Section VII. V. INPUTAND OUTPUT REPRESENTATIONS IN CVNNS signals, CVNNs find most of their applications in the area Input representations can be complex either naturally or of signal (including radio frequency signal, audio and image) by design. In the former case, a set of complex numbers processing. represent the data domain. An example is Fourier transform A. Applications in Radio Frequency Signal Processing in on images. In the latter case, inputs have magnitude and Wireless Communications phase which are statistically correlated. An example is in radio The majority of the work on complex-valued neural net- frequency or wind data [148]. For the outputs, complex values works has been focused on signal processing research and are more suitable for regression tasks. However, real-values applications. Complex-valued neural network research in sig- are more appropriate if the goal is to perform inference over nal processing applications include channel equalization [133], a probability distribution of complex parameters. Weights can [149], satellite communication equalization [150], adaptive be complex or real irrespective of input/output representations. beamforming [151], coherent-lightwave networks [4], [152] In [1], the impact and tradeoff on choices that may be made and source separation [128]. In interferometric synthetic aper- on representations was studied using a toy experiment. In the ture radar, complex networks were used for adaptive noise experiment, a toy model was trained to learn a function that is reduction [153]. In electrical power systems, complex val- able to add sinusiods. Four input representations were used: ued networks were proposed to enhance power transformer 1) amplitude-phase where a real-valued vector is formed modeling [154], and analysis of load flow [134]. In [150] from concatenating the phase offset and amplitude pa- for example, the authors considered the issue of requiring rameters; long sequences for training, which results in a lot of wasted 2) complex representation where the phase offset is used channel capacity in cases where nonlinearities associated with as the phase of the complex phasor; the channel are slowly time varying. To solve this problem, 3) real-imaginary representation where the real and imag- they investigated the use of CVNN for adaptive channel inary components of the complex vector in 2) are used equalization. The approach was tested on the task of equalizing as real vectors; a digital satellite radio channel amidst intersymbol interference 4) augmented complex representation which is a concate- and minor nonlinearities, and their approach showed competi- nation of the complex vector and its conjugate. tive results. 
They also pointed out that their approach does not Two representations were considered for the target: require prior knowledge about the nonlinear characteristics of 1) the straightforward representation which use the raw real the channel. and complex target values; 2) analytic representation which uses the Hilbert transform TABLE IV APPLICATIONS OF COMPLEX-VALUED NEURAL NETWORKS for the imaginary part. The activation functions used were (1) identity, (2) hyperbolic Applications Corresponding Publications tangent, (3) split real-imaginary activation, and (4) split am- Radio Frequency Signal [6], [8], [11], [32], [33], [35], [36], [52], Processing in Wireless [53], [55], [56], [65]–[67], [72], [89], plitude phase. Communications [91], [109], [123]–[128], [133], [149]– Different models with the combinations of input and output [152], [154], [155] representations as well as various activation functions were Image Processing and [19], [31], [34], [37], [39], [40], [54], Computer Vision [62], [69], [73], [75]–[77], [82], [85], tested in the experiments, and the model with the real- [92], [98], [99], [101]–[103], [119], imaginary sigmoid activation performed the best. However, [130], [141], [142], [156]–[159] it is interesting that this best model diverged and performed Audio Signal Processing [26], [48], [49], [58], [79], [130], [136] and Analysis poorly when it was trained on real inputs. There is a closed Radar / Sonar Signal Pro- [74], [110], [139], [153], [160], [161] form for the real-imaginary input representation when the out- cessing put target representation are either straightforward or analytic. Cryptography [162] However, there is no closed form solution for the amplitude- Time Series Prediction [103], [139] Associative Memory [105], [116] phase representation. Wind Prediction [30], [43], [148] In general, certain input and output representations when Robotics [38] combined yield a closed form solution, and there are some Traffic Signal Control [46], [60] (robotics) scenarios in which even when there is no closed form solution, Spam Detection [59] the of the solution can be reduced by transforming Precision Agriculture (soil [82] either the input, output or internal representation. In summary, moisture prediction) the question as to which input and output representation is the best would typically depend on the application and it is affected by the level of constraint imposed on the system. B. Applications in Image Processing and Computer Vision Complex valued neural networks have also been applied in VI.APPLICATIONS OF CVNNS image processing and computer vision. There have been some Various applications of CVNNs are summarized in Table IV. works on applying CVNN for optical flow [156], [157], and Because of the complex nature of many natural and man-made CVNNs have been combined with holographic movies [135], [158], [159]. CVNNs were used for reconstruction of gray- building blocks necessary for building deep complex networks. scale images [106], and image deblurring [99], [141], [142], The building blocks include a complex batch normalization, classification of microarray gene expression [107]. Clifford weight initialization. Apart from image recognition, the au- networks were applied for character recognition [163] and thors tested their complex deep network on MusicNet dataset complex-valued neural networks were also applied for auto- for the music transcription task, and on the TIMIT dataset for matic gender recognition in [37]. 
A complex valued VGG the Speech Spectrum prediction tasks. It is interesting to note network was implemented by [75] for image classification. that the datasets used contain real values, and the imaginary In this work, building on [130], the building blocks of the components are learned using operations in one real-valued VGG network including batch normalization, ReLU activation residual block. function, and the 2-D convolution operation were transformed to the complex domain. When testing their model on classifi- D. Other Applications cation of the popular CIFAR10 benchmark image dataset, the In wind prediction, the axes of the Cartesian coordinates of complex-valued VGG model performed slightly better than the a complex number plane was used to represent the cardinal real-valued VGG in both training and testing accuracy. More- points (north, south, east, west), and the prediction of wind over, the complex-valued network requires less parameters. strength was expressed using the distance from the origin Another noteworthy computer vision application that show- in [138]. However, although the circularity property of com- cases the potential of complex-valued neural networks can plex numbers were exploited in this work, this method does not be found in [19], where the authors investigated the benefits reduce the degree of freedom, instead it has the same degree of of using unitary weight matrices to mitigate the vanishing freedom as a real-valued network because of the typically high gradient problem. Among other applications, their method was anisotropy of wind. In other words, in this representation, the tested on the MNIST hand-writing benchmark dataset in two absolute value of phase does not yield any meaning. Rather, it modes. In the first mode, the pixels were read in order (left to is the difference from a specified reference that is meaningful. right and bottom up), and in the second mode the pixels were A recent application in radar can be found in [160], [161]. read in arbitrarily. The real-valued LSTM performed slightly In this work, a complex-valued convolutional neural network better than the unitary-RNN for the first mode, but in the (CV-CNN) was used to exploit the inherent nature of the time- second mode the unitary-RNN outperformed the real-valued frequency data of human echoes to classify human activities. LSTM in spite of having below a quarter of the parameters The human echo is a combination of the phase modulation than the real-valued LSTM. Moreover, it took the real-valued information caused by motion, and the amplitude data obtained LSTM between 5 and 10 times as many epochs to reach from different parts of the body. Short Time Fourier Transform convergence comparing to the unitary RNN. was used to transform the human radar echo signals and used Non-gradient based learning has also been used extensively to train the CV-CNN. The authors also demonstrated that in image processing and computer vision applications. For ex- their method performed better than other machine learning ample, the complex neural network with multi-valued neurons approaches at low signal-to-noise ratio (SNR), while achieving (MLMVN) was used as an intelligent image filter in [87]. The accuracies as high as 99.81%. CV-CNNs were also used in [50] task was to apply the filter to simultaneously filter all pixels in to enhance radar images. 
Interestingly, this is an application an n x n region, and the results from overlapping regions of of CNN to a regression problem, where the inputs are radar paths were then averaged. The integer input intensities were echoes and the outputs are expected images. mapped to complex inputs before being fed into the MLMVN, The first use of a complex-valued generative adversarial and the integer output from the MLMVN were transformed network (GAN) was proposed in [164] to mitigate the issue to integer intensities. This approach proved successful and of lack of labeled data in polarimetric synthetic aperture radar efficient, as very good nonlinear filters were obtained by (PolSAR) images. This approach which retains the amplitude training the network with as little as 400 images. Furthermore, and phase information of the PolSAR images performed from their simulations it was observed that the filtering results better than state-of-the-art real-valued approaches, especially got better as more images were added to the training set. in scenarios involving fewer annotated data. A complex-valued tree parity machine network (CVTPM) C. Applications in Audio Signal Processing and Analysis was used for neural cryptography in [162]. In neural cryp- In audio processing, complex networks were proposed to tography, by applying the principle of neural network syn- improve the MP3 codec in [58], and in audio source local- chronization, two neural networks exchange public keys. The ization [136]. CVNNs were shown to denoise noise-corrupted benefit of using a complex network over a real network is that waveforms better than real-valued neural networks in [11]. two group keys can be exchanged in one process of neural Complex associative memories were proposed for temporal synchronization. Furthermore, compared to a real network series obtained from symbolic music representation [48], [49]. with the same architecture, the CVTPM was shown to be more A notable work in this regard is [130] where deep complex secure. networks were used for speech spectrum prediction and music There have been other numerous applications of CVNNs. transcription. In this work, the authors compared the various For instance, a discriminative complex-valued convolutional complex-valued ReLU-based activations and formulated the neural network was applied to electroencephalography (ECG) signals in [33] to automatically deduce features from the weight initialization scheme could make use of the phase data and predict stages of sleep. By leveraging the Fisher parameter in addition to the magnitude. criterion which takes into account both the minimum error, the There has been some strides made in applications involv- maximum between-class distance, and the minimum within- ing complex valued recurrent networks. For example, the class distance, the authors assert that their model named fast introduction of unitary matrices which are the generalized discriminative complex-valued convolutional neural network complex version of real-valued Hermitian matrices, mitigates (FDCCNN) is capable of learning inherent features that are the problem of vanishing and exploding gradients. However, discrimitative enough, even when the dataset is imbalanced. in applications involving long sequences, gradient backprop- agation requires all hidden states values to be stored. This VII.CHALLENGESAND POTENTIAL RESEARCH can become impractical given the limited availability of GPU Complex valued neural networks have been shown to memory for optimization. 
Considering that the inverse of the have potential in domains where representation of the data unitary matrix is its conjugate transpose, it may be possible encountered is naturally complex, or complex by design. to derive some invertible nonlinear function with which the Since the 1980s, research of both real valued neural net- states can be computed during the backward pass. This will works (RVNN) and complex valued neural networks (CVNN) eliminate the need to store the hidden state values. have advanced rapidly. However, during the development of By introducing complex parameters, the number of oper- deep learning, research on CVNNs has not been very active ations required increases thus increasing the computational compared to RVNNs. So far research on CVNNs has mainly complexity. Compared to real-valued parameters which use targeted at shallow architectures, and specific signal processing single real-valued multiplication, complex-valued parameters applications such as channel equalization. One reason for will require up to four real multiplications and two real this is the difficulty associated with training. This is due additions. This means that merely doubling the number of real- to the limitation that the complex-valued activation is not valued parameters in each layer does not give the equivalent ef- complex-differentiable and bounded at the same time. Several fect that is observed in a complex-valued neural network [167]. studies [130] have suggested that the constraint of requiring Furthermore, the capacity of a network in terms of its ability to that a complex-valued activation be simultaneously bounded approximate structurally complex functions can be quantified and complex differentiable need not be met, and propose by the number of (real-valued) parameters in the network. activations that are differentiable independently with respect Consequently, by representing a complex number a + ib using to the real and imaginary components. This remains an open real numbers (a, b), the number of real parameters for each area of research. layer is doubled. This implies that by using a complex-valued Another reason for the slow development of research in network, a network may gain more expressiveness but run complex-valued neural networks is that almost all deep learn- the risk of overfitting due to the increase in parameters as the ing libraries are optimized for real-valued operations and network goes deeper. Hence, regularization during the training networks. There are hardly any public libraries developed and of CVNNs is important but remains an open problem. optimized specifically for training CVNNs. In the experiments Ridge (L2) and LASSO (L1) regularizations are the two performed in [165] and validated in [166], a baseline, wide forms of regression that are aimed at mitigating the effects of and deep network was built for real, complex and split valued multicollinearity. Regularization enforces an upper threshold neural networks. Computations were represented by computa- on the values of the coefficients and produces a solution with tional sub-graphs of operations between the real and imaginary coefficients of smaller variance. With the L2 regularization, the parts. This enabled the use of TensorFlow, a standard neural update to the weights is penalized. This forces the magnitude network libraries instead of the generalized derivatives in Clif- of the weights to be small and hence reduce overfitting. ford algebra. 
However, this approach to modeling CVNN is Depending on the application, L1 norm can be applied that still fundamentally based on a library meant for optimized real- will force some weights to be zero (sparse representation). valued arithmetic computations. This issue concerns practical However, the issue with the above formulation is that it implementation, and there is a need for deep learning libraries cannot be directly applied over the field of complex numbers. targeted and optimized for complex-valued computations. This is because the application of the above formulation of Regarding weight initialization, a formulation for complex the L2 norm to a complex number is simply its magnitude, weight initialization was presented in [130]. By formulating which is not a complex number and the phase information is this in terms of the variance of the magnitude of the weights lost. As far as we know, there has not been any successful which follows the Rayleigh distribution with two degrees of attempt to either provide a satisfactory transformation of this freedom, the authors represented the variance of the weights in problem into the complex domain, or derive a method that terms of the Rayleigh distribution’s parameter. It is shown that is able to efficiently search the underlying hyper-parameter the variance of the weights depends on the magnitude and not solution space. Apart from the work by the authors in [147] on the phase, hence the phase was initialized uniformly within who propose to use noise, there is very little work on the the range [−π, π]. More research on this could provide insights regularization of CVNNs. and yield meaningful results as to alternative methods of Complex and split-complex-valued neural networks are con- complex-valued weight initialization for CVNNs. For instance, sidered in [165] to further understand their computational graph, algebraic system and expressiveness. This result shows [5] A. Hirose, “Continuous complex-valued back-propagation learning,” that the complex-valued neural network are more sensitive to Electronics Letters, vol. 28, pp. 1854–1855, Sep. 1992. [6] B. Widrow and M. E. Hoff, “Adaptive switching circuits,” in 1960 IRE hyperparameter tuning due to the increased complexity of the WESCON Convention Record, Part 4, (New York), pp. 96–104, IRE, computational graph. In one of the experiments performed 1960. in [165], L2 regularization was added for all parameters [7] F. F. Rosenblatt, “The perceptron: a probabilistic model for information storage and organization in the brain.,” Psychological review, vol. 65 in the real, complex and split-complex neural networks and 6, pp. 386–408, 1958. trained on MNIST and CIFAR-10 benchmark datasets. It was [8] B. Widrow, J. McCool, and M. Ball, “The complex lms algorithm,” observed that in the un-regularized case, both real and complex Proceedings of the IEEE, vol. 63, pp. 719–720, April 1975. networks showed comparable validation accuracy. However, [9] D. H. Brandwood, “A complex gradient operator and its application in adaptive array theory,” IEE Proceedings H - Microwaves, Optics and when L2 regularization was added, overfitting was reduced Antennas, vol. 130, pp. 11–16, February 1983. in the real-valued networks but had very little effect on the [10] W. Wirtinger, “Zur formalen theorie der funktionen von mehr kom- performance of the complex and split-complex networks. It plexen veranderlichen,”¨ Mathematische Annalen, vol. 97, pp. 357–375, Dec 1927. 
seems that complex neural networks are not self regularizing, [11] A. Hirose and S. Yoshida, “Generalization characteristics of complex- and they are more difficult to regularize than their real-valued valued feedforward neural networks in relation to signal coherence,” counterpart. IEEE Transactions on Neural Networks and Learning Systems, vol. 23, pp. 541–551, April 2012. VIII.CONCLUSION [12] D. P. Reichert and T. Serre, “Neuronal synchrony in complex-valued deep networks,” CoRR, vol. abs/1312.6115, 2014. A comprehensive review on complex-valued neural net- [13] R. K. Srivastava, K. Greff, and J. Schmidhuber, “Training very deep works has been presented in this work. The argument for networks,” in Advances in Neural Information Processing Systems 28 (C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, advocating the use of complex-valued neural networks in eds.), pp. 2377–2385, Curran Associates, Inc., 2015. domains where complex numbers occur naturally or by de- [14] K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio, “On the sign was presented. The state-of-the-art in complex-valued properties of neural machine translation: Encoder-decoder approaches,” in SSST@EMNLP, 2014. neural networks was presented by classifying them according [15] G. Shi, M. M. Shanechi, and P. Aarabi, “On the importance of phase to activation function, learning paradigm, input and output in human speech recognition,” Trans. Audio, Speech and Lang. Proc., representations, and applications. Open problems and future vol. 14, p. 1867–1874, Sept. 2006. research directions have also been discussed. Complex-valued [16] A. V. Oppenheim and J. S. Lim, “The importance of phase in signals,” Proceedings of the IEEE, vol. 69, no. 5, pp. 529–541, 1981. neural networks compared to their real-valued counterparts are [17] I. Danihelka, G. Wayne, B. Uria, N. Kalchbrenner, and A. Graves, still considered an emerging field and require more attention “Associative long short-term memory,” in Proceedings of the 33rd and actions from the deep learning and signal processing International Conference on International Conference on Machine Learning - Volume 48, ICML’16, pp. 1986–1994, JMLR.org, 2016. research community. [18] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern IX.ACKNOWLEDGMENT Recognition (CVPR), pp. 770–778, 2016. This research work is supported by the U.S. Office of [19] M. Arjovsky, A. Shah, and Y. Bengio, “Unitary evolution recurrent neu- ral networks,” in Proceedings of the 33rd International Conference on the Under Secretary of Defense for Research and Engineer- International Conference on Machine Learning - Volume 48, ICML’16, ing (OUSD(R&E)) under agreement number FA8750-15-2- pp. 1120–1128, JMLR.org, 2016. 0119. The U.S. Government is authorized to reproduce and [20] S. Wisdom, T. Powers, J. R. Hershey, J. L. Roux, and L. Atlas, “Full- capacity unitary recurrent neural networks,” in Proceedings of the 30th distribute reprints for governmental purposes notwithstanding International Conference on Neural Information Processing Systems, any copyright notation thereon. The views and conclusions NIPS’16, (USA), pp. 4887–4895, Curran Associates Inc., 2016. contained herein are those of the authors and should not be [21] G. Cybenko, “Approximation by superpositions of a sigmoidal func- tion,” 1989. interpreted as necessarily representing the official policies or [22] A. 
VIII. CONCLUSION

A comprehensive review of complex-valued neural networks has been presented in this work. The argument for advocating complex-valued neural networks in domains where complex numbers occur naturally or by design was laid out, and the state of the art was surveyed by classifying existing works according to activation function, learning paradigm, input and output representation, and application. Open problems and future research directions have also been discussed. Compared with their real-valued counterparts, complex-valued neural networks are still an emerging field and require more attention and effort from the deep learning and signal processing research communities.

IX. ACKNOWLEDGMENT

This research work is supported by the U.S. Office of the Under Secretary of Defense for Research and Engineering (OUSD(R&E)) under agreement number FA8750-15-2-0119. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Office of the Under Secretary of Defense for Research and Engineering (OUSD(R&E)) or the U.S. Government.

REFERENCES

[1] A. M. Sarroff, "Complex neural networks for audio," Tech. Rep. TR2018-859, Dartmouth College, Hanover, NH, May 2018.
[2] P. P. Shinde and S. Shah, "A review of machine learning and deep learning applications," in 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), pp. 1–6, 2018.
[3] A. Hirose and S. Yoshida, "Comparison of complex- and real-valued feedforward neural networks in their generalization ability," in Neural Information Processing - 18th International Conference, ICONIP 2011, Shanghai, China, November 13-17, 2011, Proceedings, Part I, pp. 526–531, 2011.
[4] A. Hirose, "Applications of complex-valued neural networks to coherent optical computing using phase-sensitive detection scheme," Information Sciences - Applications, vol. 2, no. 2, pp. 103–117, 1994.
[5] A. Hirose, "Continuous complex-valued back-propagation learning," Electronics Letters, vol. 28, pp. 1854–1855, Sep. 1992.
[6] B. Widrow and M. E. Hoff, "Adaptive switching circuits," in 1960 IRE WESCON Convention Record, Part 4, (New York), pp. 96–104, IRE, 1960.
[7] F. Rosenblatt, "The perceptron: a probabilistic model for information storage and organization in the brain," Psychological Review, vol. 65, no. 6, pp. 386–408, 1958.
[8] B. Widrow, J. McCool, and M. Ball, "The complex LMS algorithm," Proceedings of the IEEE, vol. 63, pp. 719–720, April 1975.
[9] D. H. Brandwood, "A complex gradient operator and its application in adaptive array theory," IEE Proceedings H - Microwaves, Optics and Antennas, vol. 130, pp. 11–16, February 1983.
[10] W. Wirtinger, "Zur formalen Theorie der Funktionen von mehr komplexen Veränderlichen," Mathematische Annalen, vol. 97, pp. 357–375, Dec. 1927.
[11] A. Hirose and S. Yoshida, "Generalization characteristics of complex-valued feedforward neural networks in relation to signal coherence," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, pp. 541–551, April 2012.
[12] D. P. Reichert and T. Serre, "Neuronal synchrony in complex-valued deep networks," CoRR, vol. abs/1312.6115, 2014.
[13] R. K. Srivastava, K. Greff, and J. Schmidhuber, "Training very deep networks," in Advances in Neural Information Processing Systems 28 (C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, eds.), pp. 2377–2385, Curran Associates, Inc., 2015.
[14] K. Cho, B. van Merrienboer, D. Bahdanau, and Y. Bengio, "On the properties of neural machine translation: Encoder-decoder approaches," in SSST@EMNLP, 2014.
[15] G. Shi, M. M. Shanechi, and P. Aarabi, "On the importance of phase in human speech recognition," IEEE Transactions on Audio, Speech and Language Processing, vol. 14, pp. 1867–1874, Sept. 2006.
[16] A. V. Oppenheim and J. S. Lim, "The importance of phase in signals," Proceedings of the IEEE, vol. 69, no. 5, pp. 529–541, 1981.
[17] I. Danihelka, G. Wayne, B. Uria, N. Kalchbrenner, and A. Graves, "Associative long short-term memory," in Proceedings of the 33rd International Conference on Machine Learning - Volume 48, ICML'16, pp. 1986–1994, JMLR.org, 2016.
[18] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, 2016.
[19] M. Arjovsky, A. Shah, and Y. Bengio, "Unitary evolution recurrent neural networks," in Proceedings of the 33rd International Conference on Machine Learning - Volume 48, ICML'16, pp. 1120–1128, JMLR.org, 2016.
[20] S. Wisdom, T. Powers, J. R. Hershey, J. L. Roux, and L. Atlas, "Full-capacity unitary recurrent neural networks," in Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS'16, (USA), pp. 4887–4895, Curran Associates Inc., 2016.
[21] G. Cybenko, "Approximation by superpositions of a sigmoidal function," 1989.
[22] A. Barron, "Approximation and estimation bounds for artificial neural networks," vol. 14, pp. 243–249, 1991.
[23] K.-I. Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Networks, vol. 2, no. 3, pp. 183–192, 1989.
[24] K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, no. 5, pp. 359–366, 1989.
[25] N. Benvenuto and F. Piazza, "On the complex backpropagation algorithm," IEEE Transactions on Signal Processing, vol. 40, pp. 967–969, April 1992.
[26] D. Hayakawa, T. Masuko, and H. Fujimura, "Applying complex-valued neural networks to acoustic modeling for speech recognition," in 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1725–1731, Nov 2018.
[27] Y. E. Acar, M. Ceylan, and E. Yaldiz, "An examination on the effect of CVNN parameters while classifying the real-valued balanced and unbalanced data," in 2018 International Conference on Artificial Intelligence and Data Processing (IDAP), pp. 1–5, Sep. 2018.
[28] Y. Ishizuka, S. Murai, Y. Takahashi, M. Kawai, Y. Taniai, and T. Naniwa, "Modeling walking behavior of powered exoskeleton based on complex-valued neural network," in 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1927–1932, Oct 2018.
[29] Q. Yi, L. Xiao, Y. Zhang, B. Liao, L. Ding, and H. Peng, "Nonlinearly activated complex-valued gradient neural network for complex matrix inversion," in 2018 Ninth International Conference on Intelligent Control and Information Processing (ICICIP), pp. 44–48, Nov 2018.
[30] H. H. Çevik, Y. E. Acar, and M. Çunkaş, "Day ahead wind power forecasting using complex valued neural network," in 2018 International Conference on Smart Energy Systems and Technologies (SEST), pp. 1–6, Sep. 2018.
[31] D. Gleich and D. Sipos, "Complex valued convolutional neural network for TerraSAR-X patch categorization," in EUSAR 2018; 12th European Conference on Synthetic Aperture Radar, pp. 1–4, June 2018.
[32] W. Gong, J. Liang, and D. Li, "Design of high-capacity auto-associative memories based on the analysis of complex-valued neural networks," in 2017 International Workshop on Complex Systems and Networks (IWCSN), pp. 161–168, Dec 2017.
[33] J. Zhang and Y. Wu, "A new method for automatic sleep stage classification," IEEE Transactions on Biomedical Circuits and Systems, vol. 11, pp. 1097–1110, Oct 2017.
[34] C. Popa, "Complex-valued convolutional neural networks for real-valued image classification," in 2017 International Joint Conference on Neural Networks (IJCNN), pp. 816–822, May 2017.
[35] S. Liu, M. Xu, J. Wang, F. Lu, W. Zhang, H. Tian, and G. Chang, "A multilevel artificial neural network nonlinear equalizer for millimeter-wave mobile fronthaul systems," Journal of Lightwave Technology, vol. 35, pp. 4406–4417, Oct 2017.
[36] M. Peker, B. Sen, and D. Delen, "A novel method for automated diagnosis of epilepsy using complex-valued classifiers," IEEE Journal of Biomedical and Health Informatics, vol. 20, pp. 108–118, Jan 2016.
[37] S. Amilia, M. D. Sulistiyo, and R. N. Dayawati, "Face image-based gender recognition using complex-valued neural network," in 2015 3rd International Conference on Information and Communication Technology (ICoICT), pp. 201–206, May 2015.
[38] Y. Maeda, T. Fujiwara, and H. Ito, "Robot control using high dimensional neural networks," in 2014 Proceedings of the SICE Annual Conference (SICE), pp. 738–743, Sep. 2014.
[39] Y. Liu, H. Huang, and T. Huang, "Gain parameters based complex-valued backpropagation algorithm for learning and recognizing hand gestures," in 2014 International Joint Conference on Neural Networks (IJCNN), pp. 2162–2166, July 2014.
[40] R. F. Olanrewaju, O. Khalifa, A. Abdulla, and A. M. Z. Khedher, "Detection of alterations in watermarked medical images using fast Fourier transform and complex-valued neural network," in 2011 4th International Conference on Mechatronics (ICOM), pp. 1–6, May 2011.
[41] R. Haensch and O. Hellwich, "Complex-valued convolutional neural networks for object detection in PolSAR data," in 8th European Conference on Synthetic Aperture Radar, pp. 1–4, June 2010.
[42] H. Nait-Charif, "Complex-valued neural networks fault tolerance in pattern classification applications," in 2010 Second WRI Global Congress on Intelligent Systems, vol. 3, pp. 154–157, Dec 2010.
[43] T. Kitajima and T. Yasuno, "Output prediction of wind power generation system using complex-valued neural network," in Proceedings of SICE Annual Conference 2010, pp. 3610–3613, Aug 2010.
[44] S. Fukami, T. Ogawa, and H. Kanada, "Regularization for complex-valued network inversion," in 2008 SICE Annual Conference, pp. 1237–1242, Aug 2008.
[45] M. F. Amin, M. M. Islam, and K. Murase, "Single-layered complex-valued neural networks and their ensembles for real-valued classification problems," in 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 2500–2506, June 2008.
[46] I. Nishikawa, K. Sakakibara, T. Iritani, and Y. Kuroe, "2 types of complex-valued Hopfield networks and the application to a traffic signal control," in Proceedings. 2005 IEEE International Joint Conference on Neural Networks, vol. 2, pp. 782–787, July 2005.
[47] A. Yadav, D. Mishra, S. Ray, R. N. Yadav, and P. K. Kalra, "Representation of complex-valued neural networks: a real-valued approach," in Proceedings of 2005 International Conference on Intelligent Sensing and Information Processing, pp. 331–335, Jan 2005.
[48] M. Kataoka, M. Kinouchi, and M. Hagiwara, "Music information retrieval system using complex-valued recurrent neural networks," in SMC'98 Conference Proceedings. 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.98CH36218), vol. 5, pp. 4290–4295, Oct 1998.
[49] M. Kinouchi and M. Hagiwara, "Memorization of melodies by complex-valued recurrent network," in Proceedings of International Conference on Neural Networks (ICNN'96), vol. 2, pp. 1324–1328, June 1996.
[50] X. Tan, M. Li, P. Zhang, Y. Wu, and W. Song, "Complex-valued 3-D convolutional neural network for PolSAR image classification," IEEE Geoscience and Remote Sensing Letters, vol. 17, no. 6, pp. 1022–1026, 2020.
[51] G. M. Georgiou and C. Koutsougeras, "Complex domain backpropagation," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 39, pp. 330–334, May 1992.
[52] S. Hu, S. Nagae, and A. Hirose, "Millimeter-wave adaptive glucose concentration estimation with complex-valued neural networks," IEEE Transactions on Biomedical Engineering, pp. 1–1, 2018.
[53] S. Hu and A. Hirose, "Proposal of millimeter-wave adaptive glucose-concentration estimation system using complex-valued neural networks," in IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 4074–4077, July 2018.
[54] R. Hata and K. Murase, "Multi-valued autoencoders for multi-valued neural networks," in 2016 International Joint Conference on Neural Networks (IJCNN), pp. 4412–4417, July 2016.
[55] T. Ding and A. Hirose, "Fading channel prediction based on combination of complex-valued neural networks and chirp z-transform," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, pp. 1686–1695, Sep. 2014.
[56] Y. Suzuki and M. Kobayashi, "Complex-valued bidirectional auto-associative memory," in The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–7, Aug 2013.
[57] K. Mizote, H. Kishikawa, N. Goto, and S. Yanagiya, "Optical label routing processing for BPSK labels using complex-valued neural network," Journal of Lightwave Technology, vol. 31, pp. 1867–1876, June 2013.
[58] A. Y. H. Al-Nuaimi, M. Faijul Amin, and K. Murase, "Enhancing MP3 encoding by utilizing a predictive complex-valued neural network," in The 2012 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, June 2012.
[59] J. Hu, Z. Li, Z. Hu, D. Yao, and J. Yu, "Spam detection with complex-valued neural network using behavior-based characteristics," in 2008 Second International Conference on Genetic and Evolutionary Computing, pp. 166–169, Sep. 2008.
[60] I. Nishikawa, T. Iritani, and K. Sakakibara, "Improvements of the traffic signal control by complex-valued Hopfield networks," in The 2006 IEEE International Joint Conference on Neural Network Proceedings, pp. 459–464, July 2006.
[61] T. Nitta and Y. Kuroe, "Hyperbolic gradient operator and hyperbolic back-propagation learning algorithms," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, pp. 1689–1702, May 2018.
[62] Y. Kominami, H. Ogawa, and K. Murase, "Convolutional neural networks with multi-valued neurons," in 2017 International Joint Conference on Neural Networks (IJCNN), pp. 2673–2678, May 2017.
[63] D. P. Mandic, "Complex valued recurrent neural networks for noncircular complex signals," in 2009 International Joint Conference on Neural Networks, pp. 1987–1992, June 2009.
[64] A. I. Hanna and D. P. Mandic, "A normalised complex backpropagation algorithm," in 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. I-977–I-980, May 2002.
[65] T. Kim and T. Adali, "Fully complex backpropagation for constant envelope signal processing," in Neural Networks for Signal Processing X. Proceedings of the 2000 IEEE Signal Processing Society Workshop (Cat. No.00TH8501), vol. 1, pp. 231–240, Dec 2000.
[66] T. Kim and T. Adali, "Fully complex multi-layer perceptron network for nonlinear signal processing," Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, vol. 32, pp. 29–43, Aug 2002.
[67] S. Scardapane, S. Van Vaerenbergh, A. Hussain, and A. Uncini, "Complex-valued neural networks with nonparametric activation functions," IEEE Transactions on Emerging Topics in Computational Intelligence, pp. 1–11, 2018.
[68] M. Kobayashi, "Noise robust projection rule for hyperbolic Hopfield neural networks," IEEE Transactions on Neural Networks and Learning Systems, pp. 1–5, 2019.
[69] C. Popa, "Complex-valued deep Boltzmann machines," in 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8, July 2018.
[70] M. Kobayashi, "Stability of rotor Hopfield neural networks with synchronous mode," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, pp. 744–748, March 2018.
[71] Y. Kuroe, N. Hashimoto, and T. Mori, "On energy function for complex-valued neural networks and its applications," in Proceedings of the 9th International Conference on Neural Information Processing (ICONIP '02), vol. 3, pp. 1079–1083, Nov 2002.
[72] Q. Yuan, D. Li, Z. Wang, C. Liu, and C. He, "Channel estimation and pilot design for uplink sparse code multiple access system based on complex-valued sparse autoencoder," IEEE Access, pp. 1–1, 2019.
[73] L. Li, L. G. Wang, F. L. Teixeira, C. Liu, A. Nehorai, and T. J. Cui, "DeepNIS: Deep neural network for nonlinear electromagnetic inverse scattering," IEEE Transactions on Antennas and Propagation, vol. 67, pp. 1819–1825, March 2019.
[74] J. Gao, B. Deng, Y. Qin, H. Wang, and X. Li, "Enhanced radar imaging using a complex-valued convolutional neural network," IEEE Geoscience and Remote Sensing Letters, vol. 16, pp. 35–39, Jan 2019.
[75] S. Gu and L. Ding, "A complex-valued VGG network based deep learning algorithm for image recognition," in 2018 Ninth International Conference on Intelligent Control and Information Processing (ICICIP), pp. 340–343, Nov 2018.
[76] M. Matlacz and G. Sarwas, "Crowd counting using complex convolutional neural network," in 2018 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), pp. 88–92, Sep. 2018.
[77] C. Popa, "Deep hybrid real-complex-valued convolutional neural networks for image classification," in 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, July 2018.
[78] I. Shafran, T. Bagby, and R. J. Skerry-Ryan, "Complex evolution recurrent neural networks (ceRNNs)," in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5854–5858, April 2018.
[79] Y. Lee, C. Wang, S. Wang, J. Wang, and C. Wu, "Fully complex deep neural network for phase-incorporating monaural source separation," in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 281–285, March 2017.
[80] Y.-S. Lee, K. Yu, S.-H. Chen, and J.-C. Wang, "Discriminative training of complex-valued deep recurrent neural network for singing voice separation," in Proceedings of the 25th ACM International Conference on Multimedia, MM '17, (New York, NY, USA), pp. 1327–1335, ACM, 2017.
[81] M. Kobayashi, "O(2)-valued Hopfield neural networks," IEEE Transactions on Neural Networks and Learning Systems, pp. 1–6, 2019.
[82] I. Aizenberg and A. Gonzalez, "Image recognition using MLMVN and frequency domain features," in 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8, July 2018.
[83] I. Aizenberg and Z. Khaliq, "Analysis of EEG using multilayer neural network with multi-valued neurons," in 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), pp. 392–396, Aug 2018.
[84] M. Kobayashi, "Decomposition of rotor Hopfield neural networks using complex numbers," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, pp. 1366–1370, April 2018.
[85] P. Virtue, S. X. Yu, and M. Lustig, "Better than real: Complex-valued neural nets for MRI fingerprinting," in 2017 IEEE International Conference on Image Processing (ICIP), pp. 3953–3957, Sep. 2017.
[86] J. Ronghua, Z. Shulei, Z. Lihua, L. Qiuxia, and I. A. Saeed, "Prediction of soil moisture with complex-valued neural network," in 2017 29th Chinese Control And Decision Conference (CCDC), pp. 1231–1236, May 2017.
[87] I. Aizenberg, A. Ordukhanov, and F. O'Boy, "MLMVN as an intelligent image filter," in 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3106–3113, May 2017.
[88] M. Kobayashi, "Symmetric complex-valued Hopfield neural networks," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, pp. 1011–1015, April 2017.
[89] I. Aizenberg, A. Luchetta, S. Manetti, and M. C. Piccirilli, "System identification using FRA and a modified MLMVN with arbitrary complex-valued inputs," in 2016 International Joint Conference on Neural Networks (IJCNN), pp. 4404–4411, July 2016.
[90] C. Hacker, I. Aizenberg, and J. Wilson, "GPU simulator of multilayer neural network based on multi-valued neurons," in 2016 International Joint Conference on Neural Networks (IJCNN), pp. 4125–4132, July 2016.
[91] M. Catelani, L. Ciani, A. Luchetta, S. Manetti, M. C. Piccirilli, A. Reatti, and M. K. Kazimierczuk, "MLMVNN for parameter fault detection in PWM DC-DC converters and its applications for buck DC-DC converter," in 2016 IEEE 16th International Conference on Environment and Electrical Engineering (EEEIC), pp. 1–6, June 2016.
[92] E. Aizenberg and I. Aizenberg, "Batch linear least squares-based learning algorithm for MLMVN with soft margins," in 2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), pp. 48–55, Dec 2014.
[93] I. B. Păvăloiu, G. Dragoi, and A. Vasile, "Gradient-descent training for phase-based neurons," in 2014 18th International Conference on System Theory, Control and Computing (ICSTCC), pp. 874–878, Oct 2014.
[94] M. E. Valle, "An introduction to complex-valued recurrent correlation neural networks," in 2014 International Joint Conference on Neural Networks (IJCNN), pp. 3387–3394, July 2014.
[95] I. Aizenberg, "A modified error-correction learning rule for multilayer neural network with multi-valued neurons," in The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–8, Aug 2013.
[96] S. Wu and S. Lee, "Multi-valued neuron with new learning schemes," in The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–7, Aug 2013.
[97] J. Chen, S. Wu, and S. Lee, "Modified learning for discrete multi-valued neuron," in The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–6, Aug 2013.
[98] I. Aizenberg, S. Alexander, and J. Jackson, "Recognition of blurred images using multilayer neural network based on multi-valued neurons," in 2011 41st IEEE International Symposium on Multiple-Valued Logic, pp. 282–287, May 2011.
[99] I. Aizenberg, D. V. Paliy, J. M. Zurada, and J. T. Astola, "Blur identification by multilayer neural network based on multivalued neurons," IEEE Transactions on Neural Networks, vol. 19, pp. 883–898, May 2008.
[100] D. L. Lee, "Improving the capacity of complex-valued neural networks with a modified gradient descent learning rule," IEEE Transactions on Neural Networks, vol. 12, pp. 439–443, March 2001.
[101] I. Aizenberg, N. Aizenberg, C. Butakov, and E. Farberov, "Image recognition on the neural network based on multi-valued neurons," in Proceedings 15th International Conference on Pattern Recognition (ICPR-2000), vol. 2, pp. 989–992, Sep. 2000.
[102] I. Aizenberg, N. Aizenberg, T. Bregin, C. Butakov, and E. Farberov, "Image processing using cellular neural networks based on multi-valued and universal binary neurons," in Neural Networks for Signal Processing X. Proceedings of the 2000 IEEE Signal Processing Society Workshop (Cat. No.00TH8501), vol. 2, pp. 557–566, Dec 2000.
[103] N. N. Aizenberg, I. N. Aizenberg, and G. A. Krivosheev, "Multi-valued and universal binary neurons: mathematical model, learning, networks, application to image processing and pattern recognition," in Proceedings of 13th International Conference on Pattern Recognition, vol. 4, pp. 185–189, Aug 1996.
[104] O. Fink, E. Zio, and U. Weidmann, "Predicting component reliability and level of degradation with complex-valued neural networks," Reliability Engineering & System Safety, vol. 121, pp. 198–206, 2014.
[105] S. Jankowski, A. Lozowski, and J. M. Zurada, "Complex-valued multistate neural associative memory," IEEE Transactions on Neural Networks, vol. 7, pp. 1491–1496, Nov 1996.
[106] G. Tanaka and K. Aihara, "Complex-valued multistate associative memory with nonlinear multilevel functions for gray-level image reconstruction," IEEE Transactions on Neural Networks, vol. 20, pp. 1463–1473, Sept. 2009.
[107] I. Aizenberg, P. Ruusuvuori, O. Yli-Harja, and J. Astola, "Multilayer neural network based on multi-valued neurons (MLMVN) applied to classification of microarray gene expression data," pp. 27–30, 2006.
[108] L. Ding, L. Xiao, K. Zhou, Y. Lan, Y. Zhang, and J. Li, "An improved complex-valued recurrent neural network model for time-varying complex-valued Sylvester equation," IEEE Access, vol. 7, pp. 19291–19302, 2019.
[109] A. Marseet and F. Sahin, "Application of complex-valued convolutional neural network for next generation wireless networks," in 2017 IEEE Western New York Image and Signal Processing Workshop (WNYISPW), pp. 1–5, Nov 2017.
[110] M. Wilmanski, C. Kreucher, and A. Hero, "Complex input convolutional neural networks for wide angle SAR ATR," in 2016 IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 1037–1041, Dec 2016.
[111] N. N. Aizenberg, Y. L. Ivas'kiv, D. A. Pospelov, and G. F. Khudyakov, "Multivalued threshold functions," Cybernetics, vol. 9, pp. 61–77, Jan 1973.
[112] N. N. Ajzenberg and Živko Tošić, "A generalization of the threshold functions," Publikacije Elektrotehničkog fakulteta. Serija Matematika i fizika, no. 381/409, pp. 97–99, 1972.
[113] T. L. Clarke, "Generalization of neural networks to the complex plane," in 1990 IJCNN International Joint Conference on Neural Networks, pp. 435–440, June 1990.
[114] Y. Kuroe, M. Yoshida, and T. Mori, "On activation functions for complex-valued neural networks - existence of energy functions," in Artificial Neural Networks and Neural Information Processing - ICANN/ICONIP 2003 (O. Kaynak, E. Alpaydin, E. Oja, and L. Xu, eds.), (Berlin, Heidelberg), pp. 985–992, Springer Berlin Heidelberg, 2003.
[115] A. J. Noest, "Associative memory in sparse phasor neural networks," Europhysics Letters (EPL), vol. 6, pp. 469–474, July 1988.
[116] T. Miyajima, F. Baisho, and K. Yamanaka, "A phasor model with resting states," IEICE Transactions on Information and Systems, vol. E83-D, pp. 299–301, 2000.
[117] A. Hirose, "Dynamics of fully complex-valued neural networks," Electronics Letters, vol. 28, pp. 1492–1494, July 1992.
[118] I. Nemoto and K. Saito, "A complex-valued version of Nagumo-Sato model of a single neuron and its behavior," Neural Networks, vol. 15, pp. 833–853, Sept. 2002.
[119] R. S. Zemel, C. K. Williams, and M. C. Mozer, "Lending direction to neural networks," Neural Networks, vol. 8, no. 4, pp. 503–512, 1995.
[120] N. N. Aizenberg, I. N. Aizenberg, and G. A. Krivosheev, "Multi-valued neurons: Learning, networks, application to image recognition and extrapolation of temporal series," in From Natural to Artificial Neural Computation (J. Mira and F. Sandoval, eds.), (Berlin, Heidelberg), pp. 389–395, Springer Berlin Heidelberg, 1995.
[121] T. Kim and T. Adali, "Complex backpropagation neural network using elementary transcendental activation functions," vol. 2, pp. 1281–1284, 2001.
[122] T. Kim and T. Adali, "Approximation by fully complex multilayer perceptrons," Neural Computation, vol. 15, pp. 1641–1666, July 2003.
[123] I. Cha and S. A. Kassam, "Channel equalization using adaptive complex radial basis function networks," IEEE Journal on Selected Areas in Communications, vol. 13, pp. 122–131, Jan 1995.
[124] S. Chen, S. McLaughlin, and B. Mulgrew, "Complex-valued radial basis function network, part I: Network architecture and learning algorithms," Signal Processing, vol. 35, pp. 19–31, Jan. 1994.
[125] S. Chen, S. McLaughlin, and B. Mulgrew, "Complex-valued radial basis function network, part II: Application to digital communications channel equalisation," Signal Processing, vol. 36, pp. 175–188, Mar. 1994.
[126] D. Jianping, N. Sundararajan, and P. Saratchandran, "Communication channel equalization using complex-valued minimal radial basis function neural networks," IEEE Transactions on Neural Networks, vol. 13, pp. 687–696, May 2002.
[127] A. Uncini, L. Vecci, P. Campolucci, and F. Piazza, "Complex-valued neural networks with adaptive spline activation function for digital-radio-links nonlinear equalization," IEEE Transactions on Signal Processing, vol. 47, pp. 505–514, Feb 1999.
[128] M. Scarpiniti, D. Vigliano, R. Parisi, and A. Uncini, "Generalized splitting functions for blind separation of complex signals," Neurocomputing, vol. 71, pp. 2245–2270, June 2008.
[129] R. Hahnloser, R. Sarpeshkar, M. Mahowald, R. Douglas, and H. Seung, "Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit," Nature, vol. 405, pp. 947–951, 2000.
[130] C. Trabelsi, O. Bilaniuk, Y. Zhang, D. Serdyuk, S. Subramanian, J. F. Santos, S. Mehri, N. Rostamzadeh, Y. Bengio, and C. J. Pal, "Deep complex networks," CoRR, vol. abs/1705.09792, 2018.
[131] R. Savitha, S. Suresh, and N. Sundararajan, "Projection-based fast learning fully complex-valued relaxation neural network," IEEE Transactions on Neural Networks and Learning Systems, vol. 24, pp. 529–541, April 2013.
[132] M. S. Kim and C. C. Guest, "Modification of backpropagation networks for complex-valued signal processing in frequency domain," in 1990 IJCNN International Joint Conference on Neural Networks, pp. 27–31, June 1990.
[133] R.-C. Huang and M.-S. Chen, "Adaptive equalization using complex-valued multilayered neural network based on the extended Kalman filter," vol. 1, pp. 519–524, 2000.
[134] M. Ceylan, N. Çetinkaya, R. Ceylan, and Y. Özbay, "Comparison of complex-valued neural network and fuzzy clustering complex-valued neural network for load-flow analysis," vol. 3949, pp. 92–99, 2005.
[135] C. Tay, K. Tanizawa, and A. Hirose, "Error reduction in holographic movies using a hybrid learning method in coherent neural networks," vol. 47, pp. 884–893, 2007.
[136] H. Tsuzuki, M. Kugler, S. Kuroyanagi, and A. Iwata, "An approach for sound source localization by complex-valued neural network," IEICE Transactions on Information and Systems, vol. E96.D, pp. 2257–2265, 2013.
[137] H. Zhang and D. P. Mandic, "Is a complex-valued stepsize advantageous in complex-valued gradient learning algorithms?," IEEE Transactions on Neural Networks and Learning Systems, vol. 27, pp. 2730–2735, Dec 2016.
[138] D. H. Popovic and D. P. Mandic, "Complex-valued estimation of wind profile and wind power," in Proceedings of the 12th IEEE Mediterranean Electrotechnical Conference (IEEE Cat. No.04CH37521), vol. 3, pp. 1037–1040, May 2004.
[139] I. Aizenberg and C. Moraga, "Multilayer feedforward neural network based on multi-valued neurons (MLMVN) and a backpropagation learning algorithm," Soft Computing, vol. 11, pp. 169–183, 2007.
[140] B. Shamima, R. Savitha, S. Suresh, and S. Saraswathi, "Protein secondary structure prediction using a fully complex-valued relaxation network," in The 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1–8, Aug 2013.
[141] I. Aizenberg, D. V. Paliy, J. M. Zurada, and J. T. Astola, "Blur identification by multilayer neural network based on multivalued neurons," IEEE Transactions on Neural Networks, vol. 19, pp. 883–898, May 2008.
[142] I. Aizenberg, N. Aizenberg, T. Bregin, C. Butakoff, E. Farberov, N. Merzlyakov, and O. Milukova, "Blur recognition on the neural network based on multi-valued neurons," Journal of Image and Graphics. Proc. Intl. Conf. on Image and Graphics (ICIG), vol. 5, pp. 127–130, 2000.
[143] H. Leung and S. Haykin, "The complex backpropagation algorithm," IEEE Transactions on Signal Processing, vol. 39, pp. 2101–2104, Sep. 1991.
[144] A. van den Bos, "Complex gradient and Hessian," IEE Proceedings - Vision, Image and Signal Processing, vol. 141, pp. 380–383, Dec 1994.
[145] S. Haykin, Adaptive Filter Theory. Pearson, 2014.
[146] S. L. Goh and D. P. Mandic, "An augmented CRTRL for complex-valued recurrent neural networks," Neural Networks, vol. 20, pp. 1061–1066, 2008.
[147] A. Hirose and H. Onishi, "Proposal of relative minimization learning for behavior stabilization of complex-valued recurrent neural networks," vol. 24, no. 1-3, pp. 163–171, Elsevier, 1999.
[148] D. Mandic, S. Javidi, S. Goh, A. Kuh, and K. Aihara, "Complex-valued prediction of wind profile using augmented complex statistics," Renewable Energy, vol. 34, 2008.
[149] M. Solazzi, A. Uncini, E. Di Claudio, and R. Parisi, "Complex discriminative learning Bayesian neural equalizer," Signal Processing, vol. 81, pp. 2493–2502, 2002.
[150] N. Benvenuto, M. Marchesi, F. Piazza, and A. Uncini, "Non linear satellite radio links equalized using blind neural networks," pp. 1521–1524, 1991.
[151] A. B. Suksmono and A. Hirose, "Adaptive beamforming by using complex-valued multilayer perceptron," in Artificial Neural Networks and Neural Information Processing - ICANN/ICONIP 2003, (Berlin, Heidelberg), pp. 959–966, Springer Berlin Heidelberg, 2003.
[152] A. Hirose and M. Kiuchi, "Coherent optical associative memory system that processes complex-amplitude information," IEEE Photonics Technology Letters, vol. 12, pp. 564–566, 2000.
[153] K. Oyama and A. Hirose, "Adaptive phase-singular-unit restoration with entire-spectrum-processing complex-valued neural networks in interferometric SAR," Electronics Letters, vol. 54, no. 1, pp. 43–45, 2018.
[154] Y. Chistyakov, E. Kholodova, A. Minin, H. Zimmermann, and A. Knoll, "Modeling of electric power transformer using complex-valued neural networks," Energy Procedia, vol. 12, pp. 638–647, 2011.
[155] A. Minin, Y. Chistyakov, E. Kholodova, H. Zimmermann, and A. Knoll, "Complex-valued open recurrent neural network for power transformer modeling," Int. J. Appl. Math. Inform., vol. 6, pp. 41–48, 2012.
[156] M. Miyauchi, M. Seki, A. Watanabe, and A. Miyauchi, "Interpretation of optical flow through complex neural network," in New Trends in Neural Computation (J. Mira, J. Cabestany, and A. Prieto, eds.), (Berlin, Heidelberg), pp. 645–650, Springer Berlin Heidelberg, 1993.
[157] M. Miyauchi and M. Seki, "Interpretation of optical flow through neural network learning," in [Proceedings] Singapore ICCS/ISITA '92, pp. 1247–1251, 1992.
[158] A. Hirose, T. Higo, and K. Tanizawa, "Holographic three-dimensional movie generation with frame interpolation using coherent neural networks," pp. 492–497, 2006.
[159] A. Hirose, T. Higo, and K. Tanizawa, "Efficient generation of holographic movies with frame interpolation using a coherent neural network," IEICE Electronics Express, vol. 3, pp. 417–423, 2006.
[160] X. Yao, X. Shi, and F. Zhou, "Complex-value convolutional neural network for classification of human activities," in 2019 6th Asia-Pacific Conference on Synthetic Aperture Radar (APSAR), pp. 1–6, 2019.
[161] X. Yao, X. Shi, and F. Zhou, "Human activities classification based on complex-value convolutional neural network," IEEE Sensors Journal, vol. 20, no. 13, pp. 7169–7180, 2020.
[162] T. Dong and T. Huang, "Neural cryptography based on complex-valued neural network," IEEE Transactions on Neural Networks and Learning Systems, pp. 1–6, 2019.
[163] A. F. Rahman, W. G. Howells, and M. C. Fairhurst, "A multiexpert framework for character recognition: A novel application of Clifford networks," IEEE Transactions on Neural Networks, vol. 12, pp. 101–112, Jan. 2001.
[164] Q. Sun, X. Li, L. Li, X. Liu, F. Liu, and L. Jiao, "Semi-supervised complex-valued GAN for polarimetric SAR image classification," in IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium, pp. 3245–3248, 2019.
[165] T. Anderson, "Split complex convolutional neural networks," 2017.
[166] J. Bassey, X. Li, and L. Qian, "An experimental study of multi-layer multi-valued neural network," in 2019 2nd International Conference on Data Intelligence and Security (ICDIS), pp. 233–236, 2019.
[167] N. Monning and S. Manandhar, "Evaluation of complex-valued neural networks on real-valued classification tasks," CoRR, vol. abs/1811.12351, 2018.