Eng Int Syst (2010) 1: 1–9 © 2010 CRL Publishing Ltd Engineering Intelligent Systems

Initialisation and training procedures for wavelet networks applied to chaotic time series

V. Alarcon-Aquino1, O. Starostenko1, J. M. Ramirez-Cortes2, P. Gomez-Gil3, E. S. Garcia-Treviño1

1Communications and Signal Processing Research Group, Department of Computing, Electronics, and Mechatronics, Universidad de las Americas Puebla, Sta. Catarina Mártir. Cholula, Puebla. 72820. MEXICO. E-mail: [email protected] 2Department of Electronics Engineering 3Department of Computer Science, National Institute for Astrophysics, Optics, and Electronics, Tonantzintla, Puebla, MEXICO

Wavelet networks are a class of neural networks that take advantage of the good localisation properties of multi-resolution analysis and combine them with the approximation abilities of neural networks. This kind of network uses wavelets as activation functions in the hidden layer, and a type of back-propagation algorithm is used for its learning. However, the training procedure used for wavelet networks relies on continuously differentiable wavelets, and some of the most powerful and widely used wavelets do not satisfy this property. In this paper we report an algorithm for initialising and training wavelet networks applied to the approximation of chaotic time series. The proposed algorithm, which has its foundations in correlation analysis of signals, allows the use of different types of wavelets, namely Daubechies, Coiflets, and Symmlets. To show this, comparisons are made for chaotic time series approximation between the proposed approach and the typical wavelet network.

Keywords: Wavelet networks, wavelets, approximation theory, multi-resolution analysis, chaotic time series.

1. INTRODUCTION

Wavelet neural networks are a novel and powerful class of neural networks that incorporate the most important advantages of multi-resolution analysis (MRA) [3, 4, 7]. Several authors have found a link between the wavelet decomposition theory and neural networks (see e.g. [1–3, 5, 6, 9, 13, 21]). They combine the good localisation properties of wavelets with the approximation abilities of neural networks. This kind of network uses wavelets as activation functions in the hidden layer, and a type of back-propagation algorithm is used for its learning. These networks preserve all the features of common neural networks, such as universal approximation properties, but in addition present an explicit link between the network coefficients and an appropriate transform.

Recently, several studies have been looking for better ways to design neural networks. For this purpose they have analysed the relationship between neural networks, approximation theory, and functional analysis. In functional analysis any continuous function can be represented as a weighted sum of orthogonal basis functions. Such expansions can be easily represented as neural networks which can be designed for the desired error rate using the properties of orthonormal expansions [5]. Unfortunately, most orthogonal functions are global approximators and suffer from the disadvantage mentioned above. In order to take full advantage of the orthonormality of basis functions and of localised learning, we need a set of basis functions which are local and orthogonal. Wavelets are functions with these features. In wavelet theory we can build simple orthonormal bases with good localisation properties. Wavelets are a family of basis functions that combine powerful properties such as orthogonality, compact support, localisation in time and frequency, and fast algorithms. Wavelets have generated tremendous interest in both theoretical and applied areas, especially over the past few years [3, 4]. Wavelet-based methods have been used in approximation theory, compression, time-series prediction, numerical analysis, computer science, electrical engineering, and physics (see e.g. [3, 5, 10–13, 20–25]).

Wavelet networks are a class of neural networks that employ wavelets as activation functions. They have recently been investigated as an alternative approach to traditional neural networks with sigmoidal activation functions, and have attracted great interest because of their advantages over other network schemes (see e.g. [6, 9–13, 16, 17, 20–24]). In [20, 22] the authors use wavelet decomposition and separate neural networks to capture the features of the analysed time series. In [13] a wavelet network control is proposed to learn and cancel repetitive errors in disk drives online. In [21, 23] an adaptive wavelet network control is proposed for online structure adjustment and parameter updating, applied to a class of nonlinear dynamic systems with a partially known structure. The latter approaches are based on the work of Zhang and Benveniste [1], which introduces a (1 + 1/2)-layer neural network based on wavelets. The basic idea is to replace the neurons by more powerful computing units obtained by cascading the wavelet transform. The wavelet network learning is performed by a standard back-propagation type algorithm, as in traditional feed-forward neural networks. It was proven that a wavelet network is a universal approximator that can approximate any function with arbitrary precision by a linear combination of father and mother wavelets [1, 4].

The main purpose of the work reported in this paper is twofold: first, to modify the wavelet network training and initialisation procedures to allow the use of all types of wavelets; and second, to improve the wavelet network performance with these two new procedures. The most important difficulty is that typical wavelet networks [1–3, 23] use a gradient descent algorithm for their training. Gradient descent methods require a wavelet that is continuously differentiable with respect to its dilation and translation parameters, and some of the most powerful and widely used wavelets are not analytically differentiable [4, 6, 7]. As a result, we have to seek alternative methods to initialise and to train the network; that is, methods that make it possible to work with different types of wavelets of different supports, differentiable and non-differentiable, orthogonal and non-orthogonal. In the work reported in this paper we propose a new training algorithm based on concepts of direct minimisation techniques, wavelet dilations, and linear combination theory. The proposed initialisation method has its foundations in correlation analysis of signals, and therefore a denser, adaptable grid is introduced. The term adaptable is used because the proposed initialisation grid depends upon the effective support of the wavelet. In particular, we present wavelet networks applied to the approximation of chaotic time series. Function approximation involves estimating the underlying relationship from a given finite input-output data set, and it has been the fundamental problem for a variety of applications in pattern classification, signal reconstruction, and system identification [3].

The remainder of this paper is organised as follows. Section 2 briefly reviews wavelet theory. Section 3 describes the typical wavelet network structure. In Section 4, the proposed initialisation and training procedures are reported. In Section 5 comparisons are made and discussed between the typical wavelet network and the proposed approach using chaotic time series. Finally, Section 6 presents the conclusions of this work.

2. REVIEW OF WAVELET THEORY

Wavelet transforms involve representing a general function in terms of simple, fixed building blocks at different scales and positions. These building blocks are generated from a single fixed function, called the mother wavelet, by translation and dilation operations. The continuous wavelet transform considers a family

\psi_{a,b}(x) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{x-b}{a}\right)    (1)

where a \in \mathbb{R}^{+}, b \in \mathbb{R}, with a \neq 0, and \psi(\cdot) satisfies the admissibility condition [7]. For discrete wavelets the scale (or dilation) and translation parameters in Eq. (1) are chosen such that at level m the wavelet a_0^{m}\psi(a_0^{-m}x) is a_0^{m} times the width of \psi(x). That is, the scale parameter is \{a = a_0^{m} : m \in \mathbb{Z}\} and the translation parameter is \{b = k b_0 a_0^{m} : m, k \in \mathbb{Z}\}. This family of wavelets is thus given by

\psi_{m,k}(x) = a_0^{-m/2}\,\psi(a_0^{-m}x - k b_0)    (2)

so the discrete version of the wavelet transform is

d_{m,k} = \langle f(x), \psi_{m,k}(x) \rangle = a_0^{-m/2} \int_{-\infty}^{+\infty} f(x)\,\psi(a_0^{-m}x - k b_0)\,dx    (3)

where \langle \cdot, \cdot \rangle denotes the L^2 inner product. To recover f(x) from the coefficients \{d_{m,k}\}, the following stability condition should hold [4, 7]

A\,\|f(x)\|^2 \le \sum_{m \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} |\langle f(x), \psi_{m,k}(x) \rangle|^2 \le B\,\|f(x)\|^2    (4)

with A > 0 and B < \infty for all signals f(x) in L^2(\mathbb{R}), where A and B denote the frame bounds. These frame bounds can be computed from a_0, b_0 and \psi(x) [7]. The reconstruction formula is thus given by

f(x) \approx \frac{2}{A+B} \sum_{m \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} \langle f(x), \psi_{m,k}(x) \rangle\,\psi_{m,k}(x)    (5)

Note that the closer A and B are, the more accurate the reconstruction. When A = B = 1, the family of wavelets forms an orthonormal basis [7]. The mother wavelet function \psi(x) and the scaling a_0 and translation b_0 parameters are specifically chosen such that the \psi_{m,k}(x) constitute orthonormal bases for L^2(\mathbb{R}) [4, 7].

To form orthonormal bases with good time-frequency localisation properties, the time-scale parameters (b, a) are sampled on a so-called dyadic grid in the time-scale plane, namely a_0 = 2 and b_0 = 1 [4, 7]. Thus, substituting these values in Eq. (2), we have a family of orthonormal bases,

\psi_{m,k}(x) = 2^{-m/2}\,\psi(2^{-m}x - k)    (6)

Using Eq. (3), the orthonormal wavelet transform is thus given by

d_{m,k} = \langle f(x), \psi_{m,k}(x) \rangle = 2^{-m/2} \int_{-\infty}^{+\infty} f(x)\,\psi(2^{-m}x - k)\,dx    (7)

and the reconstruction formula is obtained from Eq. (5). A formal approach to constructing orthonormal bases is provided by MRA [4]. The idea of MRA is to write a function f(x) as a limit of successive approximations, each of which is a smoother version of f(x). The successive approximations thus correspond to different resolutions [4]. Since the idea of MRA is to write a signal f(x) as a limit of successive approximations, the difference between two successive smooth approximations, at resolutions 2^{m-1} and 2^{m}, gives the detail signal at resolution 2^{m}. In other words, after choosing an initial resolution L, any signal f(x) \in L^2(\mathbb{R}) can be expressed as [4, 7]

f(x) = \sum_{k \in \mathbb{Z}} c_{L,k}\,\phi_{L,k}(x) + \sum_{m=L}^{\infty} \sum_{k \in \mathbb{Z}} d_{m,k}\,\psi_{m,k}(x)    (8)

where the detail or wavelet coefficients \{d_{m,k}\} are given by Eq. (7), while the approximation or scaling coefficients are defined by

c_{L,k} = 2^{-L/2} \int_{-\infty}^{+\infty} f(x)\,\phi(2^{-L}x - k)\,dx    (9)

Figure 1 Modified (1 + 1/2)-layer wavelet neural network.

Figure 2 Dyadic grid for wavelet network initialisation.

Equations (7) and (9) express that a signal f(x) is decomposed into details \{d_{m,k}\} and approximations \{c_{L,k}\}, which together form an MRA of the signal [4].
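To make Eqs. (7)–(8) concrete, the short sketch below numerically approximates a few dyadic detail coefficients d_{m,k} as inner products between a toy signal and translated, dilated wavelets. The Gaussian first-derivative wavelet, the test signal, and the grid sizes are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np

def gauss1(x):
    # First derivative of the Gaussian, used here as the analysing wavelet.
    return -x * np.exp(-0.5 * x ** 2)

def detail_coefficient(f, x, m, k):
    # Eq. (7): d_{m,k} = 2^{-m/2} * integral of f(x) * psi(2^{-m} x - k) dx,
    # approximated by a Riemann sum on the sampling grid x.
    dx = x[1] - x[0]
    return 2.0 ** (-m / 2.0) * np.sum(f * gauss1(2.0 ** (-m) * x - k)) * dx

# Toy signal on a fine grid (illustrative only).
x = np.linspace(-4.0, 4.0, 4001)
f = np.sin(2.0 * np.pi * x) * np.exp(-0.1 * x ** 2)

for m in (0, 1, 2):
    for k in (-1, 0, 1):
        print(f"d_({m},{k}) = {detail_coefficient(f, x, m, k): .4f}")
```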

3. DESCRIPTION OF WAVELET NETWORKS

Based on the so-called (1 + 1/2)-layer neural network, Zhang and Benveniste introduced the general wavelet network structure [1]. In [2, 23] the authors presented a modified version of this network. The main difference between these approaches is that a parallel linear term is introduced to help the learning of the linear relation between the input and output signals. The wavelet network architecture improved in [2] and [23] follows the form shown in Figure 1. The equation that defines the network is given by

f(x) = \sum_{i=1}^{N} \omega_i\,\psi[d_i(x - t_i)] + cx + \bar{f}    (10)

where x is the input, f is the output, the t_i are the biases of the neurons, \psi is the activation function, and the d_i and \omega_i are the first-layer and second-layer (1/2-layer) coefficients, respectively. It is important to note that \bar{f} is an additional and redundant parameter, introduced to ease the learning of nonzero-mean functions, since the wavelet \psi(x) is zero mean [4, 7]. Note that Eq. (10) is based on Eq. (1), and therefore the parameters d_i and t_i can be interpreted as the inverted dilation and translation variables, respectively. In this kind of neural network, each neuron is replaced by a wavelet, and the translation and dilation parameters are iteratively adjusted according to the given function to be approximated.
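As an illustration of Eq. (10), the following minimal sketch evaluates the output of such a network for a batch of inputs; the number of neurons, the Gaussian first-derivative activation, and the parameter values are assumptions chosen only for the example.

```python
import numpy as np

def gauss1(z):
    # First derivative of the Gaussian, the activation used in [1-3, 23].
    return -z * np.exp(-0.5 * z ** 2)

def wavenet_output(x, omega, d, t, c, f_bar):
    # Eq. (10): f(x) = sum_i omega_i * psi(d_i * (x - t_i)) + c*x + f_bar
    x = np.atleast_1d(x)
    z = d[None, :] * (x[:, None] - t[None, :])
    return gauss1(z) @ omega + c * x + f_bar

# Illustrative 4-neuron network (parameter values are not from the paper).
omega = np.array([0.8, -0.3, 0.5, 0.1])
d = np.array([2.0, 1.0, 1.0, 0.5])        # inverted dilations
t = np.array([-0.5, -0.25, 0.25, 0.5])    # translations (neuron biases)
print(wavenet_output(np.linspace(-1.0, 1.0, 5), omega, d, t, c=0.1, f_bar=0.0))
```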

3.1 Wavelet Networks Initialisation

In [1, 2, 23] two different initialisation procedures are proposed, both based on an idea in some way similar to the wavelet decomposition. Both divide the input domain of the signal following a dyadic grid of the form shown in Figure 2. This grid has its foundations in the use of the first derivative of the Gaussian wavelet, and it is a non-orthogonal grid because the support of the wavelet used, at a given dilation, is larger than the translation step at that dilation. The main difference between the two initialisation approaches presented in [1, 2, 23] is the way the wavelets are selected. In the first method, wavelets at higher dilations are chosen until the number of neurons of the network has been reached. In the particular case that the number of wavelet candidates at a given dilation exceeds the remaining neurons, they are randomly selected using the dyadic grid shown in Figure 2. It is important to note that this method does not take into account the input signal for the selection of the wavelets. On the other hand, in the second approach an iterative elimination procedure is used [2, 23]. That is, it is based on the least squares error between the output observations and the output of the network with all the wavelets of the initialisation grid. In each iteration, the wavelet contributing least to the least squares error is eliminated from the network, until the expected number of wavelets is left in the network. In both cases the number of levels (different dilations) depends upon the number of wavelets available in the network.

3.2 Training Algorithm

As stated previously, wavelet networks use a stochastic gradient type algorithm to adjust the network parameters. If all the parameters of the network (c, \bar{f}, the d_i, the t_i, and the \omega_i) are collected in a vector \theta, and using y_k to denote the original output signal and f_\theta the output of the network with the parameter vector \theta, the error function (cost function C) to be minimised is given by

C(\theta) = \frac{1}{2} E\{[f_\theta(x) - y]^2\}    (11)

This stochastic gradient algorithm recursively minimises the criterion (11) using input/output observations. The algorithm modifies the vector \theta after each measurement in the opposite direction of the gradient of

C(\theta, x_k, y_k) = \frac{1}{2} E\{[f_\theta(x_k) - y_k]^2\}    (12)

Because the traditional procedure used to train wavelet networks is based on the gradient of the error function with respect to each parameter of the network, differentiable activation functions are necessary. This is the main reason for the use of the first derivative of the Gaussian wavelet in [1–3, 23]: Gaussian wavelets are continuous and differentiable with respect to their dilation and translation parameters. In wavelet networks additional processing is needed to avoid divergence or poor convergence. This processing involves constraints that are applied after the modification of the parameter vector with the stochastic gradient, to project the parameter vector into the restricted domain of the signal.
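The sketch below shows what one such stochastic gradient update looks like for the network of Eq. (10) when the Gaussian first-derivative wavelet is used, so that the required derivatives exist. The gradient expressions follow from the chain rule applied to Eq. (12); the learning rate, parameter values, and the omission of the projection constraints are simplifying assumptions.

```python
import numpy as np

def gauss1(z):
    return -z * np.exp(-0.5 * z ** 2)

def gauss1_prime(z):
    # Derivative of the Gaussian first-derivative wavelet with respect to z.
    return (z ** 2 - 1.0) * np.exp(-0.5 * z ** 2)

def sgd_step(theta, x_k, y_k, rate):
    # One stochastic gradient update of C(theta, x_k, y_k) = 0.5*(f_theta(x_k)-y_k)^2.
    omega, d, t, c, f_bar = theta
    z = d * (x_k - t)
    err = gauss1(z) @ omega + c * x_k + f_bar - y_k      # f_theta(x_k) - y_k
    grad_omega = err * gauss1(z)
    grad_d = err * omega * gauss1_prime(z) * (x_k - t)
    grad_t = err * omega * gauss1_prime(z) * (-d)
    grad_c = err * x_k
    grad_fbar = err
    return (omega - rate * grad_omega, d - rate * grad_d,
            t - rate * grad_t, c - rate * grad_c, f_bar - rate * grad_fbar)

# Illustrative usage with a 3-neuron network (values are assumptions).
theta = (np.zeros(3), np.ones(3), np.array([-0.5, 0.0, 0.5]), 0.0, 0.0)
theta = sgd_step(theta, x_k=0.3, y_k=0.7, rate=0.05)
```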
4. PROPOSED CORRELATION-BASED INITIALISATION AND TRAINING PROCEDURES

The proposed initialisation procedure is based on the idea of correlation of signals. Correlation is a measure of the degree of interrelationship that exists between two signals [18]. From this point of view, a wavelet coefficient can be seen as a non-normalised linear correlation coefficient, because it is the result of the multiplication and integration of two signals. If the wavelet energy and the signal energy are both equal to one, then this coefficient may be interpreted as a normalised linear correlation coefficient. This coefficient represents the similarity between the wavelet and the signal; higher coefficients indicate more similarity [19]. The idea of the proposed initialisation approach consists, firstly, of the generation of the initialisation grid and, secondly, of the selection of the wavelets with the highest coefficients, computed using Eq. (7), according to the number of available neurons in the network. The grid is similar to the one proposed in [1–3], but in this case a much denser grid (along the translation axis) is used (see Figures 2 and 3). A dense grid is necessary so that, at each level of resolution, the whole signal domain can be covered by at least three different wavelet candidates. The number of wavelet candidates at a given resolution depends directly on the effective support of the wavelet. In the work reported in this paper, for comparative purposes and to underline the good performance of the new initialisation technique with respect to traditional wavelet network training, the first derivative of the Gaussian wavelet is used as well as other types of wavelets. The number of decomposition levels l is defined by the number of neurons of the network as follows:

l = \mathrm{int}(\log_2(N)) + 1    (13)

where N is the number of neurons and \mathrm{int}(\cdot) denotes the integer-part function. Note that for a network with one neuron the number of levels is set to one. The number of wavelet candidates at a given level is then obtained by

h_i = \frac{2}{a_i} + 1    (14)

where a_i = 2^{2-i} denotes the scale associated with each level i = 1, 2, \ldots, l. Note that equations (13) and (14) can be applied to any wavelet. Regarding the translation parameter of each wavelet candidate, it can be computed as

k_{i,j} = s - n_j a_i + 1    (15)

where k_{i,j} denotes the translation parameter of wavelet candidate j at decomposition level i, s represents the wavelet support, and n_j = 0, 1, \ldots, h_i - 1. For example, for the Daubechies wavelet Db2 with support [0, 3], we obtain a grid with four levels of decomposition whose scales are 2, 1, 0.5, 0.25, with 2, 3, 5 and 9 wavelet candidates respectively (see Figure 3). Table 1 shows the relationship among the number of neurons, the decomposition levels, and the scales associated with each level, up to a network of eight neurons.

Table 1 Relationship among neurons, decomposition levels, and scales.

Number of neurons   Decomposition levels   Scales            Wavelet candidates at a given resolution
1                   1                      2                 2
2                   2                      2, 1              2, 3
3                   2                      2, 1              2, 3
4                   3                      2, 1, 0.5         2, 3, 5
5                   3                      2, 1, 0.5         2, 3, 5
6                   3                      2, 1, 0.5         2, 3, 5
7                   3                      2, 1, 0.5         2, 3, 5
8                   4                      2, 1, 0.5, 0.25   2, 3, 5, 9
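A small sketch of how the adaptable grid of Eqs. (13)–(15) can be generated is given below; it reproduces the levels, scales, and candidate counts of Table 1, and applies the translation rule of Eq. (15) as literally stated in the text. The function name and the Db2 support value are inputs of the example, and the sketch is an interpretation rather than the authors' implementation.

```python
import numpy as np

def initialisation_grid(n_neurons, support):
    # Eq. (13): number of decomposition levels (at least one).
    levels = max(int(np.log2(n_neurons)) + 1, 1)
    grid = []
    for i in range(1, levels + 1):
        a_i = 2.0 ** (2 - i)               # scale associated with level i
        h_i = int(2.0 / a_i + 1)           # Eq. (14): candidates at this level
        # Eq. (15): translation of candidate j (taken literally from the text).
        k_i = [support - n_j * a_i + 1 for n_j in range(h_i)]
        grid.append((a_i, h_i, k_i))
    return grid

# Db2 example from the text: support length 3, eight-neuron network.
for a_i, h_i, k_i in initialisation_grid(n_neurons=8, support=3):
    print(f"scale {a_i:5.2f}: {h_i} candidates, translations {np.round(k_i, 2)}")
```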



Figure 3 (a) Daubechies wavelet Db2 and (b) dyadic grid for wavelet network initialisation for Db2.

Figure 4 (a) First derivative of the Gaussian wavelet and (b) dyadic grid for wavelet network initialisation for the first derivative of the Gaussian wavelet.

Figure 4 shows the initialisation grid for the Gaussian first-derivative wavelet. This is a non-orthogonal and dense grid because the support of the wavelet used, at a given dilation, is larger than the translation step at that dilation. This special characteristic causes consecutive translated wavelets to overlap. It is worth noting that at every level the wavelet candidates are distributed about the centre of the signal domain.

4.1 The Training Algorithm

The alternative training approach introduced in this work is based on the concept of direct minimisation techniques. The method works directly with the error function of the network but manipulates only the dilation and translation parameters when searching for the minimum of the cost function. The rest of the parameters (wavelet coefficients, linear term, and bias) are obtained via linear combination. It is important to note that, similarly to Eq. (10), the equation of the proposed network is based on the continuous wavelet transform, but in this case the parameter d_i corresponds directly to the dilation variable. That is, Eq. (10) can be rewritten as

f_\theta(x) = \sum_{i=1}^{N} \omega_i\,\psi\!\left(\frac{x - t_i}{d_i}\right) + cx + \bar{f}    (16)

If we use z_i = (x - t_i)/d_i, the network can be written as

f_\theta(x) = \omega_1\psi(z_1) + \ldots + \omega_N\psi(z_N) + cx + \bar{f}

Equation (16) can also be written as

f_\theta(x) = W\,\Psi    (17)

where

W = [\,\omega_1\ \omega_2\ \cdots\ \omega_N\ c\ \bar{f}\,], \qquad \Psi = [\,\psi(z_1)\ \psi(z_2)\ \cdots\ \psi(z_N)\ x\ 1\,]^{T}

and then finally

W = f_\theta(x)\,\Psi^{-1}    (18)

With this strategy the wavelet coefficients, the linear term, and the bias of the network are obtained at the same time. The error function is then given by

e = y_k - f_\theta(x_k)    (19)
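With the dilations and translations held fixed, Eqs. (17)–(18) determine the wavelet coefficients, linear term, and bias in one step. The sketch below illustrates this; because in practice several input/output samples are available, the inverse in Eq. (18) is realised here as a least-squares pseudo-inverse, which is an assumption about the implementation, as are the wavelet choice and the test data.

```python
import numpy as np

def gauss1(z):
    return -z * np.exp(-0.5 * z ** 2)

def design_matrix(x, d, t):
    # One row per sample: [psi(z_1) ... psi(z_N) x 1], the vector Psi of Eq. (17).
    z = (x[:, None] - t[None, :]) / d[None, :]
    return np.hstack([gauss1(z), x[:, None], np.ones((x.size, 1))])

def solve_linear_part(x, y, d, t):
    # Eq. (18) with many samples: W obtained by least squares instead of a direct inverse.
    psi = design_matrix(x, d, t)
    w, *_ = np.linalg.lstsq(psi, y, rcond=None)
    return w    # [omega_1 ... omega_N, c, f_bar]

# Illustrative usage; dilations and translations come from the initialisation grid.
x = np.linspace(-1.0, 1.0, 25)
y = np.sin(np.pi * x)
d = np.array([2.0, 1.0, 0.5])
t = np.array([-0.5, 0.0, 0.5])
W = solve_linear_part(x, y, d, t)
print("omega =", W[:-2], " c =", W[-2], " f_bar =", W[-1])
```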

The main idea of this method is the direct modification of the most contributive dilation or translation parameter of the network, and then, indirectly, the correction of the rest of the parameters via linear combination. For this task, the special properties of dilation and translation parameters in wavelet theory are taken into consideration. The purpose of this modification is to search for the highest similarity between the wavelets used to perform the analysis and the signal to be analysed. With higher similarity, fewer wavelets are needed for the approximation and it will be more precise. As stated previously, this method is based on the idea of direct minimisation techniques because the traditional method used to train typical wavelet networks [1–3, 23] requires the first derivatives of the error function with respect to all its parameters, and differentiability is not a property of all families of wavelets [6]. As in traditional optimisation techniques, the reported approach uses a dynamic learning rate, which is decreased if the error of the current iteration is poor with respect to the error of the last iteration, or increased otherwise.

The first step of the process is the generation of all possible combinations of dilation and translation parameters, modifying only one parameter at a time. Since we have N neurons and two parameters, the number of possible combinations is 2N. Searching in both directions, positive and negative, we then have 4N combinations. Writing dilation and translation square matrices whose size is defined by the number of neurons N of the network, we have

D_{i,i} = \begin{bmatrix} d_1 & d_1 & \cdots & d_1 \\ d_2 & d_2 & \cdots & d_2 \\ \vdots & \vdots & \ddots & \vdots \\ d_N & d_N & \cdots & d_N \end{bmatrix}, \qquad T_{i,i} = \begin{bmatrix} t_1 & t_1 & \cdots & t_1 \\ t_2 & t_2 & \cdots & t_2 \\ \vdots & \vdots & \ddots & \vdots \\ t_N & t_N & \cdots & t_N \end{bmatrix}    (20)

and, using the dilation matrix and the learning rate r, the square matrix of changes can be written as

\Delta_{i,i} = r\,\mathrm{diag}(D_{i,i}) = \begin{bmatrix} r d_1 & 0 & \cdots & 0 \\ 0 & r d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r d_N \end{bmatrix}    (21)

Concatenating the matrices of Eqs. (20) and (21) to form new matrices of size N \times M, where M = 4N and consequently j = 1, 2, \ldots, M, we have

D_{i,j} = [\,D_{i,i} + \Delta_{i,i} \quad D_{i,i} - \Delta_{i,i} \quad D_{i,i} \quad D_{i,i}\,],
T_{i,j} = [\,T_{i,i} \quad T_{i,i} \quad T_{i,i} + \Delta_{i,i} \quad T_{i,i} - \Delta_{i,i}\,]    (22)

Using the matrices of Eq. (22), the different outputs of the network for every combination of dilation and translation parameters are obtained,

f_{\theta_j}(x) = \sum_{i=1}^{N} \omega_i\,\psi\!\left(\frac{x - T_{i,j}}{D_{i,j}}\right) + cx + \bar{f}    (23)

where \omega_i, t_i, d_i, \bar{f}, c \in \mathbb{R} and i \in \mathbb{Z}. The second step of the process is the evaluation of the error described by Eq. (19) for every combination of Eq. (23). As a result, an error vector is obtained,

E_j = y_k - f_{\theta_j}(x_k) = [E_1, E_2, \ldots, E_{4N}] = [y_k - f_{\theta_1}(x_k), \ldots, y_k - f_{\theta_{4N}}(x_k)]    (24)

where x_k and y_k are the network input and the desired output, respectively, and f_{\theta_j} is the output of the wavelet neural network. Finally, by searching for the minimum error within this group of errors, the most contributive dilation or translation parameter is obtained. It is worth noting that the modification of both dilation and translation parameters depends exclusively on the dilation. This is due to the fact that the dilation parameter has a direct relationship with the wavelet support.
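The sketch below puts Eqs. (20)–(24) together for a single training iteration: each of the 4N candidates perturbs one dilation or translation by a step proportional to the corresponding dilation, the linear parameters are re-obtained for each candidate, and the modification with the smallest error is kept. The least-squares solve, the error measure over the whole batch, and the parameter values are assumptions made only to give a runnable illustration.

```python
import numpy as np

def gauss1(z):
    return -z * np.exp(-0.5 * z ** 2)

def fit_and_error(x, y, d, t):
    # Hold (d, t) fixed, obtain [omega, c, f_bar] by least squares (Eqs. 17-18)
    # and return the resulting sum of squared errors.
    psi = np.hstack([gauss1((x[:, None] - t[None, :]) / d[None, :]),
                     x[:, None], np.ones((x.size, 1))])
    w, *_ = np.linalg.lstsq(psi, y, rcond=None)
    return np.sum((y - psi @ w) ** 2)

def train_step(x, y, d, t, rate):
    # Eqs. (20)-(24): try the 4N single-parameter modifications (step r*d_i)
    # and keep the most contributive one, i.e. the one with minimum error.
    best_err, best_d, best_t = fit_and_error(x, y, d, t), d, t
    for i in range(d.size):
        for sign in (1.0, -1.0):
            step = sign * rate * d[i]
            d_cand = d.copy(); d_cand[i] += step     # dilation candidates
            t_cand = t.copy(); t_cand[i] += step     # translation candidates
            for dc, tc in ((d_cand, t), (d, t_cand)):
                err = fit_and_error(x, y, dc, tc)
                if err < best_err:
                    best_err, best_d, best_t = err, dc, tc
    return best_d, best_t, best_err

# Illustrative usage (initial dilations/translations would come from the grid).
x = np.linspace(-1.0, 1.0, 25)
y = np.sin(np.pi * x)
d = np.array([2.0, 1.0, 0.5])
t = np.array([-0.5, 0.0, 0.5])
d, t, err = train_step(x, y, d, t, rate=0.1)
print("error after one step:", err)
```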
5. PERFORMANCE EVALUATION

5.1 Chaotic Time Series

Chaos is the mathematical term for the behaviour of a system that is inherently unpredictable. Unpredictable phenomena are readily apparent in all areas of life [14]. Many systems in the natural world are now known to exhibit chaos or non-linear behaviour, the complexity of which is so great that they were previously considered random. The unravelling of these systems has been aided by the discovery, mostly in the last century, of mathematical expressions that exhibit similar tendencies [15]. One might argue that the many factors that influence this kind of system are the reason for this unpredictability, but chaos can occur in systems that have few degrees of freedom as well. The critical ingredient in many chaotic systems is what mathematicians call sensitive dependence on initial conditions: if one makes even the slightest change in the initial configuration of the system, the resulting behaviour may be dramatically different [14].

Chaos is part of an even grander subject known as dynamics. Whenever dynamical chaos is found, it is accompanied by nonlinearity. Naturally, an uncountable variety of non-linear relations is possible, depending perhaps on a multitude of parameters. These non-linear relations are frequently encountered in the form of difference equations, mappings, differential equations, partial differential equations, integral equations, or sometimes combinations of these. We note that, for each equation, the specific parameters were selected because they are the most representative values for chaotic behaviour and are also the most commonly used in the literature [8, 14, 15]. In this section we briefly describe the chaotic time series used in this work. They were considered because they are used as benchmarks and also because they illustrate how complex behaviour can easily be produced by simple equations with non-linear elements and feedback.

Hénon Attractor

The Hénon attractor was introduced by M. Hénon in 1976, and it was derived from a study of the trajectories of chaotic functions. The Hénon equation is defined by

\frac{dx}{dt} = a + by - x^2, \qquad \frac{dy}{dt} = x    (25)

where a = 1.4 and b = 0.3. Recall that these specific parameter values were selected because they are the most representative of chaotic behaviour, and also the most commonly used in the literature [8, 14, 15].

Lorenz Attractor

The Lorenz attractor was introduced by E. N. Lorenz in 1963, and it was derived from a simplified model of atmospheric interactions. The system is most commonly expressed as three coupled non-linear differential equations,

\frac{dx}{dt} = a(y - x), \qquad \frac{dy}{dt} = x(b - z) - y, \qquad \frac{dz}{dt} = xy - cz    (26)

where a = 10, b = 28, and c = 8/3.

Mackey-Glass

The Mackey-Glass system is a nonlinear time-delay differential equation described by

\frac{dx}{dt} = \frac{a\,x(t - T)}{1 + x^{c}(t - T)} - bx    (27)

where a = 0.2, b = 0.1, and c = 10. Depending on the values of the parameters, this equation displays a range of periodic and chaotic dynamics.
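A brief sketch of how the delay in Eq. (27) can be handled numerically is given below: past values x(t − T) are read from a history buffer while the equation is integrated with a simple Euler scheme. The integration step, the constant initial history, and the delay T = 18 (the value used for the experiments in Section 5.2) are assumptions of this illustration.

```python
import numpy as np

def mackey_glass(n_steps, T=18.0, dt=1.0, a=0.2, b=0.1, c=10.0, x0=1.2):
    # Euler integration of Eq. (27); x(t - T) is taken from a history buffer.
    delay = int(round(T / dt))
    x = np.full(n_steps + delay, x0)      # constant initial history
    for n in range(delay, n_steps + delay - 1):
        x_tau = x[n - delay]
        x[n + 1] = x[n] + dt * (a * x_tau / (1.0 + x_tau ** c) - b * x[n])
    return x[delay:]

series = mackey_glass(500)
print(series[:5])
```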

Rössler Attractor

The Rössler attractor was introduced by Otto Rössler in 1976; the equations, originally theoretical, were later found to be useful in modelling equilibrium in chemical reactions. The system is described by three non-linear ordinary differential equations,

\frac{dx}{dt} = -y - z, \qquad \frac{dy}{dt} = x + ay, \qquad \frac{dz}{dt} = b + z(x - c)    (28)

where a = 0.15, b = 0.20, and c = 10.

Figure 5 Lorenz attractor approximation with 25 samples, 10 neurons and 10 iterations: (a) wavenet I, (b) wavenet II.

Logistic Map

The logistic map was explored by ecologists and biologists, who used it to model population dynamics. The map was popularised in a seminal 1976 paper by the biologist Robert May. Mathematically, the logistic map is written

x_{n+1} = a x_n (1 - x_n)    (29)

where a = 4 and the initial condition is x(0) = 0.000104.

5.2 Simulation Results

We first state that, for this work and for testing purposes, the wavelet network reported in [1–3, 23] is referred to as wavenet I, whilst the wavelet network with the correlation-based initialisation and the most contributive parameter training procedure proposed in this work is referred to as wavenet II. In particular, the work reported in this paper presents wavelet networks applied to the approximation of chaotic time series, and for this purpose comparisons are made between wavenet I and wavenet II. Comparisons with the back-propagation network can be found in [16]. The wavenet I was tested with the first derivative of the Gaussian wavelet (Gauss1). On the other hand, the wavenet II was tested with the first and second derivatives of the Gaussian wavelet (denoted Gauss1 and Gauss2 respectively) as well as the Daubechies wavelet 2 (Db2), Coiflet 2 (Coif2), and Symmlet 4 (Sym4). This shows that the proposed approach makes it possible to work with different types of wavelets, with different supports, differentiable or not, and orthogonal or non-orthogonal. The wavelet networks (wavenet I and wavenet II) were assessed with ten neurons and ten iterations. These wavelet networks were also assessed with larger numbers of neurons; however, similar results were obtained (see e.g. [16]).

The Hénon and logistic functions were sampled every 1 time step, while the Lorenz equation was sampled every 0.01 time steps and sub-sampled every 30 steps. The Rössler attractor was sampled every 0.01 time steps and sub-sampled every 150 steps. The Mackey-Glass equation was sampled every 1 time step and sub-sampled every 10 steps, with the delay T set to 18. For appropriate learning, and to take into account the features of the activation functions used by the networks analysed, the series were normalised to the range [−1, 1] in both the time and amplitude domains.
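As an illustration of this preprocessing, the sketch below generates the Lorenz series of Eq. (26) with the sampling and sub-sampling just described and normalises it to [−1, 1]; the Euler integrator, the initial condition, and the use of the x component only are assumptions of the example.

```python
import numpy as np

def lorenz_series(n_samples, dt=0.01, subsample=30, a=10.0, b=28.0, c=8.0 / 3.0):
    # Integrate Eq. (26) with a simple Euler scheme and keep every `subsample`-th x value.
    x, y, z = 1.0, 1.0, 1.0          # illustrative initial condition
    samples = []
    for _ in range(n_samples * subsample):
        dx = a * (y - x)
        dy = x * (b - z) - y
        dz = x * y - c * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        samples.append(x)
    return np.array(samples[::subsample])

def normalise(s):
    # Scale the amplitudes to [-1, 1]; the time axis is rescaled in the same way.
    return 2.0 * (s - s.min()) / (s.max() - s.min()) - 1.0

series = normalise(lorenz_series(100))
time_axis = np.linspace(-1.0, 1.0, series.size)   # normalised time domain
print(series[:5])
```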

Table 2 Simulation results (MSE).

Time Series         Approach     Wavelet   25 samples   50 samples   100 samples
Hénon Attractor     wavenet I    Gauss1    0.1789       0.1140       0.2621
                    wavenet II   Gauss1    0.0621       0.1859       0.2248
                    wavenet II   Gauss2    0.0878       0.1642       0.2692
                    wavenet II   Db2       0.1009       0.1651       0.2696
                    wavenet II   Coif2     0.0977       0.1676       0.2790
                    wavenet II   Sym4      0.0975       0.1666       0.2777
Lorenz Attractor    wavenet I    Gauss1    0.0319       0.0710       0.0935
                    wavenet II   Gauss1    0.0145       0.0672       0.1116
                    wavenet II   Gauss2    0.0217       0.0733       0.1073
                    wavenet II   Db2       0.0241       0.0660       0.0983
                    wavenet II   Coif2     0.0210       0.0788       0.0900
                    wavenet II   Sym4      0.0198       0.0843       0.0954
Mackey-Glass        wavenet I    Gauss1    0.0315       0.0787       0.1922
                    wavenet II   Gauss1    0.0172       0.1809       0.2173
                    wavenet II   Gauss2    0.0177       0.1058       0.2278
                    wavenet II   Db2       0.0119       0.1101       0.2107
                    wavenet II   Coif2     0.0154       0.1290       0.2052
                    wavenet II   Sym4      0.0168       0.0973       0.2046
Logistic Map        wavenet I    Gauss1    0.1136       0.1411       0.3188
                    wavenet II   Gauss1    0.0669       0.2533       0.3445
                    wavenet II   Gauss2    0.0683       0.2377       0.3333
                    wavenet II   Db2       0.0734       0.2686       0.3503
                    wavenet II   Coif2     0.0156       0.2695       0.3715
                    wavenet II   Sym4      0.0217       0.2521       0.3527
Rössler Attractor   wavenet I    Gauss1    0.0090       0.0763       0.2102
                    wavenet II   Gauss1    0.0486       0.1974       0.1111
                    wavenet II   Gauss2    0.0380       0.1935       0.1419
                    wavenet II   Db2       0.0283       0.1980       0.2114
                    wavenet II   Coif2     0.0234       0.2155       0.2150
                    wavenet II   Sym4      0.0174       0.1762       0.2260

As can be seen in Table 2 and Figure 5, wavenet II outperforms wavenet I in terms of MSE (mean square error) for the case of 25 samples on the Hénon and Lorenz attractor chaotic time series. This is due to the fact that the correlation-based selection of wavelets in a dense grid allows the network to be initialised with wavelets that are more correlated with the input signal. For the case of 50 samples, wavenet I outperforms wavenet II for the Hénon attractor and Mackey-Glass time series, and for the case of 100 samples wavenet II shows better performance than wavenet I when using the Coiflet wavelet on the Lorenz attractor time series. In summary, we can say that both approaches show broadly similar performance for the approximation of chaotic time series. Note that in some cases wavenet I shows better performance (Mackey-Glass, Rössler and Logistic Map), while wavenet II is better in others (Hénon and Lorenz). Nevertheless, the initialisation and training procedures introduced in this paper show the flexibility of the network in choosing different types of wavelets.

It is well known from wavelet theory that if there is a strong similarity between the wavelet used for the analysis and the signal to be analysed, then a better analysis is performed by the wavelet decomposition. It is important to underline that this can be realised with a minimum of wavelets. In the particular case of the signals and wavelets used in this work, we use the translation and dilation properties to generate different versions of the wavelet and then search for the highest similarity between them and the signal, working with the corresponding wavelet coefficients. Note that the proposed training algorithm takes into consideration the rule of wavelet selection described previously, and then modifies only one dilation or translation parameter of the network while searching for the minimum of the error function. Furthermore, this modification has the purpose of increasing the similarity between the signal and the given most contributive wavelet. This criterion guarantees the convergence of the learning process.

6. CONCLUSIONS

In this paper we have presented initialisation and training procedures for wavelet networks. The results reported in this paper show clearly that the proposed wavelet network has better approximation properties than similar typical wavelet networks. The reason for this is the firm theoretical link between multi-resolution analysis and correlation theory. The understanding of this interrelation can be used to form a basis which has the capability to explicitly represent the behaviour of a function at different resolutions of the input variables with a minimum of wavelets.


An additional advantage of the initialisation and training procedures introduced in this paper is the flexibility of the network to use different types of wavelets. Complex wavelets that include magnitude and phase information may also be used as neurons in the hidden layer of wavelet networks [25]. From a general point of view, wavelet networks can be used for black-box identification of general non-linear systems; they were inspired by both neural networks and wavelet decomposition, and the basic idea is to replace the neurons by more powerful computing units obtained by cascading affine transforms. However, as presented in this work, wavelet networks can improve their well-known properties by taking into consideration the important role played by correlation theory in wavelet analysis. Finally, it is worth noting that, for a comparable number of neurons, the complexity of the input/output mapping realised by the wavelet network reported in this paper is higher than that realised by typical wavelet networks.

Acknowledgements

This work has been partially supported by CONACYT-FOMIX, contract no. 109417.

REFERENCES

1. Q. Zhang, A. Benveniste, Wavelet Networks, IEEE Transactions on Neural Networks, Vol. 3, No. 6, July 1992.
2. Q. Zhang, Wavelet Network: The Radial Structure and an Efficient Initialisation Procedure, In European Control Conference (ECC 93), Groningen, The Netherlands, 1993.
3. S. Sitharama Iyengar, E. C. Cho, Vir V. Phoha, Foundations of Wavelet Networks and Applications, Chapman & Hall/CRC, U.S.A., 2002.
4. S. G. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way, Third Edition, Academic Press, 2008.
5. V. Alarcon-Aquino, E. S. García Treviño, R. Rosas-Romero, J. F. Ramírez-Cruz, L. G. Guerrero-Ojeda, and J. Rodriguez-Asomoza, Wavelet-Network Based on the L1-norm Minimisation for Learning Chaotic Time Series, Journal of Applied Research and Technology, Vol. 3, No. 3, December 2005.
6. S. H. Ling, H. H. C. Iu, F. H. F. Leung, and K. Y. Chan, Improved Hybrid Particle Swarm Optimized Wavelet Neural Network for Modeling the Development of Fluid Dispensing for Electronic Packing, IEEE Transactions on Industrial Electronics, Vol. 55, No. 9, September 2008.
7. I. Daubechies, Ten Lectures on Wavelets, SIAM, New York, 1992.
8. S. H. Strogatz, Nonlinear Dynamics and Chaos, Addison Wesley Publishing Company, USA, 1994.
9. E. A. Rying, Griff L. Bilbro, and Jye-Chyi Lu, Focused Local Learning With Wavelet Neural Networks, IEEE Transactions on Neural Networks, Vol. 13, No. 2, March 2002.
10. X. Gao, F. Xiao, J. Zhang, and C. Cao, Short-term Prediction of Chaotic Time Series by Wavelet Networks, WCICA 2004, Fifth World Congress on Intelligent Control and Automation, 2004.
11. L. Deqiang, S. Zelin, and H. Shabai, A Wavelet Network Based Classifier, In Proceedings of the 7th IEEE International Conference on Signal Processing (ICSP 04), September 2004.
12. M. Yeginer, Y. P. Kahya, Modeling of Pulmonary Crackles Using Wavelet Networks, In Proceedings of the 27th IEEE Engineering in Medicine and Biology Conference, Shanghai, China, September 2005.
13. C. M. Chang and T. S. Liu, A Wavelet Network Control Method for Disk Drives, IEEE Transactions on Control Systems Technology, Vol. 14, No. 1, January 2006.
14. R. L. Devaney, Chaotic Explosions in Simple Dynamical Systems, in The Ubiquity of Chaos, edited by Saul Krasner, American Association for the Advancement of Science, Washington DC, U.S.A., 1990.
15. J. Pritchard, The Chaos CookBook: A Practical Programming Guide, Reed International Books, Oxford, Great Britain, 1992.
16. E. S. Garcia-Trevino, V. Alarcon-Aquino, and J. F. Ramirez-Cruz, Improving Wavelet-Networks Performance with a New Correlation-based Initialisation Method and Training Algorithm, In Proceedings of the 15th IEEE International Conference on Computing (CIC '06), November 2006.
17. E. S. Garcia-Trevino, V. Alarcon-Aquino, Single-Step Prediction of Chaotic Time Series Using Wavelet-Networks, In Proceedings of the IEEE Electronics, Robotics and Automotive Mechanics Conference (CERMA 06), September 2006.
18. D. K. Lindner, Introduction to Signals and Systems, McGraw-Hill International Edition, Electrical Engineering Series, USA, 1999.
19. A. V. Oppenheim, A. S. Willsky, Signals and Systems, 2nd Edition, Prentice Hall, 1996.
20. V. Alarcon-Aquino, J. A. Barria, Multiresolution FIR Neural-Network-Based Learning Algorithm Applied to Network Traffic Prediction, IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, Vol. 36, No. 2, March 2006.
21. J.-X. Xu, Y. Tan, Nonlinear Adaptive Wavelet Control Using Constructive Wavelet Networks, IEEE Transactions on Neural Networks, Vol. 18, No. 1, January 2007.
22. Y. Chen, P. B. Luh, C. Guan, Y. Zhao, L. D. Michel, M. A. Coolbeth, P. B. Friedland, and S. J. Rourke, Short-Term Load Forecasting: Similar Day-Based Wavelet Neural Networks, IEEE Transactions on Power Systems, Vol. 25, No. 1, February 2010.
23. Cheng-Jian Lin, Nonlinear Systems Control Using Self-Constructing Wavelet Networks, Applied Soft Computing, Vol. 9, No. 1, January 2009.
24. R. N. Mahanty and P. B. Dutta Gupta, ANN Based Fault Classifier with Wavelet MRA Generated Inputs, International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications, Vol. 16, No. 2, June 2008.
25. V. Alarcon-Aquino, O. Starostenko, J. M. Ramirez-Cortes, R. Rosas-Romero, J. Rodriguez-Asomoza, O. J. Paz-Luna, and K. Vazquez-Muñoz, Detection of Microcalcifications in Digital Mammograms Using the Dual-Tree Complex Wavelet Transform, International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications, Vol. 17, No. 1, March 2009.
