LETTER Communicated by José Principe

Learning Chaotic Attractors by Neural Networks

Rembrandt Bakker
DelftChemTech, Delft University of Technology, 2628 BL Delft, The Netherlands

Jaap C. Schouten
Chemical Reactor Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands

C. Lee Giles
NEC Research Institute, Princeton, NJ 08540, U.S.A.

Floris Takens
Department of Mathematics, University of Groningen, 9700 AV Groningen, The Netherlands

Cor M. van den Bleek
DelftChemTech, Delft University of Technology, 2628 BL Delft, The Netherlands

An algorithm is introduced that trains a neural network to identify chaotic dynamics from a single measured time series. During training, the algorithm learns to short-term predict the time series. At the same time a criterion, developed by Diks, van Zwet, Takens, and de Goede (1996), is monitored that tests the hypothesis that the reconstructed attractors of model-generated and measured data are the same. Training is stopped when the prediction error is low and the model passes this test. Two other features of the algorithm are (1) the way the state of the system, consisting of delays from the time series, has its dimension reduced by weighted principal component analysis data reduction, and (2) the user-adjustable prediction horizon obtained by "error propagation," that is, partially propagating prediction errors to the next time step.

The algorithm is first applied to data from an experimental, driven, chaotic pendulum, of which two of the three state variables are known. This is a comprehensive example that shows how well the Diks test can distinguish between slightly different attractors. Second, the algorithm is applied to the same problem, but now one of the two known state variables is ignored. Finally, we present a model for the laser data from the Santa Fe time-series competition (set A). It is the first model for these data that is not only useful for short-term predictions but also generates time series with similar chaotic characteristics as the measured data.

Neural Computation 12, 2355–2383 (2000) © 2000 Massachusetts Institute of Technology

1 Introduction

A time series measured from a deterministic chaotic system has the appealing characteristic that its evolution is fully determined and yet its predictability is limited due to exponential growth of errors in model or measurements. A variety of data-driven analysis methods for this type of time series was collected in 1991 during the Santa Fe time-series competition (Weigend & Gershenfeld, 1994). The methods focused on either characterization or prediction of the time series. No attention was given to a third, and much more appealing, objective: given the data and the assumption that it was produced by a deterministic chaotic system, find a set of model equations that will produce a time series with identical chaotic characteristics, having the same chaotic attractor. The model could be based on first principles if the system is well understood, but here we assume knowledge of just the time series and use a neural network-based, black-box model. The choice for neural networks is inspired by the inherent stability of their iterated predictions, as opposed to, for example, polynomial models (Aguirre & Billings, 1994) and local linear models (Farmer & Sidorowich, 1987). Lapedes and Farber (1987) were among the first who tried the neural network approach. In concise neural network jargon, we formulate our goal: train a network to learn the chaotic attractor. A number of authors have addressed this issue (Aguirre & Billings, 1994; Principe, Rathie, & Kuo, 1992; Kuo & Principe, 1994; Deco & Schürmann, 1994; Rico-Martínez, Krischer, Kevrekidis, Kube, & Hudson, 1992; Krischer et al., 1993; Albano, Passamente, Hediger, & Farrell, 1992). The common approach consists of two steps:

1. Identify a model that makes accurate short-term predictions.

2. Generate a long time series with the model by iterated prediction, and compare the nonlinear-dynamic characteristics of the generated time series with the original, measured time series.

Principe et al. (1992) found that in many cases this approach fails; the model can make good short-term predictions but has not learned the chaotic attractor. The method would be greatly improved if we could minimize directly the difference between the reconstructed attractors of the model-generated and measured data rather than minimizing prediction errors. However, we cannot reconstruct the attractor without first having a prediction model. Therefore, research is focused on how to optimize both steps. We highly reduce the chance of failure by integrating step 2 into step 1, the model identification. Rather than evaluating the model attractor after training, we monitor the attractor during training, and introduce a new test developed by Diks, van Zwet, Takens, and de Goede (1996) as a stopping criterion. It tests the null hypothesis that the reconstructed attractors of model-generated and measured data are the same. The criterion directly measures the distance between two attractors, and a good estimate of its variance is available. We combined the stopping criterion with two special features that we found very useful: (1) an efficient state representation by weighted principal component analysis (PCA) and (2) a parameter estimation scheme based on a mixture of the output-error and equation-error method, previously introduced as the compromise method (Werbos, McAvoy, & Su, 1992). While Werbos et al. promoted the method to be used where equation error fails, we here use it to make the prediction horizon user adjustable. The method partially propagates errors to the next time step, controlled by a user-specified error propagation parameter.

In this article we present three successful applications of the algorithm. First, a neural network is trained on data from an experimental driven and damped pendulum. This system is known to have three state variables, of which one is measured and a second, the phase of the sinusoidal driving force, is known beforehand. This experiment is used as a comprehensive visualization of how well the Diks test can distinguish between slightly different attractors and how its performance depends on the number of data points. Second, a model is trained on the same pendulum data, but this time the information about the phase of the driving force is completely ignored. Instead, embedding with delays and PCA is used. This experiment is a practical verification of the Takens theorem (Takens, 1981), as explained in section 3. Finally, the error propagation feature of the algorithm becomes important in the modeling of the laser data (set A) from the 1991 Santa Fe time-series prediction competition. The resulting neural network model opens new possibilities for analyzing these data because it can generate time series up to any desired length and the Jacobian of the model can be computed analytically at any time, which makes it possible to compute the Lyapunov spectrum and find periodic solutions of the model.

Section 2 gives a brief description of the two data sets that are used to explain concepts of the algorithm throughout the article. In section 3, the input-output structure of the model is defined: the state is extracted from the single measured time series using the method of delays followed by weighted PCA, and predictions are made by a combined linear and neural network model. The error propagation training is outlined in section 4, and the Diks test, used to detect whether the attractor of the model has converged to the measured one, in section 5.
Section 6 contains the results of the two different pendulum models, and section 7 covers the laser data model. Section 8 concludes with remarks on attractor learning and future directions.

2 Data Sets

Two data sets are used to illustrate features of the algorithm: data from an experimental driven pendulum and far-infrared laser data from the 1991 Santa Fe time-series competition (set A).

Figure 1: Plots of the first 2000 samples of (a) driven pendulum and (b) Santa Fe laser time series.

The pendulum data are described in section 2.1. For the laser data we refer to the extensive description of Hübner, Weiss, Abraham, and Tang (1994). The first 2048 points of each set are plotted in Figures 1a and 1b.

2.1 Pendulum Data. The pendulum we use is a type EM-50 pendulum produced by Daedalon Corporation (see Blackburn, Vik, & Binruo, 1989, for details and Figure 2 for a schematic drawing).

Figure 2: Schematic drawing of the experimental pendulum and its driving force. The pendulum arm can rotate around its axis; the angle h is measured. During the measurements, the frequency of the driving torque was 0.85 Hz.

Ideally, the pendulum would obey the following equations of motion, in dimensionless form,

d/dt (h, v, w)^T = (v, −c v − sin h + a sin w, v_D)^T,   (2.1)

where h is the angle of the pendulum, v its angular velocity, c is a damping constant, and (a, v_D, w) are the amplitude, frequency, and phase, respectively, of the harmonic driving torque. As observed by De Korte, Schouten, and van den Bleek (1995), the real pendulum deviates from the ideal behavior of equation 2.1 because of its four electromagnetic driving coils that repulse the pendulum at certain positions (top, bottom, left, right). However, from a comparison of Poincaré plots of the ideal and real pendulum by Bakker, de Korte, Schouten, Takens, and van den Bleek (1996), it appears that the real pendulum can be described by the same three state variables (h, v, w) as the equations 2.1. In our setup the angle h of the pendulum is measured at an accuracy of about 0.1 degree. It is not a well-behaved variable, for it is an angle defined only in the interval [0, 2π] and it can jump from 2π to 0 in a single time step. Therefore, we construct an estimate of the angular velocity v by differencing the angle time series and removing the discontinuities. We call this estimate Ω.
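To make the role of the three state variables concrete, the sketch below integrates the ideal pendulum of equation 2.1 with a standard fourth-order Runge-Kutta scheme. This is only an illustration in Python; the damping, amplitude, and driving frequency used here are assumed values, not the settings of the experimental EM-50 setup.

```python
# A minimal sketch (not the authors' code): the ideal pendulum of equation 2.1
# integrated with a fixed-step fourth-order Runge-Kutta scheme. The parameter
# values below are illustrative assumptions, not the experimental settings.
import numpy as np

def pendulum_rhs(state, c=0.5, a=1.2, v_drive=1.0):
    """Right-hand side of equation 2.1 for state = (h, v, w)."""
    h, v, w = state
    return np.array([v, -c * v - np.sin(h) + a * np.sin(w), v_drive])

def rk4_step(state, dt, rhs):
    """One fourth-order Runge-Kutta step."""
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * dt * k1)
    k3 = rhs(state + 0.5 * dt * k2)
    k4 = rhs(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

state = np.array([0.1, 0.0, 0.0])    # initial (h, v, w)
dt = 0.01
angles = []
for _ in range(20000):
    state = rk4_step(state, dt, pendulum_rhs)
    angles.append(state[0])          # only the angle h is "measured"
```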

3 Model Structure

The experimental pendulum and FIR laser are examples of autonomous, deterministic, and stationary systems. In discrete time, the evolution equation for such systems is

x_{n+1} = F(x_n),   (3.1)

where F is a nonlinear function, x_n is the state of the system at time t_n = t_0 + nτ, and τ is the sampling time. In most real applications, only a single variable is measured, even if the evolution of the system depends on several variables. As Grassberger, Schreiber, and Schaffrath (1991) pointed out, it is possible to replace the vector of "true" state variables by a vector that contains delays of the single measured variable. The state vector becomes a delay vector x_n = (y_{n−m+1}, ..., y_n), where y is the single measured variable and m is the number of delays, called embedding dimension. Takens (1981) showed that this method of delays will lead to evolutions of the type of equation 3.1 if we choose m ≥ 2D + 1, where D is the dimension of the system's attractor. We can now use this result in two ways:

1. Choose an embedding a priori by fixing the delay time τ and the number of delays. This results in a nonlinear auto-regressive (NAR) model structure, in our context referred to as a time-delay neural network.

2. Do not fix the embedding, but create a (neural network) model structure with recurrent connections that the network can use to learn an appropriate state representation during training.

Although the second option employs a more flexible model representation (see, for example, the FIR network model used by Wan, 1994, or the partially recurrent network of Bulsari and Saxén, 1995), we prefer to fix the state beforehand since we do not want to mix learning of two very different things: an appropriate state representation and the approximation of F in equation 3.1. Bengio, Simard, and Frasconi (1994) showed that this mix makes it very difficult to learn long-term dependencies, while Lin, Horne, Tino, and Giles (1996) showed that this can be solved by using the NAR(X) model structure. Finally, Siegelmann, Horne, and Giles (1997) showed that the NAR(X) model structure may be not as compact but is computationally as strong as a fully connected recurrent neural network.
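The delay-vector construction described above is simple to state in code. The following sketch (a minimal illustration, not the authors' implementation) builds the NAR input-output pairs x_n = (y_{n−m+1}, ..., y_n) with targets y_{n+1} from a single measured series; the array y used in the example is a placeholder.

```python
# A minimal sketch: build NAR input-output pairs from a single measured series.
# The series y below is a placeholder; m and tau follow the notation above.
import numpy as np

def delay_embedding(y, m, tau=1):
    """Return (X, targets) with X[k] = (y[k], y[k+tau], ..., y[k+(m-1)*tau])
    and targets[k] = y[k+(m-1)*tau+1], the one-step-ahead value."""
    n_vectors = len(y) - (m - 1) * tau - 1
    X = np.array([y[k : k + (m - 1) * tau + 1 : tau] for k in range(n_vectors)])
    targets = y[(m - 1) * tau + 1 : (m - 1) * tau + 1 + n_vectors]
    return X, targets

y = np.sin(0.3 * np.arange(1000))             # placeholder series
X, t = delay_embedding(y, m=25, tau=1)        # 25 delays, unit delay time
```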

3.1 Choice of Embedding. The choice of the delay time τ and the embedding m is highly empirical. First we fix the time window, with length T = τ(m − 1) + 1. For a proper choice of T, we consider that it is too short if the variation within the delay vector is dominated by noise, and it is longer than necessary if it exceeds the maximum prediction horizon of the system; that is, if the first and last part of the delay vector have no correlation at all. Next we choose the delay time τ. Figure 3 depicts two possible choices of τ for a fixed window length. It is important to realize that τ equals the one-step-ahead prediction horizon of our model. The results of Rico-Martínez et al. (1992) show that if τ is chosen large (such that successive delays show little linear correlation), the long-term solutions of the model may be apparently different from the real system due to the very discrete nature of the model. A second argument to choose τ small is that the further we predict ahead in the future, the more nonlinear will be the mapping F in equation 3.1. A small delay time will require the lowest functional complexity. Choosing τ (very) small introduces three potential problems:

1. Since we fixed the time window, the delay vector becomes very long, and subsequent delays will be highly correlated. This we solve by reducing its dimension with PCA (see section 3.2).

2. The one-step prediction will be highly correlated with the most recent delays, and a linear model will already make good one-step predictions (see section 3.4). We adopt the error-propagation learning algorithm (see section 4) to ensure learning of nonlinear correlations.

3. The time needed to compute predictions over a certain time interval is inversely proportional to the delay time. For real-time applications, too small a delay time will make the computation of predictions prohibitively time-consuming.

Figure 3: Two examples of an embedding with delays, applied to the pendulum time series of Figure 1a. Compared to (a), in (b) the number of delays within the fixed time window is large, there is much (linear) correlation between the delays, and the one-step prediction horizon is small.

The above considerations guided us for the pendulum to the embedding of Figure 3b, where τ = 1 and T = m = 25.

3.2 Principal Component Embedding. The consequence of a small delay time in a fixed time window is a large embedding dimension, m ≈ T/τ, causing the model to have an excess number of inputs that are highly correlated. Broomhead and King (1986) proposed a very convenient way to reduce the dimension of the delay vector. They called it singular system analysis, but it is better known as PCA. PCA transforms the set of delay vectors to a new set of linearly uncorrelated variables of reduced dimension (see Jackson, 1991, for details). The principal component vectors z_n are computed from the delay vectors x_n,

z_n ≈ V^T x_n,   (3.1)

where V is obtained from USV^T = X, the singular value decomposition of the matrix X that contains the original delay vectors x_n. An essential ingredient of PCA is to examine how much energy is lost for each element of z that is discarded.

Figure 4: Accuracy (RMSE, root mean square error) of the reconstruction of the original delay vectors (25 delays) from the principal component vectors (8 principal components) for the pendulum time series: (a) nonweighted PCA; (c) weighted PCA; (d) weighted PCA and use of update formula equation 3.3. The standard profile we use for weighted PCA is shown in (b).

For the pendulum, it is found that the first 8 components (out of 25) explain 99.9% of the variance of the original data. To reconstruct the original data vectors, PCA takes a weighted sum of the eigenvectors of X (the columns of V). Note that King, Jones, and Broomhead (1987) found that for large embedding dimensions, principal components are essentially Fourier coefficients and the eigenvectors are sines and cosines.

In particular for the case of time-series prediction, where recent delays are more important than older delays, it is important to know how well each individual component of the delay vector x_n can be reconstructed from z_n. Figure 4a shows the average reconstruction error (RMSE, root mean squared error) for each individual delay, for the pendulum example where we use 8 principal components. It appears that the error varies with the delay, and, in particular, the most recent and the oldest measurement are less accurately represented than the others. This is not a surprise since the first and last delay have correlated neighbors only on one side. Ideally, one would like to put some more weight on recent measurements than older ones. Grassberger et al. (1991) remarked that this seems not easy when using PCA. Fortunately the contrary is true, since we can do a weighted PCA and set the desired accuracy of each individual delay: each delay is multiplied by a weight constant. The larger this constant is, the more the delay will contribute to the total variance of the data, and the more accurately this delay will need to be represented. For the pendulum we tried the weight profile of Figure 4b. It puts a double weight on the most recent value, to compensate for the neighbor effect. The reconstruction result is shown in Figure 4c. As expected, the reconstruction of recent delays has become more accurate.
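A weighted PCA of the kind described above can be sketched as follows. This is a reconstruction under the stated definitions, not the authors' code: each delay is multiplied by its weight before the singular value decomposition, and the per-delay reconstruction error of Figure 4 is then evaluated. The weight profile in the example (a double weight on the most recent delay) is the assumption suggested by Figure 4b.

```python
# A minimal sketch of weighted PCA (a reconstruction, not the authors' code):
# each delay is scaled by a weight before the SVD, and the reconstruction RMSE
# of every (unweighted) delay is evaluated as in Figure 4.
import numpy as np

def weighted_pca(X, weights, n_components):
    """X: (N, m) delay vectors; weights: length-m profile; returns (Z, V)."""
    Xw = X * weights                           # emphasize selected delays
    U, s, Vt = np.linalg.svd(Xw, full_matrices=False)
    V = Vt.T[:, :n_components]                 # (m, n_components) eigenvectors
    Z = Xw @ V                                 # principal component vectors
    return Z, V

def per_delay_rmse(X, Z, V, weights):
    """RMSE of reconstructing each delay from the principal components."""
    X_rec = (Z @ V.T) / weights                # undo the weighting
    return np.sqrt(np.mean((X - X_rec) ** 2, axis=0))

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 25))            # placeholder delay vectors
w = np.ones(25)
w[-1] = 2.0                                    # double weight on the newest delay
Z, V = weighted_pca(X, w, n_components=8)
print(per_delay_rmse(X, Z, V, w))
```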

3.3 Updating Principal Components. In equation 3.1, the new state of the system is predicted from the previous one. But the method of delays has introduced information from the past in the state of the system. Since it is not sound to predict the past, we first predict only the new measured variable,

y_{n+1} = F'(x_n),   (3.2)

and then update the old state x_n with the prediction y_{n+1} to give x_{n+1}. For delay vectors x_n = (y_{n−m+1}, ..., y_n), this update simply involves a shift of the time window: the oldest value of y is removed and the new, predicted value comes in. In our case of principal components z_n (instead of delay vectors x_n), the update requires three steps: (1) reconstruct the delay vector from the principal components, (2) shift the time window, and (3) construct the principal components from the updated delay vector. Fortunately, these three operations can be done in a single matrix computation that we call update formula (see Bakker, de Korte, Schouten, Takens, & van den Bleek, 1997, for a derivation):

z_{n+1} = A z_n + b y_{n+1}   (3.3)

with

A_{ij} = Σ_{k=2}^{m} V^{-1}_{i,k} V_{k−1,j};   b_i = V^{-1}_{i,1}.   (3.4)

Since V is orthogonal, the inverse of V simply equals its transpose, V^{-1} = V^T. The use of the update formula, equation 3.3, introduces some extra loss of accuracy, for it computes the new state not from the original but from reconstructed variables. This has a consequence, usually small, for the average reconstruction error; compare for the pendulum example Figures 4c to 4d.

A final important issue when using principal components as neural network inputs is that the principal components are usually scaled according to the value of their eigenvalue. But the small random weights initialization for the neural networks that we will use accounts for inputs that are scaled approximately to zero mean and unit variance. This is achieved by replacing V by its scaled version W = VS and its inverse by W^{-1} = (X^T X W)^T. Note that this does not affect reconstruction errors of the PCA compression.
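A small sketch of the update formula, equations 3.3 and 3.4, is given below (again a reconstruction, not the original code). It assumes the delay vector is ordered with the most recent measurement first, which is the ordering implied by b_i = V^{-1}_{i,1}.

```python
# A minimal sketch of the update formula, equations 3.3 and 3.4 (a
# reconstruction, not the original code). It assumes the delay vector is
# ordered newest-first, the ordering implied by b_i = V^{-1}_{i,1}.
import numpy as np

def update_matrices(V):
    """V: (m, p) matrix of principal directions; returns A (p, p) and b (p,)."""
    m, p = V.shape
    V_inv = V.T                       # orthonormal columns: V^T acts as the inverse map
    A = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            # A_ij = sum_{k=2}^{m} V^{-1}_{i,k} V_{k-1,j} (1-based, equation 3.4)
            A[i, j] = sum(V_inv[i, k] * V[k - 1, j] for k in range(1, m))
    b = V_inv[:, 0]                   # b_i = V^{-1}_{i,1}
    return A, b

def update_state(z, y_new, A, b):
    """Equation 3.3: shift the time window and insert the new measurement."""
    return A @ z + b * y_new
```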

Figure 5: Structure of the prediction model. The state is updated with a new value of the measured variable (update unit), then fed into the linear model and an MLP neural network to predict the next value of the measured variable.

3.4 Prediction Model. After defining the state of our experimental system, we need to choose a representation of the nonlinear function F in equation 3.1. Because we prefer to use a short delay time τ, the one-step-ahead prediction will be highly correlated with the most recent measurement, and a linear model may already make accurate one-step-ahead predictions. A chaotic system is, however, inherently nonlinear and therefore we add a standard multilayer perceptron (MLP) to learn the nonlinearities. There are two reasons for using both a linear and an MLP model, even though the MLP itself can represent a linear model: (1) the linear model parameters can be estimated quickly beforehand with linear least squares, and (2) it is instructive to see at an early stage of training if the neural network is learning more than linear correlations only. Figure 5 shows a schematic representation of the prediction model, with a neural network unit, a linear model unit, and a principal component update unit.
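The combined linear-plus-MLP structure of Figure 5 can be illustrated with the following sketch. The layer sizes and the toy NumPy forward pass are assumptions for illustration; the authors trained the MLP with conjugate gradient minimization (section 4.2), which is not reproduced here.

```python
# A minimal sketch of the structure of Figure 5 (illustrative only): a linear
# model estimated first by least squares, plus a small MLP for the nonlinear
# part. Layer sizes and the forward pass are assumptions, not the authors' code.
import numpy as np

class LinearPlusMLP:
    def __init__(self, n_in, n_hidden=24, seed=0):
        rng = np.random.default_rng(seed)
        self.c = np.zeros(n_in)                          # linear model weights
        self.W1 = 0.1 * rng.standard_normal((n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = 0.1 * rng.standard_normal(n_hidden)
        self.b2 = 0.0

    def fit_linear(self, Z, y):
        """Estimate the linear part first with linear least squares."""
        self.c, *_ = np.linalg.lstsq(Z, y, rcond=None)

    def predict(self, z):
        """One-step-ahead prediction: linear part plus MLP correction."""
        hidden = np.tanh(self.W1 @ z + self.b1)
        return self.c @ z + self.W2 @ hidden + self.b2
```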

4 Training Algorithm

After setting up the structure of the model, we need to find a set of parameters for the model such that it will have an attractor as close as possible to that of the real system. In neural network terms, train the model to learn the attractor. Recall that we would like to minimize directly the discrepancy between model and system attractor, but stick to the practical training objective: minimize short-term prediction errors. The question arises whether this is sufficient: will the prediction model also have approximately the same attractor as the real system? Principe et al. (1992) showed in an experimental study that this is not always the case, and some years later Kuo and Principe (1994) and also Deco and Schürmann (1994) proposed, as a remedy, to minimize multistep rather than just one-step-ahead prediction errors. (See Haykin & Li, 1995, and Haykin & Principe, 1998, for an overview of these methods.) This approach still minimizes prediction errors (and can still fail), but it leaves more room for optimization as it makes the prediction horizon user adjustable. In this study we introduce, in section 4.1, the compromise method (Werbos et al., 1992) as a computationally very efficient alternative for extending the prediction horizon. In section 5, we assess how best to detect differences between chaotic attractors. We will stop training when the prediction error is small and no differences between attractors can be detected.

4.1 Error Propagation. To train a prediction model, we minimize the squared difference between the predicted and measured future values. The prediction is computed from the principal component state z_n by

ŷ_{n+1} = F'(z_n).   (4.1)

In section 3.1 we recommended choosing the delay time τ small. As a consequence, the one-step-ahead prediction will be highly correlated with the most recent measurements and eventually may be dominated by noise. This will result in bad multistep-ahead predictions. As a solution, Su, McAvoy, and Werbos (1992) proposed to train the network with an output error method, where the state is updated with previously predicted outputs instead of measured outputs. They call this a pure robust method and applied it successfully to nonautonomous, nonchaotic systems. For autonomous systems, applying an output error method would imply that one trains the model to predict the entire time series from only a single initial condition. But chaotic systems have only a limited predictability (Bockman, 1991), and this approach must fail unless the prediction horizon is limited to just a few steps. This is the multistep-ahead learning approach of Kuo and Principe (1994), Deco and Schürmann (1994), and Jaeger and Kantz (1996): a number of initial states are selected from the measured time series, and the prediction errors over a small prediction sequence are minimized. Kuo and Principe (1994) related the length of the sequences to the largest Lyapunov exponent of the system, and Deco and Schürmann (1994) also put less weight on longer-term predictions to account for the decreased predictability. Jaeger and Kantz (1996) address the so-called error-in-variables problem: a standard least-squares fit applied to equation 4.1 introduces a bias in the estimated parameters because it does not account for the noise in the independent variables z_n.

We investigate a second possibility to make the training of models for chaotic systems more robust. It recognizes that the decreasing predictability of chaotic systems can alternatively be explained as a loss of information about the state of the system when time evolves. From this point of view, the output error method will fail to predict the entire time series from only an initial state because this single piece of information will be lost after a short prediction time, unless additional information is supplied while the prediction proceeds. Let us compare the mathematical formulation of the three methods: one-step ahead, multistep ahead, and output error with addition of information. All cases use the same prediction equation, 4.1. They differ in how they use the update formula, equation 3.3. The one-step-ahead approach updates each time with a new, measured value,

z^r_{n+1} = A z^r_n + b y_{n+1}.   (4.2)

The state is labeled z^r because it will be used as a reference state later in equation 4.9. Trajectory learning, or output error learning with limited prediction horizon, uses the predicted value

z_{n+1} = A z_n + b ŷ_{n+1}.   (4.3)

For the third method, the addition of information must somehow find its source in the measured time series. We propose two ways to update the state with a mixture of the predicted and measured values. The first one is

z_{n+1} = A z_n + b[(1 − g) y_{n+1} + g ŷ_{n+1}],   (4.4)

or, equivalently,

z_{n+1} = A z_n + b[y_{n+1} + g e_n],   (4.5)

where the prediction error e_n is defined as e_n ≡ ŷ_{n+1} − y_{n+1}. Equation 4.5 and the illustrations in Figure 6 make clear why we call g the error propagation parameter: previous prediction errors are propagated to the next time step, depending on g, which we must choose between zero and one. Alternatively, we can adopt the view of Werbos et al. (1992), who presented the method as a compromise between the output-error (pure-robust) and equation-error method. Prediction errors are neither completely propagated to the next time step (output-error method) nor suppressed (equation-error method), but they are partially propagated to the next time step. For g = 0, error propagation is switched off, and we get equation 4.2. For g = 1, we have full error propagation, and we get equation 4.3, which will work only if the prediction time is small. In practice, we use g as a user-adjustable parameter intended to optimize model performance by varying the effective prediction horizon. For example, for the laser data model in section 7, we did five neural network training sessions, each with a different value of g, ranging from 0 to 0.9.

Figure 6: Three ways to make predictions: (a) one-step-ahead, (b) multistep-ahead, and (c) compromise (partially keeping the previous error in the state of the model).

In the appendix, we analyze error propagation training applied to a simple problem: learning the chaotic tent map. There it is found that measurement errors will be suppressed if a good value for g is chosen, but that too high a value will result in a severe bias in the estimated parameter. Multistep-ahead learning has a similar disadvantage; the length of the prediction sequence should not be chosen too high.

The error propagation equation, 4.5, has a problem when applied to the linear model, which we often add before even starting to train the MLP (see section 3.4). If we replace the prediction model, equation 4.1, by a linear model,

ŷ_{n+1} = C z_n,   (4.6)

and substitute this in equation 4.4, we get

z_{n+1} = A z_n + b[(1 − g) y_{n+1} + g C z_n]   (4.7)

or

z_{n+1} = (A + g b C) z_n + b (1 − g) y_{n+1}.   (4.8)

This is a linear evolution of z with y as an exogenous input. It is BIBO (bounded input, bounded output) stable if the largest eigenvalue of the matrix A + g b C is smaller than 1. This eigenvalue is nonlinearly dependent on g, and we found many cases in practice where the evolution is stable when g is zero or one, but not for a range of in-between values, thus making error propagation training impossible for these values of g. A good remedy is to change the method into the following, improved error propagation formula,

z_{n+1} = A[z^r_n + g(z_n − z^r_n)] + b[y_{n+1} + g e_n],   (4.9)

where the reference state z^r_n is the state obtained without error propagation, as defined in equation 4.2. Now stability is assessed by taking the eigenvalues of g(A + bC), which are g times smaller than the eigenvalues of (A + bC), and therefore error propagation cannot disturb the stability of the system in the case of a linear model. We prefer equation 4.9 over equation 4.5, even if no linear model is added to the MLP.
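The sketch below illustrates how the reference state of equation 4.2 and the improved error propagation update of equation 4.9 run side by side while prediction errors are collected. It is a minimal reconstruction: the predict callable stands for the linear/MLP model of section 3.4, and minimizing the squared errors with respect to its parameters (by BPTT, section 4.2) is not shown.

```python
# A minimal sketch of the state updates of equations 4.2 and 4.9 (a
# reconstruction, not the original code). `predict` stands for the model of
# section 3.4; the actual training step on the collected errors is omitted.
import numpy as np

def ep_errors(y, A, b, predict, g):
    """y: measured series; returns the prediction errors e_n for a given g."""
    p = len(b)
    z_ref = np.zeros(p)                     # reference state, equation 4.2
    z = np.zeros(p)                         # error propagation state
    errors = []
    for n in range(len(y) - 1):
        e = predict(z) - y[n + 1]           # e_n = y_hat_{n+1} - y_{n+1}
        errors.append(e)
        z_new = A @ (z_ref + g * (z - z_ref)) + b * (y[n + 1] + g * e)  # eq. 4.9
        z_ref = A @ z_ref + b * y[n + 1]                                # eq. 4.2
        z = z_new
    return np.array(errors)
```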

4.2 Optimization and Training. For network training we use backpropagation through time (BPTT) and conjugate gradient minimization. BPTT is needed because a nonzero error propagation parameter introduces recurrency into the network. If a linear model is used parallel to the neural network, we always optimize the linear model parameters first. This makes convergence of the MLP initially slow (the easy part has already been done) and normal later on. The size of the MLPs is determined by trial and error. Instead of using "early stopping" to prevent overfitting, we use the Diks monitoring curves (see section 5) to decide when to stop: select exactly the number of iterations that yields an acceptable Diks test value and a low prediction error on independent test data. The monitoring curves are also used to find a suitable value for the error propagation parameter; we pick the value that yields a curve with the least irregular behavior (see the examples in sections 6 and 7). Fortunately, the extra work to optimize g replaces the search for an appropriate delay time. We choose that as small as the sampling time of the measurements.

In a preliminary version of the algorithm (Bakker et al., 1997), the parameter g was taken as an additional output of the prediction model, to make it state dependent. The idea was that the training procedure would automatically choose g large in regions of the embedding space where the predictability is high, and small where predictability is low. We now think that this gives the minimization algorithm too many degrees of freedom, and we prefer simply to optimize g by hand.

5 Diks Test Monitoring

Because the error propagation training does not guarantee learning of the system attractor, we need to monitor a comparison between the model and system attractor. Several authors have used dynamic invariants such as Lyapunov exponents, entropies, and correlation dimension to compare attractors (Aguirre & Billings, 1994; Principe et al., 1992; Bakker et al., 1997; Haykin & Puthusserypady, 1997). These invariants are certainly valuable, but they do not uniquely identify the chaotic system, and their confidence bounds are hard to estimate. Diks et al. (1996) have developed a method that is unique: if the two attractors are the same, the test value will be zero for a sufficient number of samples. The method can be seen as an objective substitute for visual comparison of attractors plotted in the embedding space. The test works for low- and high-dimensional systems, and, unlike the previously mentioned invariants, its variance can be well estimated from the measured data. The following null hypothesis is tested: the two sets of delay vectors (model generated, measured) are drawn from the same multidimensional probability distribution. The two distributions, ρ'_1(r) and ρ'_2(r), are first smoothed using Gaussian kernels, resulting in ρ_1(r) and ρ_2(r), and then the distance between the smoothed distributions is defined to be the volume integral over the squared difference between the two distributions:

Q = ∫ dr (ρ_1(r) − ρ_2(r))².   (5.1)

The smoothing is needed to make Q applicable to the probability distributions encountered in the strange attractors of chaotic systems. Diks et al. (1996) provide an unbiased estimate for Q and for its variance, V_c, where subscript c means "under the condition that the null hypothesis holds." In that case, the ratio S = Q̂/√V_c is a random variable with zero mean and unit variance. Two issues need special attention: the test assumes independence among all vectors, and the Gaussian kernels use a bandwidth parameter d that determines the length scale of the smoothing. In our case, the successive states of the system will be very dependent due to the short prediction time. The simplest solution is to choose a large time interval between the delay vectors, at least several embedding times, and ignore what happens in between. Diks et al. (1996) propose a method that makes more efficient use of the available data. It divides the time series in segments of length l and uses an averaging procedure such that the assumption of independent vectors can be replaced by the assumption of independent segments. The choice of bandwidth d determines the sensitivity of the test. Diks et al. (1996) suggest using a small part of the available data to find out which value of d gives the highest test value and using this value for the final test with the complete sets of data.

In our practical implementation, we compress the delay vectors to principal component vectors, for computational efficiency, knowing that

||x_1 − x_2||² ≈ ||U(z_1 − z_2)||² = ||z_1 − z_2||²,   (5.2)

since U is orthogonal. We vary the sampling interval randomly between one and two embedding times and use l = 4. We generate 11 independent sets with the model, each with a length of two times the measured time-series length. The first set is used to find the optimum value for d, and the other 10 sets to get an estimate of S and to verify experimentally its standard deviation, which is supposed to be unity. Diks et al. (1996) suggest rejecting the null hypothesis with more than 95% confidence for S > 3. Our estimate of S is the average of 10 evaluations. Each evaluation uses new and independent model-generated data, but the same measured data (only the sampling is done again). To account for the 10 half-independent evaluations, we lower the 95% confidence threshold to S = 3/√(½ · 10) ≈ 1.4. Note that the computational complexity of the Diks test is the same as that of computing correlation entropies or dimensions, roughly O(N²), where N is the number of delay vectors in the largest of the two sets to be compared.
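For intuition, the following sketch computes a naive plug-in estimate of the smoothed-distribution distance Q of equation 5.1 between two sets of (principal component) delay vectors, using the fact that the integral of a product of two Gaussian kernels is again a Gaussian in the distance between their centers. It is not the unbiased estimator of Diks et al. (1996), and it omits the overall normalization constant and the variance estimate V_c needed for the actual test.

```python
# A naive plug-in sketch of the distance Q of equation 5.1 between two sets of
# delay (or principal component) vectors, after Gaussian-kernel smoothing with
# bandwidth d. The overall normalization constant is omitted, and this is NOT
# the unbiased estimator (or variance V_c) of Diks et al. (1996).
import numpy as np

def kernel_mean(X, Y, d):
    """Average of exp(-|x - y|^2 / (4 d^2)) over all pairs (x, y)."""
    diff = X[:, None, :] - Y[None, :, :]
    return np.mean(np.exp(-np.sum(diff ** 2, axis=-1) / (4.0 * d ** 2)))

def q_plugin(X, Y, d):
    """Plug-in estimate of the integrated squared difference between the
    smoothed distributions of the two vector sets."""
    return kernel_mean(X, X, d) + kernel_mean(Y, Y, d) - 2.0 * kernel_mean(X, Y, d)
```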

Figure 7: Structure of pendulum model I that uses all available information about the pendulum: the measured angle h, the known phase of the driving force w, the estimated angular velocity Ω, and nonlinear terms (sines and cosines) that appear in the equations of motion, equation 2.1.

6 Modeling the Experimental Pendulum

In this section, we present two models for the chaotic, driven, and damped pendulum that was introduced in section 2.1. The first model, pendulum model I, uses all possible relevant inputs, that is, the measured angle, the known phase of the driving force, and some nonlinear terms suggested by the equations of motion of the pendulum. This model does not use error propagation or principal component analysis; it is used just to demonstrate the performance of the Diks test. The second model, pendulum model II, uses a minimum of different inputs: just a single measured series of the estimated angular velocity Ω (defined in section 2.1). This is a practical verification of the Takens theorem (see section 3), since we disregard two of the three "true" state variables and use delays instead. We will reduce the dimension of the delay vector by PCA. There is no need to use error propagation since the data have a very low noise level.

6.1 Pendulum Model I. This model is the same as that of Bakker et al. (1996), except for the external input that is not present here. The input-output structure is shown in Figure 7. The network has 24 nodes in the first and 8 in the second hidden layer.

Figure 8: Development of the Poincaré plot of pendulum model I during training, to be compared with the plot of measured data (a). The actual development is not as smooth as suggested by (b–f), as can be seen from the Diks test monitoring curves in Figure 9.

The network is trained by backpropagation with conjugate gradient minimization on a time series of 16,000 samples and reaches an error level for the one-step-ahead prediction of the angle h as low as 0.27 degree on independent data. The attractor of this system is the plot of the evolutions of the state vectors (Ω, h, w)_n in a three-dimensional space. Two-dimensional projections, called Poincaré sections, are shown in Figure 8: each time a new cycle of the sinusoidal driving force starts, when the phase w is zero, a point (Ω, h)_n is plotted. Figure 8a shows measured data, while Figures 8b through 8f show how the attractor of the neural network model evolves as training proceeds. Visual inspection reveals that the Poincaré section after 8000 conjugate gradient iterations is still slightly different from the measured attractor, and it is interesting to see if we would draw the same conclusion if we use the Diks test. To enable a fair comparison with the pendulum model II results in the next section, we compute the Diks test not directly from the states (Ω, h, w)_n (that would probably be most powerful), but instead reconstruct the attractor in a delay space of the variable Ω, the only variable that will be available in pendulum model II, and use these delay vectors to compute the Diks test. To see how the sensitivity of the test depends on the length of the time series, we compute it three times, using 1, 2, and 16 data sets of 8000 points each.

Figure 9: Diks test monitoring curve for pendulum model I. The Diks test converges to zero (no difference between model and measured attractor can be found), but the spikes indicate the extreme sensitivity of the model attractor for the model parameters.

These numbers should be compared to the length of the time series used to create the Poincaré sections: 1.5 million points, of which each 32nd point (when the phase w is 0) is plotted. The Diks test results are printed in the top-right corner of each of the graphs in Figure 8. The test based on only one set of 8000 points (Diks 8K) cannot see the difference between Figures 8a and 8c, whereas the test based on all 16 sets (Diks 128K) can still distinguish between Figures 8a and 8e.

It may seem from the Poincaré sections in Figures 8b through 8f that the model attractor gradually converges toward the measured attractor when training proceeds. However, from the Diks test monitoring curve in Figure 9, we can conclude that this is not at all the case. The spikes in the curve indicate that even after the model attractor gets very close to the measured attractor, a complete mismatch may occur from one iteration to the other. We will see a similar behavior for the pendulum model II (see section 6.2) and the laser data model (see section 7), and address the issue in the discussion (see section 8).
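The Poincaré sections of Figure 8 amount to sampling the state once per driving cycle. A minimal sketch, assuming arrays omega_est, angle, and phase sampled at the measurement rate:

```python
# A minimal sketch of the Poincare section construction of Figure 8: a point
# (omega_est, angle) is kept each time a new driving cycle starts. The array
# names are assumptions for illustration.
import numpy as np

def poincare_section(omega_est, angle, phase):
    """Return the (omega, angle) points where the driving phase wraps to zero."""
    phase = np.mod(phase, 2.0 * np.pi)
    starts = np.where(np.diff(phase) < 0)[0] + 1     # wrap-around = new cycle
    return omega_est[starts], angle[starts]
```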

6.2 Pendulum Model II. This model uses information from just a single variable, the estimated angular velocity Ω. From the equations of motion, equation 2.1, it follows that the pendulum has three state variables, and the Takens theorem tells us that in the case of noise-free data, an embedding of 7 (2 · 3 + 1) delays will be sufficient. In our practical application, we take the embedding of Figure 3b with 25 delays, which are reduced to 8 principal components by weighted PCA. The network has 24 nodes in the first and 8 in the second hidden layer, and a linear model is added. First, the network is trained for 4000 iterations on a train set of 8000 points; then a second set of 8000 points is added, and training proceeds for another 2000 iterations, reaching an error level of 0.74 degree, two and a half times larger than model I. The Diks test monitoring curve (not shown) has the same irregular behavior as for pendulum model I. In between the spikes, it eventually varies between one and two. Compare this to model I after 2000 iterations, as shown in Figure 8e, which has a Diks test of 0.6: model II does not learn the attractor as well as model I. That is the price we pay for omitting a priori information about the phase of the driving force and the nonlinear terms from the equations of motion.

We also computed Lyapunov exponents, λ. For model I, we recall the estimated largest exponent of 2.5 bits per driving cycle, as found in Bakker et al. (1996). For model II, we computed a largest exponent of 2.4 bits per cycle, using the method of von Bremen et al. (1997) that uses QR-factorizations of tangent maps along model-generated trajectories. The method actually computes the entire Lyapunov spectrum, and from that we can estimate the information dimension D_1 of the system, using the Kaplan-Yorke conjecture (Kaplan & Yorke, 1979): take that (real) value of i where the interpolated curve of the cumulative sum Σ λ_i versus i crosses zero. This happens at D_1 = 3.5, with an estimated standard deviation of 0.1. For comparison, the ideal chaotic pendulum, as defined by equation 2.1, has three exponents, and as mentioned in Bakker et al. (1996), the largest is positive, the second zero, and the sum of all three is negative. This tells us that the dimension must lie between 2 and 3. This means that pendulum model II has a higher dimension than the real system. We think this is because of the high embedding space (eight principal components) that is used. Apparently the model has not learned to make the five extra, spurious, Lyapunov exponents more negative than the smallest exponent in the real system. Eckmann, Kamphorst, Ruelle, and Ciliberto (1986) report the same problem when they model chaotic data with local linear models. We also computed an estimate for the correlation dimension and entropy of the time series with the computer package RRCHAOS (Schouten & van den Bleek, 1993), which uses the maximum likelihood estimation procedures of Schouten, Takens, and van den Bleek (1994a, 1994b). For the measured data, we found a correlation dimension of 3.1 ± 0.1, and for the model-generated data 2.9 ± 0.1: there was no significant difference between the two numbers, but the numbers are very close to the theoretical maximum of three. For the correlation entropy, we found 2.4 ± 0.1 bits per cycle for measured data, and 2.2 ± 0.2 bits per cycle for the model-generated data. Again there was no significant difference and, moreover, the entropy is almost the same as the estimated largest Lyapunov exponent.

For pendulum model II we relied on the Diks test to compare attractors.
However, even if we do not have the phase of the driving force available as we did in model I, we can still make plots of cross-sections of the attractor by projecting them in a two-dimensional space. This is what we do in Figure 10: every time the first state variable upwardly crosses zero, we plot the remaining seven state variables in a two-dimensional projection.
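The Kaplan-Yorke estimate of D_1 used above can be sketched as follows (a generic implementation of the conjecture, not tied to the models in this section). It assumes the exponents are given largest first and that their total sum is negative.

```python
# A minimal sketch of the Kaplan-Yorke estimate of the information dimension
# D_1 (a generic implementation of the conjecture). It assumes the exponents
# are ordered from largest to smallest and that their total sum is negative.
import numpy as np

def kaplan_yorke_dimension(lyap):
    """Interpolated zero crossing of the cumulative sum of the exponents."""
    lyap = np.asarray(lyap, dtype=float)
    csum = np.cumsum(lyap)
    k = int(np.max(np.where(csum >= 0)[0]))      # last index with sum >= 0
    return (k + 1) + csum[k] / abs(lyap[k + 1])

print(kaplan_yorke_dimension([0.5, 0.0, -1.0]))  # example spectrum -> 2.5
```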

Figure 10: Poincaré section based on measured (a) and model-generated (b) data. A series of truncated states (z_1, z_2, z_3)^T is reconstructed from the time series; a point (z_2, z_3)^T is plotted every time z_1 crosses zero (upward).

7 Laser Data Model

The second application of the training algorithm is the well-known laser data set (set A) from the Santa Fe time series competition (Weigend & Gershenfeld, 1994). The model is similar to that reported in Bakker, Schouten, Giles, and van den Bleek (1998), but instead of using the first 3000 points for training, we use only 1000 points, exactly the amount of data that was available before the results of the time-series competition were published. We took an embedding of 40 and used PCA to reduce the 40 delays to 16 principal components. First, a linear model was fitted to make one-step-ahead predictions. This already explained 80% of the output data variance. The model was extended with an MLP, having 16 inputs (the 16 principal components), two hidden layers with 32 and 24 sigmoidal nodes, and a single output (the time-series value to be predicted). Five separate training sessions were carried out, using 0%, 60%, 70%, 80%, and 90% error propagation (the percentage corresponds to g · 100). Each session lasted 4000 conjugate gradient iterations, and the Diks test was evaluated every 20 iterations. Figure 11 shows the monitoring results. We draw three conclusions from it:

1. While the prediction error on the training set (not shown) monotonically decreases during training, the Diks test shows very irregular behavior. We saw the same irregular behavior for the two pendulum models.

2. The case of 70% error propagation converges best to a model that has learned the attractor and does not "forget the attractor" when training proceeds. The error propagation helps to suppress the noise that is present in the time series.

Figure 11: Diks test monitoring curves for the laser data model with (a) 0%, (b) 60%, (c) 70%, (d) 80%, (e) 90% error propagation. On the basis of these curves, it is decided to select the model trained with 70% error propagation and after 1300 iterations for further analysis.

3. The common stopping criterion, "stop when error on test set is at its minimum," may yield a model with an unacceptable Diks test result. Instead, we have to monitor the Diks test carefully and select exactly that model that has both a low prediction error and a good attractor, that is, a low Diks test value that does not change from one iteration to the other.

Experience has shown that the Diks test result of each training session is sensitive to the initial weights of the MLP; the curves are not well reproducible. Trial and error is required. For example, in Bakker et al. (1998), where we used more training data to model the laser, we selected 90% error propagation (EP) to be the best choice by looking at the Diks test monitoring curves.

For further analysis, we select the model trained with 70% EP after 1300 iterations, having a Diks test value averaged over 100 iterations as low as −0.2. Figure 12b shows a time series generated by this model along with measured data. For the first 1000 points, the model runs in 90% EP mode and follows the measured time series. Then the mode is switched to free run (closed-loop prediction), or 100% EP, and the model generates a time series on its own.

Figure 12: Comparison of time series (c) measured from the laser and (d) closed-loop (free run) predicted by the model. Plots (a) and (b) show the first 2000 points of (c) and (d), respectively.

In Figure 12d the closed-loop prediction is extended over a much longer time horizon. Visual inspection of the time series and the Poincaré plots of Figures 13a and 13b confirms the results of the Diks test: the measured and model-generated series look the same, and it is very surprising that the neural network learned the chaotic dynamics from 1000 points that contain only three "rise-and-collapse" cycles.

Figure 13: Poincaré plots of (a) measured and (b) model-generated laser data. A series of truncated states (z_1, z_2, z_3, z_4)^T is reconstructed from the time series; a point (z_2, z_3, z_4)^T is plotted every time z_1 crosses zero (upward).

We computed the Lyapunov spectrum of the model, using series of tangent maps along 10 different model-generated trajectories of length 6000. The three largest exponents are 0.92, 0.09, and −1.39 bits per 40 samples. The largest exponent is larger than the rough estimate by Hübner et al. (1994) of 0.7. The second exponent must be zero according to Haken (1983). To verify this, we estimate, from 10 evaluations, a standard deviation of the second exponent of 0.1, which is of the same order as its magnitude. From the Kaplan-Yorke conjecture, we estimate an information dimension D_1 of 2.7, not significantly different from the value of 2.6 that we found in another model for the same data (Bakker et al., 1998).

Finally, we compare the short-term prediction error of the model to that of the winner of the Santa Fe time-series competition (see Wan, 1994). Over the first 50 points, our normalized mean squared error (NMSE) is 0.0068 (winner: 0.0061); over the first 100 it is 0.2159 (winner: 0.0273). Given that the long-term behavior of our model closely resembles the measured behavior, we want to stress that the larger short-term prediction error does not imply that the model is "worse" than Wan's model. The system is known to be very unpredictable at the signal collapses, having a number of possible continuations. The model may follow one continuation, while a very slight deviation may cause the "true" continuation to be another.
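For completeness, a minimal sketch of the normalized mean squared error used in the comparison above; dividing the mean squared prediction error by the variance of the measured segment is the convention assumed here.

```python
# A minimal sketch of the normalized mean squared error quoted above; the
# convention assumed here divides the mean squared prediction error by the
# variance of the measured segment.
import numpy as np

def nmse(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)
```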

8 Summary and Discussion

We have presented an algorithm to train a neural network to learn chaotic dynamics, based on a single measured time series. The task of the algorithm is to create a global nonlinear model that has the same behavior as the system that produced the measured time series. The common approach consists of two steps: (1) train a model to short-term predict the time series, and (2) analyze the dynamics of the long-term (closed-loop) predictions of this model and decide to accept or reject it. We demonstrate that this approach is likely to fail, and our objective is to reduce the chance of failure greatly. We recognize the need for (1) an efficient state representation, by using weighted PCA; (2) an extended prediction horizon, by adopting the compromise method of Werbos, McAvoy, and Su (1992); and (3) a stopping criterion that selects exactly that model that makes accurate short-term predictions and matches the measured attractor. For this criterion we propose to use the Diks test (Diks et al., 1996) that compares two attractors by looking at the distribution of delay vectors in the reconstructed embedding space. We use an experimental chaotic pendulum to demonstrate features of the algorithm and then benchmark it by modeling the well-known laser data from the Santa Fe time-series competition. We succeeded in creating a model that produces a time series with the same typical irregular rise-and-collapse behavior as the measured data, learned from a 1000-sample data set that contains only three rise-and-collapse sequences.

A major difficulty with training a global nonlinear model to reproduce chaotic dynamics is that the long-term generated (closed-loop) prediction of the model is extremely sensitive to the model parameters. We show that the model attractor does not gradually converge toward the measured attractor when training proceeds. Even after the model attractor is getting close to the measured attractor, a complete mismatch may occur from one training iteration to the other, as can be seen from the spikes in the Diks test monitoring curves of all three models (see Figures 9 and 11). We think of two possible causes. First, the sensitivity of the attractor for the model parameters is a phenomenon inherent in chaotic systems (see, for example, the bifurcation plots in Aguirre & Billings, 1994, Fig. 5). Also the presence of noise in the dynamics of the experimental system may have a significant influence on the appearance of its attractor. A second explanation is that the global nonlinear model does not know that the data that it is trained on are actually on an attractor. Imagine, for example, that we have a two-dimensional state space and the training data set is moving around in this space following a circular orbit. This orbit may be the attractor of the underlying system, but it might as well be an unstable periodic orbit of a completely different underlying system. For the short-term prediction error, it does not matter if the neural network model converges to the first or the second system, but in the latter case, its attractor will be wrong! An interesting development is the work of Principe, Wang, and Motter (1998), who use self-organizing maps to make a partitioning of the input space. It is a first step toward a model that makes accurate short-term predictions and learns the outlines of the (chaotic) attractor. During the review process of this manuscript, such an approach was developed by Bakker et al. (2000).

Appendix: On the Optimum Amount of Error Propagation: Tent Map Example

When using the compromise method, we need to choose a value for the error propagation parameter such that the influence of noise on the resulting model is minimized. We analyze the case of a simple chaotic system, the tent map, with noise added to both the dynamics and the available measurements. The evolution of the tent map is

x_t = −a |x_{t−1} + u_t| + c,   (A.1)

with a = 2 and c = 1/2, and where u_t is the added dynamical noise, assumed to be a purely random process with zero mean and variance σ_u². It is convenient for the analysis to convert equation A.1 to the following notation,

x_t = a_t (x_{t−1} + u_t) + c,   a_t = { a, (x_{t−1} + u_t) < 0;  −a, (x_{t−1} + u_t) ≥ 0 }.   (A.2)

We have a set of noisy measurements available,

y_t = x_t + m_t,   (A.3)

where m_t is a purely random process, with zero mean and variance σ_m². The model we will try to fit is the following:

z_t = −b |z_{t−1}| + c,   (A.4)

where z is the (one-dimensional) state of the model and b is the only parameter to be fitted. In the alternate notation,

z_t = b_t z_{t−1} + c,   b_t = { b, z_{t−1} < 0;  −b, z_{t−1} ≥ 0 }.   (A.5)

The prediction error is defined as

e_t = z_t − y_t,   (A.6)

and during training we minimize SSE, the sum of squares of this error. If, during training, the model is run in error propagation (EP) mode, each new prediction partially depends on previous prediction errors according to

z_t = b_t (y_{t−1} + g e_{t−1}) + c,   (A.7)

where g is the error propagation parameter. We will now derive how the expected SSE depends on g, σ_u², and σ_m². First eliminate z from the prediction error by substituting equation A.7 in A.6,

e_t = b_t (y_{t−1} + g e_{t−1}) + c − y_t,   (A.8)

and use equation A.3 to eliminate y_t and A.2 to eliminate x_t:

e_t = {(b_t − a_t) x_{t−1} − a_t u_t + b_t m_{t−1} − m_t} + b_t g e_{t−1}.   (A.9)

Introduce ε_t ≡ {(b_t − a_t) x_{t−1} − a_t u_t + b_t (1 − g) m_{t−1}}, and assume e_0 = 0. Then the variance σ_e² of e_t, defined as the expectation E[e_t²], can be expanded:

σ_e² = E[−m_t + ε_t + (b_t g) ε_{t−1} + ··· + (b_2 g)^{t−1} ε_1]².   (A.10)

For the tent map, it can be verified that there is no autocorrelation in x (for typical orbits that have a unit density function on the interval [−0.5, 0.5]; see Ott, 1993), under the assumption that u is small enough not to change this property. Because m_t and u_t are purely random processes, there is no autocorrelation in m or u, and no correlation between m and u, or between x and m. Equation A.2 suggests a correlation between x and u, but this cancels out because the multiplier a_t jumps between a and −a, independent from u under the previous assumption that u is small compared to x. Therefore, the expansion A.10 becomes a sum of the squares of all its individual elements without cross products:

σ_e² = E[m_t²] + E[ε_t² + (b_t g)² ε_{t−1}² + ··· + (b_2 g)^{2(t−1)} ε_1²].   (A.11)

From the definition of a_t in equation A.2, it follows that a_t² = a². A similar argument holds for b_t, and if we assume that at each time t, both the arguments x_{t−1} + u_t in equation A.2 and y_{t−1} + g e_{t−1} in A.7 have the same sign (are in the same leg of the tent map), we can also write

(b_t − a_t)² = (b − a)².   (A.12)

Since (b_t g)² = (bg)², the following solution applies for the series A.11:

σ_e² = σ_m² + σ_ε² (1 − (bg)^{2t}) / (1 − (bg)²),   |bg| ≠ 1,   (A.13)

where σ_ε² is the variance of ε_t. Written in full,

σ_e² = σ_m² + ((b − a)² σ_x² + σ_u² + b² (1 − g)² σ_m²) (1 − (bg)^{2t}) / (1 − (bg)²),   |bg| ≠ 1,   (A.14)

and for large t,

σ_e² = σ_m² + ((b − a)² σ_x² + σ_u² + b² (1 − g)² σ_m²) / (1 − (gb)²),   |gb| ≠ 1.   (A.15)

Ideally we would like the difference between the model and real-system parameter, (b − a)² σ_x², to dominate the cost function that is minimized. We can draw three important conclusions from equation A.15:

1. The contribution of measurement noise m_t is reduced when gb approaches 1. This justifies the use of error propagation for time series contaminated by measurement noise.

2. The contribution of dynamical noise u_t is not lowered by error propagation.

3. During training, the minimization routine will keep b well below 1/g, because otherwise the multiplier 1/(1 − (gb)²) would become large and so would the cost function. Therefore, b will be biased toward zero if ga is close to unity.

The above analysis is almost the same for the trajectory learning algorithm (see section 4.1). In that case, g = 1 and equation A.14 becomes

σ_e² = σ_m² + ((b − a)² σ_x² + σ_u²) (1 − b^{2t}) / (1 − b²),   |b| ≠ 1.   (A.16)

Again, measurement noise can be suppressed by choosing t large. This leads to a bias toward zero of b because for a given t > 0, the multiplier (1 − b^{2t}) / (1 − b²) is reduced by lowering b (assuming b > 1, a necessary condition to have a chaotic system).
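Equation A.15 can be checked numerically. The sketch below (a check added for illustration, not part of the original analysis) simulates the noisy tent map, runs the model in error propagation mode with the parameter fixed at the true value b = a, and compares the empirical error variance with equation A.15 for a few values of g with |gb| < 1. The noise levels are assumed, and the simulated orbit is clipped to [−0.5, 0.5] to keep it on the attractor.

```python
# A minimal numerical check (not from the paper) of equation A.15: simulate
# the noisy tent map A.1/A.3, run the model A.4 in EP mode (equation A.7) with
# b fixed at the true value a, and compare the empirical error variance with
# the formula. Noise levels are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
a, c = 2.0, 0.5
s_u, s_m = 0.001, 0.02               # dynamical and measurement noise std
T = 100000

x = np.empty(T)
x[0] = 0.1
u = s_u * rng.standard_normal(T)
for t in range(1, T):
    # equation A.1; clipping keeps the noisy orbit on [-0.5, 0.5]
    x[t] = np.clip(-a * abs(x[t - 1] + u[t]) + c, -0.5, 0.5)
y = x + s_m * rng.standard_normal(T)                        # equation A.3

b = a                                 # model parameter fixed at the truth
for g in (0.0, 0.2, 0.4):             # keep |g*b| < 1 so A.15 applies
    e_prev, sq = 0.0, []
    for t in range(1, T):
        z = -b * abs(y[t - 1] + g * e_prev) + c             # equations A.4, A.7
        e_prev = z - y[t]                                   # equation A.6
        sq.append(e_prev ** 2)
    predicted = s_m**2 + (s_u**2 + b**2 * (1 - g)**2 * s_m**2) / (1 - (g * b)**2)
    print(f"g={g:.1f}  empirical={np.mean(sq):.2e}  equation A.15={predicted:.2e}")
```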

Acknowledgments

This work is supported by the Netherlands Foundation for Chemical Research (SON) with financial aid from the Netherlands Organization for Scientific Research (NWO). We thank Cees Diks for his help with the implementation of the test for attractor comparison. We also thank the two anonymous referees who helped to improve the article greatly.

References

Aguirre, L. A., & Billings, S. A. (1994). Validating identified nonlinear models with chaotic dynamics. Int. J. Bifurcation and Chaos, 4, 109–125.
Albano, A. M., Passamante, A., Hediger, T., & Farrell, M. E. (1992). Using neural nets to look for chaos. Physica D, 58, 1–9.
Bakker, R., de Korte, R. J., Schouten, J. C., Takens, F., & van den Bleek, C. M. (1997). Neural networks for prediction and control of chaotic fluidized bed hydrodynamics: A first step. 5, 523–530.
Bakker, R., Schouten, J. C., Takens, F., & van den Bleek, C. M. (1996). Neural network model to control an experimental chaotic pendulum. Physical Review E, 54A, 3545–3552.
Bakker, R., Schouten, J. C., Coppens, M.-O., Takens, F., Giles, C. L., & van den Bleek, C. M. (2000). Robust learning of chaotic attractors. In S. A. Solla, T. K. Leen, & K.-R. Müller (Eds.), Advances in neural information processing systems, 12 (pp. 879–885). Cambridge, MA: MIT Press.
Bakker, R., Schouten, J. C., Giles, C. L., & van den Bleek, C. M. (1998). Neural learning of chaotic dynamics: The error propagation algorithm. In Proceedings of the 1998 IEEE World Congress on Computational Intelligence (pp. 2483–2488).
Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Networks, 5, 157–166.

Blackburn, J. A., Vik, S., & Binruo, W. (1989). Driven pendulum for studying chaos. Rev. Sci. Instrum., 60, 422–426.
Bockman, S. F. (1991). Parameter estimation for chaotic systems. In Proceedings of the 1991 American Control Conference (pp. 1770–1775).
Broomhead, D. S., & King, G. P. (1986). Extracting qualitative dynamics from experimental data. Physica D, 20, 217–236.
Bulsari, A. B., & Saxén, H. (1995). A recurrent network for modeling noisy temporal sequences. Neurocomputing, 7, 29–40.
Deco, G., & Schürmann, B. (1994). Neural learning of chaotic system behavior. IEICE Trans. Fundamentals, E77A, 1840–1845.
De Korte, R. J., Schouten, J. C., & van den Bleek, C. M. (1995). Experimental control of a chaotic pendulum with unknown dynamics using delay coordinates. Physical Review E, 52A, 3358–3365.
Diks, C., van Zwet, W. R., Takens, F., & de Goede, J. (1996). Detecting differences between delay vector distributions. Physical Review E, 53, 2169–2176.
Eckmann, J.-P., Kamphorst, S. O., Ruelle, D., & Ciliberto, S. (1986). Liapunov exponents from time series. Physical Review A, 34, 4971–4979.
Farmer, J. D., & Sidorowich, J. J. (1987). Predicting chaotic time series. Phys. Rev. Letters, 59, 62–65.
Grassberger, P., Schreiber, T., & Schaffrath, C. (1991). Nonlinear time sequence analysis. Int. J. Bifurcation Chaos, 1, 521–547.
Haken, H. (1983). At least one Lyapunov exponent vanishes if the trajectory of an attractor does not contain a fixed point. Physics Letters, 94A, 71.
Haykin, S., & Li, B. (1995). Detection of signals in chaos. Proc. IEEE, 83, 95–122.
Haykin, S., & Principe, J. (1998). Making sense of a complex world. IEEE Signal Processing Magazine, pp. 66–81.
Haykin, S., & Puthusserypady, S. (1997). Chaotic dynamics of sea clutter. Chaos, 7, 777–802.
Hübner, U., Weiss, C.-O., Abraham, N. B., & Tang, D. (1994). Lorenz-like chaos

in NH3-FIR lasers (data set A). In A. S. Weigend & N. A. Gershenfeld (Eds.), Time series prediction: Forecasting the future and understanding the past. Reading, MA: Addison-Wesley.
Jackson, J. E. (1991). A user's guide to principal components. New York: Wiley.
Jaeger, L., & Kantz, H. (1996). Unbiased reconstruction of the dynamics underlying a noisy chaotic time series. Chaos, 6, 440–450.
Kaplan, J., & Yorke, J. (1979). Chaotic behavior of multidimensional difference equations. Lecture Notes in Mathematics, 730.
King, G. P., Jones, R., & Broomhead, D. S. (1987). Nuclear Physics B (Proc. Suppl.), 2, 379.
Krischer, K., Rico-Martinez, R., Kevrekidis, I. G., Rotermund, H. H., Ertl, G., & Hudson, J. L. (1993). Model identification of a spatiotemporally varying catalytic reaction. AIChE Journal, 39, 89–98.
Kuo, J. M., & Principe, J. C. (1994). Reconstructed dynamics and chaotic signal modeling. In Proc. IEEE Int'l Conf. Neural Networks, 5, 3131–3136.
Lapedes, A., & Farber, R. (1987). Nonlinear signal processing using neural networks: Prediction and system modelling (Tech. Rep. No. LA-UR-87-2662). Los Alamos, NM: Los Alamos National Laboratory.

Lin, T., Horne, B. G., Tino, P., & Giles, C. L. (1996). Learning long-term dependencies in NARX recurrent neural networks. IEEE Trans. on Neural Networks, 7, 1329.
Ott, E. (1993). Chaos in dynamical systems. New York: Cambridge University Press.
Principe, J. C., Rathie, A., & Kuo, J. M. (1992). Prediction of chaotic time series with neural networks and the issue of dynamic modeling. Int. J. Bifurcation and Chaos, 2, 989–996.
Principe, J. C., Wang, L., & Motter, M. A. (1998). Local dynamic modeling with self-organizing maps and applications to nonlinear system identification and control. Proc. of the IEEE, 6, 2240–2257.
Rico-Martínez, R., Krischer, K., Kevrekidis, I. G., Kube, M. C., & Hudson, J. L. (1992). Discrete- vs. continuous-time nonlinear signal processing of Cu electrodissolution data. Chem. Eng. Comm., 118, 25–48.
Schouten, J. C., & van den Bleek, C. M. (1993). RRCHAOS: A menu-driven software package for chaotic time series analysis. Delft, The Netherlands: Reactor Research Foundation.
Schouten, J. C., Takens, F., & van den Bleek, C. M. (1994a). Maximum-likelihood estimation of the entropy of an attractor. Physical Review E, 49, 126–129.
Schouten, J. C., Takens, F., & van den Bleek, C. M. (1994b). Estimation of the dimension of a noisy attractor. Physical Review E, 50, 1851–1861.
Siegelmann, H. T., Horne, B. G., & Giles, C. L. (1997). Computational capabilities of recurrent NARX neural networks. IEEE Trans. on Systems, Man and Cybernetics, Part B: Cybernetics, 27, 208.
Su, H.-T., McAvoy, T., & Werbos, P. (1992). Long-term predictions of chemical processes using recurrent neural networks: A parallel training approach. Ind. Eng. Chem. Res., 31, 1338–1352.
Takens, F. (1981). Detecting strange attractors in turbulence. Lecture Notes in Mathematics, 898, 365–381.
von Bremen, H. F., Udwadia, F. E., & Proskurowski, W. (1997). An efficient QR based method for the computation of Lyapunov exponents. Physica D, 101, 1–16.
Wan, E. A. (1994). Time series prediction by using a connectionist network with internal delay lines. In A. S. Weigend & N. A. Gershenfeld (Eds.), Time series prediction: Forecasting the future and understanding the past. Reading, MA: Addison-Wesley.
Weigend, A. S., & Gershenfeld, N. A. (1994). Time series prediction: Forecasting the future and understanding the past. Reading, MA: Addison-Wesley.
Werbos, P. J., McAvoy, T., & Su, T. (1992). Neural networks, system identification, and control in the chemical process industries. In D. A. White & D. A. Sofge (Eds.), Handbook of intelligent control. New York: Van Nostrand Reinhold.

Received December 7, 1998; accepted October 13, 1999.