Machine Learning for Analysis of High-Dimensional Chaotic Spatiotemporal Dynamical Systems
MACHINE LEARNING FOR ANALYSIS OF HIGH-DIMENSIONAL CHAOTIC SPATIOTEMPORAL DYNAMICAL SYSTEMS
6th December 2018, Princeton Plasma Physics Laboratory Theory Seminar
Jaideep Pathak
Zhixin Lu, Alex Wikner, Rebeckah Fussell, Brian Hunt, Michelle Girvan, Edward Ott
University of Maryland, College Park

Given some data from a dynamical system, what can we say about the dynamical process that generated this data?

OUTLINE
• Prediction (short-term forecasting): given a limited time series of past measurements, can we predict the future state of the dynamical system, at least in the short term?
• Reconstructing the attractor (long-term dynamics): can we learn something about the long-term dynamics of the system? Specifically, can we use our setup to understand the ergodic properties (for instance, the Lyapunov exponents) of a high-dimensional dynamical system?
• Scalability (large systems): can we use our setup for very high-dimensional attractors?

INTRODUCTION
Operationally, we say a dynamical system is chaotic if, in a bounded phase space, two nearby trajectories diverge exponentially:

x_1(t) − x_2(t) = δx(t), ‖δx(t)‖ ∼ ‖δx(0)‖ exp(Λt)

In a chaotic system, a perturbation to the trajectory may be stable (contracting) in some directions and unstable (expanding) in others. This leads to the concept of a spectrum of 'Lyapunov exponents': the exponential rates of growth (or contraction) of perturbations in different directions are characterized by the corresponding Lyapunov exponents.

The Lyapunov exponents of a dynamical system are an important characteristic and provide a lot of useful information:
• Is the system chaotic?
• What is the 'complexity' of the system?
• What is the possible prediction horizon?
• What is the effective dimension of the chaotic attractor of the system (via the 'Kaplan-Yorke conjecture')?

In the past, however, it has proven difficult (and sometimes not possible) to estimate the spectrum of Lyapunov exponents of a dynamical system purely from limited time series data.

MACHINE LEARNING TECHNIQUE: RESERVOIR COMPUTING
Reservoir computing provides a way to train recurrent neural networks. It was introduced by Jaeger (2001) and Maass et al. (2002). It can be interpreted as a very high-dimensional dynamical system, called the reservoir (not to be confused with the measured dynamical system), which provides a rich repository of dynamics.

The reservoir in our setup is a network of D_r nodes. Each node i has multiple inputs and outputs and a scalar state denoted by r_i(t). The weighted connections between the nodes are represented by an adjacency matrix A, which is:
• sparse,
• randomly generated,
• fixed.

RESERVOIR NEURAL NETWORK IMPLEMENTATION
Feed data: the input u(t) is coupled to the reservoir network through a fixed, randomly generated input matrix W_in. Over the training interval −T ≤ t ≤ 0, the reservoir state evolves as

r(t + Δt) = tanh(A r(t) + W_in u(t)).

Linear fit: find the output weight matrix W_out that minimizes the loss function

ℒ(W_out) = Σ_{t=−T}^{0} ‖W_out r(t) − u(t)‖²,

so that the readout v(t) = W_out r(t) satisfies v(t) ≃ u(t). Training is a simple linear regression problem: instead of gradient descent, we have the simpler problem of a matrix inversion.

Prediction: for t > 0, the readout is fed back in place of the input, making the reservoir an autonomous dynamical system:

v(t) = W_out r(t)
r(t + Δt) = tanh(A r(t) + W_in v(t))

with v(t + Δt) ≃ u(t + Δt) as the forecast.
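As a concrete illustration, here is a minimal NumPy sketch of the training and prediction procedure described above. The sizes and hyperparameters (reservoir dimension D_r, sparsity, spectral radius, input scale, and the ridge term beta) are illustrative assumptions, not values from the talk; in particular, the loss function above has no explicit regularizer, so the ridge term is an added stabilizing assumption.

```python
# Minimal reservoir-computing sketch (training + closed-loop prediction),
# following the update and readout equations above. All sizes and
# hyperparameters here are illustrative assumptions, not the talk's values.
import numpy as np

rng = np.random.default_rng(0)

D_r, D_in = 500, 64     # reservoir size and input dimension (assumed)
rho, sigma = 1.2, 0.5   # spectral radius and input scale (assumed)
beta = 1e-6             # ridge term; an assumption, absent from the loss above

# Sparse, randomly generated, fixed adjacency matrix A, rescaled to
# spectral radius rho.
A = rng.uniform(-1, 1, (D_r, D_r)) * (rng.random((D_r, D_r)) < 0.02)
A *= rho / np.max(np.abs(np.linalg.eigvals(A)))

# Fixed, randomly generated input matrix W_in.
W_in = rng.uniform(-sigma, sigma, (D_r, D_in))

def drive(u_seq):
    """Feed the input series through r(t+dt) = tanh(A r(t) + W_in u(t))."""
    r, states = np.zeros(D_r), []
    for u in u_seq:
        r = np.tanh(A @ r + W_in @ u)
        states.append(r)
    return np.array(states)

def train(u_train):
    """Fit W_out by (ridge-regularized) linear regression: W_out r(t) ~ u(t)."""
    R = drive(u_train)          # R[k] is the state after feeding u_train[k]
    X, Y = R[:-1], u_train[1:]  # readout of R[k] should reproduce u_train[k+1]
    W_out = np.linalg.solve(X.T @ X + beta * np.eye(D_r), X.T @ Y).T
    return W_out, R[-1]         # final state starts the autonomous phase

def predict(W_out, r, n_steps):
    """Closed-loop prediction for t > 0: the readout v(t) replaces the input."""
    preds = []
    for _ in range(n_steps):
        v = W_out @ r
        preds.append(v)
        r = np.tanh(A @ r + W_in @ v)
    return np.array(preds)
```

With u_train taken from a measured (or simulated) trajectory, such as the Kuramoto-Sivashinsky data introduced next, `predict` generates the autonomous forecast that is compared against the true state below.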
THE KURAMOTO-SIVASHINSKY (KS) SYSTEM
A nonlinear, spatiotemporally chaotic PDE:

y_t = −y y_x − y_xx − y_xxxx, x ∈ [0, L), y(x + L) = y(x)

[Figure: space-time plot of a chaotic KS solution y(x, t) with L = 100.]
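For completeness, here is a sketch of how KS training data of this kind could be generated. It is a first-order exponential time-stepper in Fourier space, written for brevity; it is not the integrator used in the talk (a higher-order scheme such as ETDRK4 would be typical), and the grid size, time step, and initial condition are assumptions.

```python
# Pseudo-spectral sketch of the KS equation y_t = -y y_x - y_xx - y_xxxx
# on [0, L) with periodic boundaries. First-order exponential time stepping,
# chosen for clarity; a production run would use a higher-order scheme.
import numpy as np

def ks_trajectory(L=100.0, N=256, dt=0.02, n_steps=50000, seed=0):
    rng = np.random.default_rng(seed)
    x = L * np.arange(N) / N
    k = 2 * np.pi * np.fft.rfftfreq(N, d=L / N)    # angular wavenumbers
    lin = k**2 - k**4                              # -y_xx - y_xxxx in Fourier space
    E = np.exp(dt * lin)
    # (e^{lin dt} - 1) / lin, taking the limit dt at the k = 0 mode
    phi = np.where(lin == 0, dt, (E - 1) / np.where(lin == 0, 1.0, lin))

    yh = np.fft.rfft(0.1 * rng.standard_normal(N))  # small random initial condition
    traj = np.empty((n_steps, N))
    for i in range(n_steps):
        y = np.fft.irfft(yh, n=N)
        nonlin = -0.5j * k * np.fft.rfft(y**2)      # -y y_x = -(y^2 / 2)_x
        yh = E * yh + phi * nonlin                  # exponential Euler step
        traj[i] = y
    return x, traj
```

In practice one would discard an initial transient so that the stored trajectory samples the chaotic attractor before using it as training data.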
SHORT-TERM FORECASTING OF CHAOS: KURAMOTO-SIVASHINSKY EQUATION
[Figure: true state, reservoir prediction, and their difference for y(x, t), L = 60.]
We obtain good prediction quality for about 5 multiples of the Lyapunov time. The reservoir computer is very good at learning the dynamics from data alone: it is capable of making high-quality short-term predictions even when the system dynamics is unknown. Notice also that the dynamics still looks 'KS-like' even after the prediction has diverged from the true state.

'CLIMATE' OF THE RESERVOIR DYNAMICS
Has the reservoir truly learned the dynamical behavior of the Kuramoto-Sivashinsky system? If it has, the ergodic properties of the autonomous reservoir dynamical system should resemble those of the true system. Is there a way to verify this?

LYAPUNOV EXPONENTS FROM DATA
We know the evolution equation of the post-training autonomous reservoir system:

r(t + Δt) = F[r(t)] = tanh(A r(t) + W_in W_out r(t))

From it we can calculate the evolution equation of the tangent map,

δr(t + Δt) = DF[r(t)] δr(t),

and use the two together to compute the Lyapunov exponents of the reservoir dynamical system (a sketch of this computation is given at the end of this section). Are the Lyapunov exponents of the reservoir the same as those of the data-generating system?

[Figure: Lyapunov spectrum; red: true KS system, blue: reservoir.]
The Lyapunov exponents of the KS system are indeed accurately reproduced by the reservoir. The reservoir does, however, fail to obtain the symmetry-related zero Lyapunov exponents; this aspect is discussed further in our paper.

VERY HIGH-DIMENSIONAL DYNAMICAL SYSTEMS
Can machine learning be useful for studying very high-dimensional dynamical systems? Interesting dynamical systems such as atmospheric general circulation models are very high-dimensional.
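As referenced in the Lyapunov-exponents section above, here is a minimal sketch of the standard QR (Benettin-type) procedure applied to the autonomous reservoir map and its tangent map. The variable names follow the earlier sketches; `m`, `n_steps`, and the starting state r0 are illustrative assumptions.

```python
# QR / Gram-Schmidt sketch of the Lyapunov-exponent computation for the
# autonomous reservoir map r(t+dt) = tanh((A + W_in W_out) r(t)), using the
# tangent map dr(t+dt) = DF[r(t)] dr(t) described above.
import numpy as np

def reservoir_lyapunov(A, W_in, W_out, r0, dt, m=30, n_steps=20000, seed=0):
    rng = np.random.default_rng(seed)
    M = A + W_in @ W_out                     # effective coupling of the closed loop
    r = r0.copy()
    # m random orthonormal tangent vectors
    Q, _ = np.linalg.qr(rng.standard_normal((r0.size, m)))
    log_growth = np.zeros(m)
    for _ in range(n_steps):
        r = np.tanh(M @ r)                   # autonomous reservoir step
        # DF[r(t)] = diag(1 - r(t+dt)^2) M, since d tanh(z) = (1 - tanh(z)^2) dz
        Q = (1.0 - r**2)[:, None] * (M @ Q)
        Q, R = np.linalg.qr(Q)               # re-orthonormalize the tangent vectors
        log_growth += np.log(np.abs(np.diag(R)))  # accumulate per-direction growth
    return np.sort(log_growth / (n_steps * dt))[::-1]
```

The m returned values estimate the leading Lyapunov exponents of the reservoir system, which can then be compared against the true KS spectrum as in the figure above.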