Extreme events in excitable systems

A machine learning approach to chaotic dynamics

Caroline Bükk, Oscar Johansson, Alexander Jonsson
Bachelor's thesis
Chalmers University of Technology, Department of Physics
2019-05-16

Supervisor: Evangelos Siminos, University of Gothenburg

Abstract

Recent developments in the field of dimensionality reduction have provided new tools for analysing objects that live in high-dimensional spaces but have low intrinsic dimensionality. In this thesis we have studied how different dimensionality reduction techniques in combination with Poincaré sections can be used to generate one-dimensional return maps that capture the topology of strange attractors in dissipative chaotic dynamical systems. This method was evaluated on the so called FitzHugh-Nagumo system to see if it aids in predictions of extreme events. However, as this is a new method for studying chaotic dynamics, it was first evaluated on the Rössler and Lorenz systems, which do not exhibit extreme events. For these systems two approaches were attempted. The first was to apply dimensionality reduction on the attractor and then construct a return map. The second approach was to first choose a Poincaré section for the attractor, apply dimensionality reduction and create a return map. The initial studies of the Lorenz and Rössler attractors generated reliable single-valued return maps using both approaches for most of the dimensionality reduction techniques used. However, as the FitzHugh-Nagumo system required a substantially larger data set, the first approach was inefficient. The second approach, applying manifold learning on the points on the Poincaré section, generated no significant difference from ordinary Poincaré maps. Both approaches therefore proved to be insufficient for predicting extreme events in the FitzHugh-Nagumo system. However, a new approach, that might be of interest in further studies, was proposed.

Keywords: Extreme events, manifold learning, chaotic dynamical systems, Rössler attractor, Lorenz attractor, FitzHugh-Nagumo system.

Acknowledgments

We would like to thank our supervisor, Evangelos Siminos, for taking the time to help us throughout this project and giving us great advice. It would not have been the same without him.

Contents

1 Introduction
2 Theory
 2.1 Chaotic dynamical systems
  2.1.1 Strange attractors
  2.1.2 Extreme events and excitability
  2.1.3 Poincaré return maps
 2.2 FitzHugh-Nagumo
 2.3 Dimensionality reduction
  2.3.1 Linear dimensionality reduction
   2.3.1.1 Principal component analysis
  2.3.2 Manifold learning
   2.3.2.1 Isomap
   2.3.2.2 Locally linear embedding
   2.3.2.3 Modified locally linear embedding
   2.3.2.4 Hessian eigenmapping
   2.3.2.5 Local Tangent Space Alignment
   2.3.2.6 t-distributed Stochastic Neighbour Embedding
3 Approach and implementation
 3.1 Approach
  3.1.1 Methodology
  3.1.2 Computational methods
  3.1.3 Symmetry-reduced Lorenz attractor
 3.2 Code
  3.2.1 Equation
  3.2.2 Simulator
  3.2.3 FitzSimulator
  3.2.4 PoincareMapper
  3.2.5 CurveSeparator
  3.2.6 Example of implementation
4 Results and discussion
 4.1 Rössler attractor
  4.1.1 Dimensionality reduction on the attractor
  4.1.2 Return maps
 4.2 Lorenz attractor
  4.2.1 Dimensionality reduction on the attractor
  4.2.2 Return maps
 4.3 FitzHugh-Nagumo
  4.3.1 Manifold learning on the attractor
  4.3.2 Manifold learning on the Poincaré section
  4.3.3 Return maps
  4.3.4 Extreme Events
5 Conclusions
Bibliography
A Code
 A.1 Equation
 A.2 Simulator
 A.3 PoincareMapper
 A.4 FitzSimulator
 A.5 CurveSeparator

1| Introduction

Predicting the future has always been an interesting concept, especially when it comes to events of high significance [1]. Constructing reliable forecasts of, for example, financial crashes [2][3], climate disasters [4][5][6] and earthquakes [7], is of great importance. These are examples of so called extreme events [8]: unpredictable events that far exceed the normal fluctuations of the system. In order to produce such forecasts, we generally design mathematical models based on past observations. This thesis will be focused on such extreme events in a class of dynamical systems known as chaotic dynamical systems [9].

In order to predict such rare events we generally look for small indications that an extreme event is about to occur. This will be done by studying the so called attractor, a geometric object towards which the trajectories of chaotic dynamical systems converge [10]. The aim is to identify parts of the attractor that lead to extreme events.

To do this we will apply computational methods to map the attractor in a lower-dimensional space. Some of these methods are modern techniques of non-linear dimensionality reduction known collectively as manifold learning. These are techniques commonly used in machine learning for feature extraction, which is a preparatory step that reduces redundancies in data sets [11].

Having seen previous use in, for example, image recognition [12], manifold learning has only seen limited use in mapping attractors of chaotic dynamical systems [13] and has so far seen no application in predicting extreme events in excitable systems. The aim of this thesis is to further analyse how manifold learning maps the attractor of chaotic dynamical systems and to see if, by doing so, it could be used to identify criteria for predicting extreme events.

Manifold learning will be tested in conjunction with so called Poincaré maps [14], a method used to construct discrete mappings of continuous-time dynamical systems. In general terms, this is done by taking a Poincaré section, which is a cross section of the attractor, and mapping the position of each intersection of the system with the Poincaré section against the position of the next intersection. By analysing the structure of the Poincaré map, we also gain insight into the structure of the attractor. Poincaré maps also help to ease computation, as the data sets generated in this thesis are quite large.

The thesis will begin by analysing the so called Rössler and Lorenz attractors, with the goal of understanding how different techniques of manifold learning map the attractor of chaotic systems. As the Rössler and Lorenz attractors are common examples of chaotic dynamics, they also serve as a good introduction to manifold learning of attractors.

However, the main goal of this thesis is to study the FitzHugh-Nagumo system, a system that exhibits extreme events. This system is commonly used to model systems of coupled neurons, which have a broad range of applications. When Poincaré return maps were created for the FitzHugh-Nagumo system in an attempt to predict extreme events, it was discovered that they are not optimal; this is the inspiration for this thesis. We hope that by applying manifold learning on this system we will create return maps that can explain the origin of these rare events.

The thesis will begin by giving an introduction to chaotic dynamics where we describe chaotic systems, attractors, extreme events and Poincaré maps. Following this the different manifold learning techniques studied will be introduced. In the methodology we will then describe how the different techniques were implemented. The results from manifold learning on the Rössler and Lorenz attractors will be discussed and used to shape the study of the FitzHugh-Nagumo system, which is the final part of this thesis.

2| Theory

The theory will not be a complete explanation of each concept but will serve more as an introduction. Primarily this concerns the techniques of manifold learning, as the more technical aspects are not necessary to understand the thesis. Similarly, concepts such as dynamical systems, chaos, and attractors will only be introduced and explained at a level of detail relevant to the thesis.

2.1. Chaotic dynamical systems

When Henri Poincaré studied the three body problem of classical dynamics in 1880 he noticed orbits which were non-periodic and yet neither diverging nor approaching a fixed point [15]. This was the first recorded discovery of a so called deterministic chaotic system, a discovery which laid the foundation for the mathematical branch today known as chaos theory.

As chaos lacks a universally accepted mathematical definition, deciding what makes a system chaotic can vary. In general, a system is often classified as chaotic if it is very sensitive to initial conditions [16]. A common example of this behaviour is the so called double pendulum [17], in which small changes to the initial conditions result in significant differences in the outcome. This phenomenon is often referred to as the butterfly effect, a term originally used by Lorenz to explain the difficulties of long-term weather prediction [18].

Chaotic behaviour can be exhibited in dissipative dynamical systems. In such systems the phase space volume contracts over time. Most commonly these systems contract to so called fixed points or periodic orbits [19], but, as in the case of the three body problem studied by Poincaré, a chaotic system converges towards a finite phase space volume known as a strange attractor [20].

2.1.1. Strange attractors

Originally coined in 1971 [21], a strange attractor is a geometric object towards which phase space contracts over time. One of the original studies of such attractors was done by Lorenz, who noticed chaotic behaviour when studying a non-linear system modelling the atmosphere [22]. In 1963 Lorenz managed to encapsulate this chaotic behaviour in a system of non-linear differential equations known as the Lorenz system [23]. When studying solutions to the system, Lorenz observed an interesting shape which can be seen in figure 2.1.

In comparison to regular attractors, a strange attractor has a fractal structure [20], meaning it has a non-integer dimension. It has been shown, using the Poincaré-Bendixson theorem, that strange attractors can only exist in three dimensions or more [24], but apart from that there does not exist any consistent theory for when systems exhibit these attractors. Also, since the geometry of the attractor is dependent on the system under consideration, the geometry is generally hard to understand. The attractors studied throughout the thesis will now be presented in detail.


Figure 2.1: The left image is a simulation of the Lorenz attractor. The right shows the value of the x-coordinate over time, illustrating the unpredictability of the transitions between the two lobes.

Lorenz attractor

As previously explained, Lorenz created an oversimplified model of the atmosphere that exhibits chaotic behaviour, defined by the following system of equations,

\dot{x} = \sigma(y - x),
\dot{y} = x(\rho - z) - y,    (2.1)
\dot{z} = xy - \beta z.

This system will be studied with the values \sigma = 10, \rho = 26 and \beta = 8/3, primarily as they are known to yield chaotic behaviour [22]. In figure 2.1, a numerical solution of the equation is presented. As seen, the trajectory seems to alternate between two lobes and form a butterfly-like shape; these alternations follow a chaotic pattern, as can be seen in the figure. This shape is what is called the Lorenz attractor; it follows a pattern which folds on itself at the origin of the x-axis. The figure also visualises the unpredictability of the system, since the trajectory seems to move in an irregular manner between the two lobes.
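As a concrete illustration, a minimal sketch of how such a trajectory could be generated numerically with scipy's solve_ivp routine is given below. The parameter values match those quoted above, while the integration time, time step and initial condition are placeholder choices, not the exact values used in the thesis.

import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, sigma=10.0, rho=26.0, beta=8.0/3.0):
    # Right-hand side of the Lorenz system (2.1).
    x, y, z = state
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

# Integrate from an arbitrary initial condition and sample densely in time.
sol = solve_ivp(lorenz, t_span=(0, 100), y0=[1.0, 1.0, 1.0],
                t_eval=np.arange(0, 100, 0.01), method='RK45')
x, y, z = sol.y  # trajectory coordinates, shape (3, N)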

Figure 2.2: The left image is a simulation of the Rössler attractor. The right image shows the value of the vertical coordinate over time; this hints at chaotic behaviour as there is no discernible pattern.

Rössler attractor

Another attractor of interest is the so called Rössler attractor. Originally named and studied by Otto Rössler, it is defined by the following system of equations,

\dot{x} = -y - z,
\dot{y} = x + ay,    (2.2)
\dot{z} = b + z(x - c).


The system will be studied for the values a = 0.1, b = 0.1, and c = 14, which, similar to the Lorenz system, also lead to chaotic dynamics [25]. The Rössler system was originally constructed as a modification of the Lorenz equations, but has later proved to be useful in modelling the equilibrium of chemical reactions [26]. As visualised in figure 2.2, we see trajectories that sometimes deviate along the z-axis from the basis of a circular shape and then fold back on themselves. Just as in the case of the Lorenz system, these trajectories show unpredictable, although deterministic, behaviour.

2.1.2. Extreme events and excitability

When an observable, which usually fluctuates within certain limits around some well defined mean value, exhibits sudden excursions to significantly deviating values, then a so called extreme event has occurred. Such extreme events are often observed in a class of dynamical systems known as excitable systems. The excitability of a system is a measure of its capacity for creating large pulses in response to an internal or external stimulus [8]. The standard approach to studying these events is called extreme value theory, and is based in statistics and probability [27]. However, extreme events have recently been studied in deterministic chaotic dynamical systems [8]. These systems provide a simpler paradigm for the study of extreme events.

2.1.3. Poincaré return maps

When studying a strange attractor it is generally useful to have a concise description of the flow. Instead of describing the entire phase space, the attractor can be defined using a simpler discrete time dynamic. Poincaré sections are a method of creating such a discrete time dynamic of phase space [14] without a general loss of information about the flow.

A Poincaré section is a hypersurface on which directional intersections are recorded. By analogy, it can be described as a one-directional cross-section of the flow in phase space. By recording the coordinates of each intersection of a trajectory, a discrete mapping of the trajectories is encapsulated in the Poincaré section. From this mapping of coordinates it is possible to reconstruct the full state space trajectory by integration in time.

More formally, a Poincaré section is defined as a local, transversal and differentiable (d-1)-dimensional hypersurface P in the d-dimensional state space M. In figure 2.3 we see an example of a Poincaré section of the Rössler attractor. In this case the section is defined as a plane along the z-axis at the angle π/4 in the xy-plane. How the Poincaré sections should be defined is mainly subjective and centered around the objective sought out of the mapping [9]. The common practice is to use hyperplanes as Poincaré sections, primarily to avoid problematic topology [28] and otherwise difficult calculations [29].

By recording successive intersections with a Poincaré section, it is possible to describe the intersections as a discrete time series. To understand the behaviour of a system, one can study how the intersections change when iterating. Letting \{x_i\}_{i=1}^{n}, x_i \in P, be the series of intersections, we define the Poincaré return map P by x_{i+1} = P(x_i), i = 1, 2, ..., n - 1. The Poincaré return map is thereby a mapping between every intersection with the Poincaré section and the next. To analyse the Poincaré return map, it is also common to project the values through some scalar function M : \mathbb{R}^d \to \mathbb{R}, such that the return map can be plotted in a two-dimensional plot, with M(x_{i+1}) plotted against M(x_i).

Figure 2.3: Example of a Poincaré section of the Rössler attractor with the intersections on the section plotted to the right.
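To make the construction concrete, the sketch below shows how a return map could be assembled once the intersection points with a section have been recorded. The scalar function M is here taken to be the radial distance in the xy-plane, as in figure 2.4, and intersections is a hypothetical (n, d) array of recorded section points; neither the name nor the layout comes from the thesis code.

import numpy as np
import matplotlib.pyplot as plt

def return_map(intersections):
    # Project each intersection with a scalar function M (here the radial
    # distance in the xy-plane) and pair M(x_i) with M(x_{i+1}).
    M = np.linalg.norm(intersections[:, :2], axis=1)
    return M[:-1], M[1:]

# r_n, r_next = return_map(intersections)
# plt.scatter(r_n, r_next)
# plt.plot(r_n, r_n)  # the line r_i = r_{i+1}, useful for spotting periodic orbits
# plt.xlabel('$r_n$'); plt.ylabel('$r_{n+1}$'); plt.show()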

In figure 2.4 we see a common example of a Poincaré map implemented on the Rössler attractor [30]. Here the


radial distance is used as the scalar function, written as M(x_i) = r_i. As shown, the return maps tend to give similar results. This is not a coincidence; in fact it is one of the main advantages of using return maps. The reason behind this similarity is that the topological relation between orbits should be similar throughout the attractor, even when changing the section. But as seen in section B, there seem to be some inconsistencies. In this section we see an example of a multi-valued return map, which means that the curve spanned by the points is not single-valued. This is problematic as points could be related to multiple curves of origin. In contrast, sections A, C and D result in so called single-valued return maps, and do not have the same problem.

Also shown in figure 2.4 is the line r_i = r_{i+1}. This curve is useful, and is commonly visualised with return maps to see where the system has periodic orbits. It also helps explain the behaviour of the system. In the case of figure 2.4, it shows that the radius tends to increase under iteration when the radius is small, and to decrease when it is large. This can be understood as orbits with larger radius folding back towards the origin.

Figure 2.4: Poincaré return maps of the Rössler attractor based on 4 different sections. In this case the scalar function is defined as the radial distance.

This problem of multi-valued return maps can be seen as a consequence of how the scalar function is defined and how the section is chosen. The problem arises, as seen in figure 2.4, at the fold of the attractor. It is debatable whether return maps like these are sufficient for describing the chaotic behaviour of the system [31][32], but this seems highly dependent on the problem at hand. As it is generally hard to define good scalar functions and sections in such cases, a systematic solution to this problem would be of interest. This is one of the main problems that might be solved using manifold learning, and is one of the primary applications of these techniques in the thesis.


2.2. FitzHugh-Nagumo

Figure 2.5: Plot of the FitzHugh-Nagumo system: the left and center graphs show the two oscillators in this particular case of the FitzHugh-Nagumo system. The rightmost graph shows the y1 coordinate over time, illustrating the chaotic behaviour.

The final system under consideration is the FitzHugh-Nagumo system, which is most commonly used to model signals between neurons in the brain [33]. These neurons become excited when a voltage that exceeds a certain threshold value is applied to them. The FitzHugh-Nagumo model is a scalable chaotic system, which means that it can be studied in an arbitrary number of dimensions by adding more neurons to it. The FitzHugh-Nagumo system with n coupled two-dimensional oscillators is defined by the following system of equations,

\dot{x}_i = x_i(a_i - x_i)(x_i - 1) - y_i + k \sum_{j=1}^{n} A_{ij}(x_j - x_i),
\dot{y}_i = b_i x_i - c_i y_i,    (2.3)

where a_i, b_i and c_i are internal parameters of the unit, k is the coupling strength, and A \in \{0, 1\}^{n \times n} is the symmetric adjacency matrix (A_{ij} = A_{ji} = 1 if and only if units i and j are coupled, and zero otherwise). This matrix encapsulates information regarding the connections between neurons. The system will be studied for two coupled oscillators with parameters a_1 = a_2 = -0.025794, b_1 = 0.0065, b_2 = 0.0135, c = 0.02, k = 0.128 and A_{ij} = A_{ji} = 1 for all i \neq j, as these are parameters previously studied [8].

Figure 2.6: Poincaré return map of the FitzHugh-Nagumo system. The scalar function is in this case defined as the Euclidean norm.
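A minimal sketch of how the right-hand side of equation (2.3) could be written for n coupled units and integrated with scipy's odeint is shown below. The state is packed as (x_1, ..., x_n, y_1, ..., y_n) and the initial condition is a placeholder; both are assumptions of this sketch rather than the layout used in the thesis code.

import numpy as np
from scipy.integrate import odeint

def fitzhugh_nagumo(state, t, a, b, c, k, A):
    # Right-hand side of (2.3): a, b, c are arrays of unit parameters,
    # k is the coupling strength and A the symmetric adjacency matrix.
    n = len(a)
    x, y = state[:n], state[n:]
    coupling = k * (A @ x - A.sum(axis=1) * x)   # k * sum_j A_ij (x_j - x_i)
    dx = x * (a - x) * (x - 1) - y + coupling
    dy = b * x - c * y
    return np.concatenate([dx, dy])

# Two coupled units with the parameter values quoted above.
a = np.array([-0.025794, -0.025794]); b = np.array([0.0065, 0.0135])
c = np.array([0.02, 0.02]); k = 0.128
A = np.array([[0, 1], [1, 0]])
t = np.arange(0, 10000, 0.1)
trajectory = odeint(fitzhugh_nagumo, [0.01, 0.01, 0.0, 0.0], t, args=(a, b, c, k, A))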

In higher dimensional cases, the geometry of the system is hard to analyse as it is difficult to visualise. But in figure 2.5, which shows a two-dimensional representation of the two oscillators of the studied system and a time series of a single variable of the model, we clearly see events deviating from the general oscillating pattern. These events seem to occur irregularly and, unlike the oscillating pattern of the Rössler system shown in figure 2.2, do not appear in a continuous spectrum of amplitudes. They are thereby what would be classified as extreme events.

Poincaré return maps have not been used to analyse extreme events in the FitzHugh-Nagumo system before; an attempt to use them for that is what prompted the creation of this project. It was hoped that these maps could identify extreme events and events leading up to them. But as shown in figure 2.6, this becomes problematic as the return map is multi-valued. This means that every point in the return map can map to multiple different points. We will research whether manifold learning could separate these two curves and use this to further analyse extreme events. This method could have applications beyond the FitzHugh-Nagumo system, primarily for other systems with multi-valued return maps such as the Kuramoto-Sivashinsky system [34].

2.3. Dimensionality reduction

Analysing large high-dimensional data sets tends to be very difficult. To make the analysis easier it is common to use techniques that reduce the number of dimensions in order to create a lower-dimensional representation of the

original data set, known as a reembedding. These techniques are collectively called dimensionality reduction techniques [35].

More formally, let F be an m-dimensional space with generating function f : \mathbb{R}^d \to \mathbb{R}^m (d < m) that, given the data set X = \{x_1, x_2, ..., x_N\}, x_i \in \mathbb{R}^m, constructs a reembedded data set Y = \{y_1, y_2, ..., y_N\}, y_i \in \mathbb{R}^d, such that x_i = f(y_i). As the generating function is the relation between the original data set and the reembedding, dimensionality reduction techniques seek to define f in a manner that maintains properties of the original data set.

Currently dimensionality reduction techniques can be classified as either linear or nonlinear. In this thesis we will mainly focus on manifold learning, which is a form of nonlinear dimensionality reduction. However, for comparison we will also present one of the most common linear dimensionality reduction techniques.

2.3.1. Linear dimensionality reduction

Linear dimensionality reduction generally works through optimisation of variance between the reembedding and the original data set [36]. This is usually achieved through some form of eigenvector decomposition. Examples of such techniques are principal component analysis (PCA) [37], linear discriminant analysis (LDA) and canonical correspondence analysis (CCA) [38]. Here we briefly explain PCA, since it is one of the most popular linear methods and also helps in understanding many of the nonlinear methods.

2.3.1.1 Principal component analysis

In order to better represent a data set, PCA computes new variables which are linear combinations of the old variables. The new variables are called principal components. Each principal component maximises the variance of all variables of the data under the constraint that it has to be orthogonal to all previous components. When describing PCA, I is the number of observations, J is the number of variables and X is the centered data set with dimension I × J.

Maximising the variance is the same thing as minimising the distance to all observations. A principal component can therefore be calculated by minimising the squared distance of each observation i in X along all dimensions. However, the principal components are usually obtained using the singular value decomposition of X, where X = P\Delta Q^T. The matrix Q contains the coefficients of the linear combinations [37].

The importance of a principal component can be calculated by dividing its inertia by the total inertia [37]. The inertia is defined as

\gamma_j^2 = \sum_i x_{i,j}^2,    (2.4)

where x_{i,j} is the value of variable j for observation i. The total inertia is the sum of the inertias of all columns [37]. If the importance of all used principal components is large, it means that the new representation captures most of the variation of the data set.
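As an illustration, a minimal sketch of PCA via singular value decomposition is given below; the variable names follow the notation above, and the helper function is an assumption of this sketch rather than code from the thesis.

import numpy as np

def pca(X, n_components):
    # PCA of an I x J data matrix X via the SVD of the centered data, X = P diag(delta) Q^T.
    # Returns the projections onto the principal components and the fraction of the
    # total inertia explained by each component (its importance).
    Xc = X - X.mean(axis=0)                   # center the observations
    P, delta, Qt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Qt.T[:, :n_components]      # coordinates in the new variables
    importance = delta**2 / np.sum(delta**2)  # inertia of each component / total inertia
    return scores, importance[:n_components]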

2.3.2. Manifold learning

In general, manifold learning tries to maintain the structural properties of the data set. This could for example be by maintaining the geodesic distances in local neighbourhoods of points [39] or by aligning the tangent space between points [40]. In both scenarios we want to unfold the structure of the data set according to some non-linear transformation. The different dimensionality reduction techniques studied in this thesis are visualised in figure 2.7. The object in the figure is a so called Swiss roll, a two-dimensional structure embedded in three dimensions. Manifold learning techniques seek to unfold the structure, reembedding it into two dimensions.

7 CHAPTER 2. THEORY 2.3. DIMENSIONALITY REDUCTION

Figure 2.7: Manifold learning on a so called Swiss roll using the techniques that will be used throughout the thesis. Colouring represents the radial distance from the center of the Swiss roll. As seen, manifold learning unfolds the structure, giving a representative two-dimensional reembedding.

The Swiss roll provides an intuitive example of the application of manifold learning. However manifold learning is especially useful when implemented on high-dimensional data sets. In that case we can lower the dimensionality of the data sets in order to create a better, easier to analyse, embedding.

In this thesis we will use the manifold learning techniques Isomap, locally linear embedding (LLE), modified locally linear embedding (MLLE), hessian eigenmapping, local tangent space alignment (LTSA) and t-distributed stochastic neighbour embedding (t-SNE) to analyse our attractors.
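A comparison along the lines of figure 2.7 can be reproduced with the sketch below, which applies each technique to a synthetic Swiss roll from sklearn; the number of samples, the number of neighbours and the t-SNE perplexity are placeholder values, not those used for the figure.

import matplotlib.pyplot as plt
from sklearn import manifold, decomposition, datasets

X, color = datasets.make_swiss_roll(n_samples=2000, random_state=0)

methods = {
    'PCA': decomposition.PCA(n_components=2),
    'Isomap': manifold.Isomap(n_neighbors=10, n_components=2),
    'LLE': manifold.LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='standard'),
    'MLLE': manifold.LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='modified'),
    'Hessian': manifold.LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='hessian'),
    'LTSA': manifold.LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='ltsa'),
    't-SNE': manifold.TSNE(n_components=2, init='pca', perplexity=30),
}

fig, axes = plt.subplots(2, 4, figsize=(14, 7))
for ax, (name, method) in zip(axes.flat, methods.items()):
    Y = method.fit_transform(X)                 # two-dimensional reembedding
    ax.scatter(Y[:, 0], Y[:, 1], c=color, s=2)  # colour by position along the roll
    ax.set_title(name)
plt.show()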

2.3.2.1 Isomap

The Isomap (isometric mapping) technique is one of the first nonlinear dimensionality reduction techniques, introduced in 2000 simultaneously with locally linear embedding. Isomap is very similar to PCA, with the difference that it uses the geodesic distance on the manifold to find the new dimensions. The geodesic distance is approximated by summing the Euclidean distances along a path of neighbouring points. This makes it possible for Isomap to detect the underlying structure of, for example, a Swiss roll shape (see figure 2.7), something PCA fails to do.

The method can be divided into three steps [41], where the input space is denoted X.

1. Construction of the k-nearest neighbourhood. This can either be done by letting the k closest data points be neighbours or by letting all data points within a radius ε be neighbours. Construct the graph G by setting an edge with the Euclidean distance d_X(i, j) between the neighbouring points i and j.

2. Approximate the geodesic distance. Initially set the edges d_G(i, j) = d_X(i, j) if i and j are neighbours, and to ∞ otherwise. G is then updated using a shortest-path algorithm, setting d_G(i, j) to the shortest distance it takes to get from point i to point j going through neighbouring points.


3. Apply PCA on the approximated geodesic distance The underlying structure of the input space is found by applying PCA, defining the distance as the geodesic distance between all data points.
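A minimal sketch of the three steps, using library graph routines rather than the thesis code and assuming the neighbourhood graph is connected, is shown below; k and the embedding dimension d are placeholder values, and the final step is expressed as classical multidimensional scaling, i.e. the PCA-like eigendecomposition applied to the geodesic distance matrix.

import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap_sketch(X, k=10, d=2):
    # Step 1: k-nearest-neighbour graph weighted by Euclidean distances.
    G = kneighbors_graph(X, n_neighbors=k, mode='distance')
    # Step 2: approximate geodesic distances with a shortest-path algorithm.
    D = shortest_path(G, method='D', directed=False)
    # Step 3: classical MDS on the geodesic distances (double-centering + eigendecomposition).
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (D**2) @ H
    eigvals, eigvecs = np.linalg.eigh(B)
    idx = np.argsort(eigvals)[::-1][:d]       # keep the d largest eigenvalues
    return eigvecs[:, idx] * np.sqrt(eigvals[idx])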

2.3.2.2 Locally linear embedding

Locally linear embedding (LLE) is a technique that tries to preserve local angles in order to preserve the topology of the data. The main idea of the technique is to choose a local neighbourhood of points around each point, and then describe each point as a linear combination of its neighbours. The constants in this linear combination, denoted weights w_i, are constrained by \sum_i w_i = 1, and when the embedding is later constructed these weights are what is preserved.

The first step is to perform a k-nearest neighbourhood search, the same as for Isomap. Using these points the matrix W is computed; this matrix contains the weights that, when multiplied with the neighbours of a point, recreate said point. The weight matrix W is constructed by minimising the reconstruction error, a measure of how closely the weights reconstruct the data, defined as

E(W) = \sum_{i=1}^{N} \Big\| \vec{y}_i - \sum_{j \in N_i} w_{ij} \vec{y}_j \Big\|^2.    (2.5)

Here N_i is the set of points neighbouring point y_i. This function is constrained such that \sum_{j=1}^{N} w_{ij} = 1, i.e. the sum of the weights reconstructing each point is equal to one. In order to compute the weights, the reconstruction error for each point can be computed separately as follows,

E_i(W) = \Big\| \vec{y}_i - \sum_{j \in N_i} w_{ij} \vec{y}_j \Big\|^2 = \Big\| \sum_{j \in N_i} w_{ij} (\vec{y}_i - \vec{y}_j) \Big\|^2 = \sum_{j,l \in N_i} w_{ij} w_{il} g_{ijl},    (2.6)

where g_{ijl} = (\vec{y}_i - \vec{y}_j)^T (\vec{y}_i - \vec{y}_l) are the elements of the Gram matrix G_i. Using this, the weights can be computed as

w_{ij} = \frac{\sum_{l \in N_i} (G_i^{-1})_{jl}}{\sum_{j,l \in N_i} (G_i^{-1})_{jl}}    (2.7)

[35]. The lower-dimensional embedding, \hat{X}, is then computed by setting up a similar cost function,

\Phi(\hat{X}) = \sum_{i=1}^{N} \Big\| \vec{x}_i - \sum_{j \in N_i} w_{ij} \vec{x}_j \Big\|^2 = \sum_{i=1}^{N} \Big\| \sum_{j \in N_i} w_{ij} (\vec{x}_i - \vec{x}_j) \Big\|^2 = \sum_{i,j=1}^{N} m_{ij} (\vec{x}_i^T \vec{x}_j),    (2.8)

where m_{ij} are the elements of M = (I - W)^T (I - W) and I is the identity matrix of dimension N. Constraints are put on the optimisation; examples include \sum_{i=1}^{N} \vec{x}_i = \vec{0}, in order to have the embedding centered around the origin, and unit covariance, \frac{1}{N} \hat{X} \hat{X}^T = I, in order to have some kind of consistent scaling. The optimisation is done by calculating the d + 1 bottom eigenvectors of the matrix M, where d is the dimension of the target space [35]. Bottom eigenvectors means those associated with the lowest eigenvalues. We calculate the d + 1, rather than d, bottom ones, as the lowest is a scaled unit vector and not relevant to the embedding.
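In practice this optimisation is handled by library routines; a minimal usage sketch with sklearn is given below, where the data set and the neighbourhood size are placeholders. The same function also implements the variants discussed below through the method argument ('modified', 'hessian' or 'ltsa').

import numpy as np
from sklearn import manifold

X = np.random.rand(500, 3)   # placeholder data set; any (N, m) array works
# Y: (N, d) embedding, err: the reconstruction error of the embedding.
Y, err = manifold.locally_linear_embedding(X, n_neighbors=10, n_components=2,
                                           method='standard')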

LLE has several issues, the most prominent one being regularisation. When constructing the weight matrix, if the number of neighbours is larger than the number of input dimensions, the matrices defining each local neighbourhood, G_i, are rank-deficient [42]. One method used to avoid this problem is to use a regularisation parameter r,

G_i' = G_i + rI.    (2.9)


This however is not a perfect method as it introduces a slight error to the calculation, even when 0 < r << 1.

There exist other methods that solve this problem, such as Modified Locally Linear Embedding (MLLE) and Hessian eigenmapping. MLLE only differs in how the weight matrix is constructed, and Hessian eigenmapping instead uses an estimate of the Hessian matrix to construct the embedding [43][44].

2.3.2.3 Modified locally linear embedding

MLLE tries to find multiple, different weight vectors for every point and combines them in order to generate an embedding,

w_i^{(l)} = (1 - \alpha_i) w_i(\gamma) + V_i H_i(l), \quad l = 1, ..., s_i,    (2.10)

where s_i = k - d and d is the target dimension. This is not necessarily the optimal choice, but in general it is a good one. Here \alpha_i = \frac{1}{\sqrt{s_i}} \|v_i\|, V_i is the eigenvalue matrix of the input data, and H_i is a Householder matrix that satisfies

H_i V_i^T \vec{1}_k = \alpha_i \vec{1}_{s_i},    (2.11)

where \vec{1}_m is a vector of length m consisting of only ones. From this a cost function can be created for the embedding,

\Phi(\hat{Y}) = \sum_{i=1}^{N} \sum_{l=1}^{s_i} \Big\| \vec{y}_i - \sum_{j \in N_i} w_{ij}^{(l)} \vec{y}_j \Big\|^2.    (2.12)

This can in a similar way to standard LLE be rewritten as

\Phi(\hat{Y}) = \mathrm{Tr}\Big( \hat{Y} \sum_i W_i W_i^T \hat{Y}^T \Big).    (2.13)

Finally, the d + 1 lowest eigenvectors of \sum_i W_i W_i^T are computed and these, disregarding the lowest one, form the new embedding [43]. The advantage of this technique is that it does not suffer from the regularisation problem, while also avoiding the error induced by the workaround used for LLE.

2.3.2.4 Hessian eigenmapping

This technique estimates the local Hessian of a function at each point and its K nearest neighbours in order to construct the embedding, instead of computing the weights as in LLE and MLLE.

Like the other methods, the first step is to compute the K nearest neighbours. Then a matrix M_i is constructed for each point, where the rows are the points in the neighbourhood with the local average (over the neighbourhood) subtracted from each. A singular value decomposition is performed on M_i; from this we get the matrices U, D and V. U contains the tangent coordinates of the input data in its first d columns [44].

By then assuming that the generating function f is C^2 and smooth, we are able to estimate its Hessian using the least squares method. From these Hessian matrices we then compute the symmetric quadratic form

H_{i,j} = \sum_l \sum_r H_{r,i}^{l} H_{r,j}^{l}    (2.14)

and calculate its eigenvalues. The embedding is then given by the eigenvectors associated with the second lowest up to the (d + 1)-th lowest eigenvalues, the same as for standard LLE and MLLE [44].


2.3.2.5 Local Tangent Space Alignment

Similar to LLE, local tangent space alignment, abbreviated as LTSA, modifies the construction of the similarity matrix W to maintain local geometry through the alignment of tangent spaces [40]. In reference to LLE, which constructs its weight matrix based on linear combinations of neighbourhoods, LTSA further approximates the geometric structure by aligning the tangent space of each neighbourhood.

By assuming that the generating function f is locally smooth and using the Taylor expansion around a fixed point \bar{y} \in \mathbb{R}^d with \bar{x} = f(\bar{y}), we have:

f(y) = f(\bar{y}) + J_f(\bar{y}) \cdot (y - \bar{y}) + O(|y - \bar{y}|^2),    (2.15)

where J_f(\bar{y}) is the Jacobian of f at the point \bar{y}. As f is unknown, so is J_f, but it is possible to approximate J_f(y_i) by estimating the tangent space T_{\bar{x}} from the data set X. This serves as the basic idea of LTSA, which, by taking this tangent space into account when constructing the weight matrix, creates an embedding that maintains tangent spaces. It should be noted that LTSA does not maintain the geodesic distances in doing so, which can be problematic in some implementations.

LTSA is generally used more often than regular LLE, primarily as it is better suited to maintaining the general structure of a data set. It has for instance been applied in facial recognition [45], where it is important to maintain the local geometry of the face. However, it has some drawbacks, primarily the extra computational cost and data allocation, and the fact that it does not preserve geodesic distances.

2.3.2.6 t-distributed Stochastic Neighbour Embedding

In contrast to linear embeddings and linear dimensionality reduction, t-distributed stochastic neighbour embedding, abbreviated as t-SNE, aims to preserve the probabilistic distribution between points. By approximating a Gaussian distribution on the initial data distribution [46], t-SNE constructs an embedding in a lower dimensional space through a Student t-distribution [47] that reflects the original distribution.

When constructing the approximate Gaussian distribution G, let X = \{x_1, x_2, ..., x_N\} be the initial data points in a D-dimensional space. The conditional probability that, given the point x_i, the point x_j (i \neq j) would be chosen as a neighbour in proportion to a Gaussian probability density centered around x_i is estimated by

p_{j|i} = \frac{\exp(-\|x_i - x_j\|^2 / 2\sigma_i^2)}{\sum_{k \neq i} \exp(-\|x_i - x_k\|^2 / 2\sigma_i^2)}.    (2.16)

In the equation, \sigma_i is the approximate bandwidth of the Gaussian kernel and \|\cdot\| is the chosen norm, in most cases the L^2-norm. The element p_{i,j} of the Gaussian matrix G is later defined as

p_{i,j} = \frac{p_{i|j} + p_{j|i}}{2N},    (2.17)

which is the approximate probability that point x_i would be a neighbour of point x_j. The probability is also set to p_{i,j} = 0 for all i = j since, for modelling purposes, a point should not be its own neighbour.

Given the reduced dimension d, we now seek to construct the Student t-distributed matrix Q based on the set of points \{y_1, y_2, ..., y_N\}, where y_i \in \mathbb{R}^d. The distribution of Q should reflect the probabilities in G, with the elements q_{i,j} \in \mathbb{R} being the approximate t-distributed probabilities, defined as:

q_{i,j} = \frac{(1 + \|y_i - y_j\|^2)^{-1}}{\sum_{k \neq l} (1 + \|y_k - y_l\|^2)^{-1}}.    (2.18)

Optimising the y_i is done by minimising the Kullback–Leibler divergence [48] of Q from P, defined as:

\min_{y} \mathrm{KL}(P \| Q) = \min_{y} \sum_{i \neq j} p_{i,j} \log \frac{p_{i,j}}{q_{i,j}}.    (2.19)


Using the resulting set \{y_i\}_{i=1}^{N} that minimises the Kullback-Leibler divergence, we get a set that reflects the original distribution of points, but in a lower embedding dimension. The initial distribution in the embedding can be chosen freely, but commonly it is initialised through PCA.
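A minimal usage sketch with sklearn, initialising the embedding with PCA as mentioned above, could look as follows; the data set is a placeholder, and the perplexity, which roughly controls the effective neighbourhood size, is a placeholder value.

import numpy as np
from sklearn.manifold import TSNE

X = np.random.rand(500, 3)   # placeholder data set; any (N, D) array works
# Y: (N, d) embedding minimising the Kullback-Leibler divergence (2.19).
Y = TSNE(n_components=2, init='pca', perplexity=30, random_state=0).fit_transform(X)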

As visualised in figure 2.7, we see that t-SNE manages to cluster related data points in the embedding, but fails to capture the geometry of the Swiss roll. This, together with the extra computational cost, is one of the primary drawbacks of t-SNE [46].

In general t-SNE often requires optimisation to handle larger data sets [49]. t-SNE is thereby primarily used for separating relatively small data sets where the common factors are hard to separate analytically; an example is classifying rashes for skin cancer [50].

Throughout this thesis the dimensionality reduction techniques will be used in combination with Poincaré sec- tions in order to create useful and reliable return maps of the attractors. How these techniques and methods are implemented will be explained in the approach and implementation chapter.

3| Approach and implementation

Here we will describe our approach and methodology, followed by the implementations of the different techniques described in sections 2.1.3 and 2.3.

3.1. Approach

In this section we will describe our main methodology, how we aim to conduct our study, the general idea behind our implementation and some more specific aspects regarding our study. Some decisions will be discussed in chapter 4, Results and discussion, as they are motivated by the results.

3.1.1. Methodology

Our plan is to attempt two different methods. One is to apply dimensionality reduction on the d-dimensional attractors directly, approximating them in a lower dimension, and then take a Poincaré section from the reembedded system, giving us one-dimensional data. The other method is to take a Poincaré section of the attractor, reducing its dimensionality from d to d − 1, and apply manifold learning on the points on the section. This data is then used to create return maps. The maps will then be studied in order to determine if there is an improvement, for example if the new return map is single-valued while the original one is not.

Initially we studied these approaches for the Rössler and Lorenz systems using all the methods described in section 2.3. From this we could choose methods that are appropriate for studying the FitzHugh-Nagumo system.

3.1.2. Computational methods

Because the systems that we studied cannot be solved analytically, we used differential equation solvers to simulate them. However, as most solvers do not have a fixed step size, the points are not evenly distributed along the trajectory. In order to redistribute the data set we use interpolation.

Because finding initial conditions on the attractor requires simulation, we chose initial conditions close to the attractor, and after simulating the system a set of points at the start of the data set was removed. Those points were removed because it takes some time for the trajectory to converge to the attractor. As our analysis focused on the overall structure of the attractor, this is not a limitation for our study.

All code for this project is written in Python. The methods used for integration as well as the different techniques for manifold learning were implemented using preexisting packages. Integration and interpolation used methods found in scipy, manifold learning and linear dimensionality reduction used methods from sklearn, and plotting was done using matplotlib. Along with these packages, numpy and the native math package were used.

How we implemented the solutions to the problems described here can be found in section 3.2 with the full code in the appendix.


3.1.3. Symmetry-reduced Lorenz attractor

When manifold learning techniques were first implemented on the Lorenz attractor, Hessian eigenmapping and LTSA generated errors. This was believed to be a consequence of the tangent and Hessian spaces being hard to estimate in the intersection of the two lobes of the attractor. To avoid this problem it was decided to fold the Lorenz attractor using doubled-polar angle coordinates, which is a previously used approach to studying the Lorenz attractor; see Ref. [51], chapter 11. This transformation of variables is defined as

(\hat{x}, \hat{y}, z) = (r\cos 2\theta, \; r\sin 2\theta, \; z) = ((x^2 - y^2)/r, \; 2xy/r, \; z),    (3.1)

where r = \sqrt{x^2 + y^2}. This effectively folds the two lobes of the attractor onto one. To simplify computations, the folded Lorenz attractor will also be centered around the origin. In figure 3.1 the folding is visualised; to make the fold easier to comprehend, the colouring has been preserved.
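A minimal sketch of the coordinate transformation (3.1), applied to a simulated trajectory, could look as follows; centering is done here by subtracting the mean of the folded data, which is an assumption of this sketch rather than a detail taken from the thesis code.

import numpy as np

def double_polar_fold(data):
    # Apply (x, y, z) -> ((x^2 - y^2)/r, 2xy/r, z) with r = sqrt(x^2 + y^2),
    # folding the two lobes of the Lorenz attractor onto one, then center the result.
    x, y, z = data[:, 0], data[:, 1], data[:, 2]
    r = np.sqrt(x**2 + y**2)
    folded = np.column_stack(((x**2 - y**2) / r, 2 * x * y / r, z))
    return folded - folded.mean(axis=0)   # center around the origin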

Figure 3.1: The Lorenz attractor before and after the symmetry reduction, where in the symmetry-reduced version it has also been centered at the origin. The colouring represents the value of the z-coordinate in the original attractor and is preserved after the symmetry reduction.

3.2. Code

Here we will describe the most important parts of the code written for this project. Examples of how to implement the different classes can be found in section 3.2.6; here we will only present excerpts, and the full code can be found in the appendix.

3.2.1. Equation

The first class is the Equation class; it was designed to ease implementation and creates functions for the systems we were testing on, based on a set of parameters. It starts with an abstract class, which serves as a recipe for adding new equations in the same way. derivatives is where the derivatives are defined and inCond holds predetermined initial conditions that can be used in the Simulator class, although other initial conditions can also be used.

from abc import ABC, abstractmethod

class Function(ABC):
    # Time derivatives of the system at a given state.
    @abstractmethod
    def derivatives(self, t, state):
        pass

    # Predetermined initial conditions for the system.
    @abstractmethod
    def inCond(self):
        pass

Here we have an example of how the Lorenz system was implemented in this class. The values of the constants can be changed as optional arguments; the ones implemented here are the ones we used for the study.

class Lorenz(Function):

    def __init__(self, values=[26.0, 10.0, 8.0/3.0]):
        self.rho = values[0]
        self.sigma = values[1]
        self.beta = values[2]

    def derivatives(self, t, state):
        x, y, z = state
        return self.sigma * (y - x), x * (self.rho - z) - y, x * y - self.beta * z

    def inCond(self):
        return [1.0, 1.0, 1.0]

3.2.2. Simulator

The Simulator class is used to generate data from any of the equations in Equation. It uses the Equation class and the classes defined in it as the input function. How this is done can be found in section 3.2.6. It uses the solve_ivp function in the scipy package, with Runge-Kutta 45 set as the method of integration. If the user does not define their own initial conditions, the ones defined in each equation are used.

This class also includes the interpolateCurve method, in order to have more evenly distributed points along the trajectory as discussed in section 3.1.2. It creates a new variable s that represents the distance along the trajectory. It then interpolates the trajectory in segments of 6 points using 5th order interpolation, in order to match the integration, and from this it obtains points that are approximately equal distances apart. It is not perfect, but it is an improvement over the original data.
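The idea behind interpolateCurve can be sketched as follows: build an arc-length parameter s, fit an interpolant of the trajectory against s and resample at equal increments. This sketch uses a single cubic spline over the whole trajectory rather than the piecewise fifth-order scheme described above, so it illustrates the idea rather than reproducing the implementation.

import numpy as np
from scipy.interpolate import interp1d

def redistribute(data, n_points):
    # Resample a trajectory so that consecutive points are approximately equally
    # spaced along the curve rather than equally spaced in time.
    ds = np.linalg.norm(np.diff(data, axis=0), axis=1)  # segment lengths
    s = np.concatenate([[0.0], np.cumsum(ds)])          # arc length along the curve
    interpolant = interp1d(s, data, axis=0, kind='cubic')
    s_new = np.linspace(0, s[-1], n_points)             # equidistant arc-length samples
    return interpolant(s_new)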

It should be noted that since this is a numerical solver applied to chaotic dynamical systems, errors will occur and, by the definition of chaos, diverge into different solutions depending on the sampling rate. However, since the attractor as a whole is of interest, the results do not depend on the trajectories being entirely correct. When solving these systems no such differences in results were seen, but it is still important to take into consideration.

3.2.3. FitzSimulator

The FitzSimulator class might seem unnecessary as we already have the Simulator class. But as the project shifted towards studying the FitzHugh-Nagumo system, we discovered that the solve_ivp function had problems when integrating the FitzHugh-Nagumo system. Because of this we used the odeint function instead.

In this class we define the FitzHugh-Nagumo system directly and do not rely on the Equation class. This is because the odeint function calls functions with parameters (state, t) whilst solve_ivp uses parameters (t, state).

This class also includes the interpolateCurve method that can be found in the Simulator class.

3.2.4. PoincareMapper

The PoincareMapper class creates a Poincaré section given a set of data and the normal vector of the hyperplane of the Poincaré section.

The main method in this class is the map method. It loops through all the points in the data and if it finds a crossing between two of them using the crossing method it uses the interpolate method to compute a point close to the intersection.

The crossing method uses the fact that the sign of the dot product of the normal of the plane and a point is different

on either side of the plane.

In order to find a point close to the Poincaré section we interpolate the curve based on the points surrounding the intersection and then find the point closest to the section. We chose not to use interpolation in conjunction with root-finding methods, such as Newton-Raphson, as these methods are limited in the number of dimensions they can handle in Python.
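A minimal sketch of the crossing test and the intersection estimate is shown below; it uses linear interpolation between the two points straddling the plane instead of the higher-order interpolation used by PoincareMapper, and assumes a plane through the origin with a given normal vector, with crossings counted in one direction only.

import numpy as np

def poincare_section(data, normal):
    # Record one-directional crossings of the hyperplane through the origin with the
    # given normal. A crossing occurs when the sign of the dot product between the
    # normal and a point changes between consecutive points.
    side = data @ normal
    points = []
    for i in range(len(data) - 1):
        if side[i] < 0 <= side[i + 1]:               # sign change: crossing detected
            t = side[i] / (side[i] - side[i + 1])    # linear interpolation parameter
            points.append(data[i] + t * (data[i + 1] - data[i]))
    return np.array(points)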

One limitation of the code is that it does not handle affine planes; this is however easily fixed by subtracting a vector between the origin and the affine plane from the data. Furthermore, it returns the data in the original dimensionality, i.e. four-dimensional input data returns four-dimensional output data.

3.2.5. CurveSeparator

As will be shown in section 4.3.3, the manifold learning techniques were not sufficient to separate the non-single-valued return map of the FitzHugh-Nagumo system. The CurveSeparator class was instead constructed in an attempt to do so. As the curves in some intervals of the return map are so close together that it is impossible to determine which points belong to which curve, see figure 3.2, the points in these sections are conventionally assigned to the lower curve.

Figure 3.2: The return map of the FitzHugh-Nagumo system before and after the curves have been separated. The black lines in the left plot are the separators, separating the upper and lower curve from each other. In the right plot light and dark blue denote the points on the upper and lower curve, respectively.

To define which points belong to the upper and lower curve, two separators, one for each branch (left and right), were constructed. These are imaginary lines between the curves, visualised in the left plot of figure 3.2. To construct these, the start point of both separators was chosen to be a point between the tops of the upper and lower curves, and the end points were chosen as the points furthest to the right and left. The curves were then divided into small vertical intervals from the top to the bottom. In each interval the points furthest to the left and right on each branch were found, and the mean value of these was calculated and used to define the separators. To determine if a set of points is above or below a given separator, the slope between each pair of separator points was also calculated.


Using the separators, each point was then defined to belong to the lower curve if it either lay below the separator or belonged to a section where the two curves were said to be inseparable.

Using the points on the separator and their slopes it could be determined if a point was above or below the separator. If it met the following condition it was set to be on the lower curve

y_{point} < y_{separator} + k_{separator}(x_{point} - x_{separator}),    (3.2)

where x_{point} and y_{point} are the x and y coordinates of the given data point, x_{separator} and y_{separator} are the nearest x and y coordinates on the separator, and k_{separator} is the slope of the separator at that point.

To determine if the two curves were inseparable in one of the vertical intervals, the mean of all points above and the mean of all points below were calculated. The interval was then said to be inseparable if the distance from both of these points to the closest point on the continuous separator was lower than a tolerance. As the separator only consists of discrete points, the closest points on the separator were defined as the intersections of the curves starting in the mean of the points above or below the separator and the discrete point closest above it on the separator. The slope of the separator was set to the slope calculated for its given interval, and the slope at the mean of the points above and below the separator was set to be its inverse.
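A minimal sketch of the classification step in condition (3.2) is given below; the separator is assumed to be given as arrays of discrete points with a slope per point, which mirrors the description above but not necessarily the exact data layout of CurveSeparator.

import numpy as np

def below_separator(point, sep_x, sep_y, sep_k):
    # Return True if a return-map point (x, y) lies below the separator, i.e.
    # satisfies condition (3.2). sep_x, sep_y are the discrete separator points
    # and sep_k the slope of the separator at each of them.
    x, y = point
    i = np.argmin(np.abs(sep_x - x))                 # nearest separator point
    return y < sep_y[i] + sep_k[i] * (x - sep_x[i])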

3.2.6. Example of implementation

The classes described above are meant to be used in conjunction with each other. Here we will show a few examples of how this might be implemented. The first part of the code consists of the required imports; then a vector normal to the section plane is defined, we define our equation, and we simulate and interpolate the curve. The Poincaré section is then defined and computed. It is then reembedded using MLLE; note the output err, which is the reconstruction error, see section 2.3.2.3. Finally a return map is plotted using the reembedded Poincaré section.

import Equation
from Simulator import Simulator
from PoincareMapper import PoincareMapper
import numpy as np
from sklearn import manifold
import matplotlib.pyplot as plt

normalVector = np.array([1, 1, 0])

rossler = Equation.Rossler()
simulator = Simulator(rossler)

data = simulator.states(duration = 1000, split = 0.01, interpolate = True)

mapper = PoincareMapper(normalVector, data)
poincareSection = np.array(mapper.getValues())

reembeddedPoincare, err = manifold.locally_linear_embedding(poincareSection, n_neighbors = 10,
                                                            n_components = 1, method = 'modified')

fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(reembeddedPoincare[:-1], reembeddedPoincare[1:])
ax.set_xlabel('$x_n$')
ax.set_ylabel('$x_{n+1}$')
plt.show()

There are two classes that are not included in the above example, FitzSimulator and CurveSeparator. When studying the FitzHugh-Nagumo system the Simulator class can be replaced by FitzSimulator; it has different inputs but the same functionality.

fitz = FitzSimulator(n = 2)
data = fitz.states(duration = 100000, split = 0.1)

4| Results and discussion

Due to the approach chosen, where results are used to decide how to proceed with the study, the results will be presented in conjunction with the discussion. For each system results from applying dimensionality reduction directly on the attractor and on the Poincaré section will be presented. We will begin with the results from the Rössler and Lorenz attractors and evaluate the different manifold learning techniques. Taking these results into account we will analyse the FitzHugh-Nagumo system focusing on extreme events.

4.1. Rössler attractor

Due to the structure of the Rössler attractor, it being centered around a single unstable equilibrium, it is easy to study using Poincaré sections. It was therefore our choice for initial analysis. The data set was generated by simulating the differential equation with initial values \vec{x}_0 = [1, 1, 1] for a time period of t \in [0, 400] with a sampling rate of \Delta t = 0.04. The first 1000 values were then removed, as described in section 3.1.2.

4.1.1. Dimensionality reduction on the attractor

Figure 4.1: The simulated Rössler attractor.

As many techniques of manifold learning, such as LLE and Isomap, require constructing k-nearest neighbourhoods, understanding how the reembedding depends on k is necessary. In figure 4.2 we visualise how the LLE reembedding of the Rössler attractor changes with the number of neighbours. As seen, the reembedding tends to represent the attractor better when using a larger number of neighbours. This pattern was seen in all techniques of manifold learning relying on a k-nearest neighbours algorithm. It is also important to note that no manifold learning technique takes orientation into consideration. Therefore, the reembeddings may be rotated or mirrored, as observed in figure 4.2.

Figure 4.2: Visualisation of LLE reembeddings with different numbers of neighbours. Performed on the same data, the reembeddings show improvement when increasing the number of neighbours. Colouring is in reference to figure 4.1.


To explain the dependence on k, one may consider that when increasing the number of neighbours, manifold learning algorithms tend to take local geometric structure better into consideration. This local geometric structure is, in contrast, largely dismissed when using a low number of neighbours, as seen in figure 4.2. However, more neighbours increase the computational load, and do not seem to improve results after a certain limit. Further analysis will therefore use approximately 10 neighbours, as this was shown to be sufficient and not too computationally expensive.

Another problem initially encountered was that manifold learning showed a high dependence on how the data set is distributed along the attractor. The fact that points from the simulation were denser in the x-y plane, compared to the self-folding peaks, led to inconsistencies when applying manifold learning techniques. This is visualised in figure 4.3, where small changes to the sampling rate are observed to yield different reembeddings. This example was created using standard LLE with 10 neighbours, but the behaviour was also observed using the other manifold learning techniques.

Figure 4.3: Visualisation of LLE reembeddings when varying the sampling rate of the data. As can be seen, the resulting reembeddings differ and do not give consistent results. There also does not seem to be any improvement when increasing the sampling rate. Colouring is in reference to figure 4.1.

To reduce these inconsistencies we redistributed the points in order to even out the distances. This was done by interpolating the curve using the method described in section 3.2.2. As such, we redefined the trajectory as a function of distance rather than time. As visualised in figure 4.4, the reembeddings show more consistent results when varying the sampling rate, which implicitly also changes the distance between points. It is therefore believed that redistributed data sets yield better results when using manifold learning. However, the interpolation is not perfect, as it might introduce slight errors that affect the resulting reembeddings.

Figure 4.4: Visualisation of LLE reembeddings after a redistribution of the numerically integrated data sets. The colouring represents the position of each point along the z-axis in the original data set. \Delta t is the sampling rate used, which implicitly defines the distance between the points. The results show consistent reembeddings of the Rössler attractor, even if large sampling rates yield relatively small data sets. Colouring is in reference to figure 4.1.

Having decided the parameters for the methods, the following step is to implement and compare different techniques of manifold learning. Visualised in figure 4.5 we see the results from applying dimensionality reduction on

the data set. The different reembeddings seem to capture the general geometry of the attractor.

It should be noted that problems are to be expected at the fold of the attractor. As the Rössler attractor is a strange attractor, an exact two-dimensional mapping is impossible, since its fractal dimension exceeds two. The results should thereby be seen as different approximations of the attractor by a manifold of integer dimension.

Figure 4.5: Dimensionality reduction on the Rössler attractor based on the redistributed simulated data set. The parameters used were 10 neighbours, and t-SNE was initiated using PCA. In general all techniques of manifold learning capture the geometric structure of the attractor. In the case of Isomap, LLE, MLLE and PCA it is observed that the reembedding has an overlap of the data set at the fold of the attractor. LTSA and Hessian LLE contract to a single intersection at the fold. t-SNE does not overlap or contract, but does yield a discontinuity at the fold of the attractor. Observe that dimensionality reduction techniques do not take orientation into consideration and can therefore give differently rotated reembeddings. The scale of the reembeddings varies due to the way the different techniques work; Isomap and PCA preserve distances, locally linear methods normalise the data set and t-SNE has a fixed scale on its axes.

In the case of PCA, Isomap, standard LLE and modified LLE, the reembedding is observed to give a good topological representation of the attractor. However, an overlap of data points can be observed at the region in which the attractor folds. This could be seen as inconsistent with what would be expected of a two-dimensional reembedding, but as previously mentioned, it should not come as a surprise.

In LTSA and Hessian LLE there is no overlap, but the reembedding instead contracts at the fold of the attractor. It is believed that this contraction comes as a consequence of the tangent and Hessian spaces being difficult to estimate at the fold. These reembeddings can thereby be seen as misleading, as they do not reflect the topology of the attractor.

There is also a loss of topological information when using t-SNE. This is probably because t-SNE tries to cluster the points, which in the end causes the trajectories to separate from each other; the underlying stochastic model of t-SNE disregards the topology.

The different methods also scale the reembedding differently. For the LLE-based methods the reembedding has small-scaled axes, while t-SNE seems to enlarge the axes. Only in the case of Isomap and PCA do the axes seem to reflect distances in the original embedding. This should not come as a surprise, since Isomap and PCA work by preserving distances (geodesic and Euclidean, respectively), in contrast to LLE and t-SNE, which do not.


4.1.2. Return maps

In this section we present both the return maps obtained after applying manifold learning on the attractor and those obtained after applying it on the points on the Poincaré section. Initially the focus will be on the return maps from the reembedded attractor, and then on those from the points on the Poincaré section. These results were created using MLLE and Hessian eigenmapping with 10 neighbours.
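A sketch of this first approach is given below. It assumes the PoincareMapper class from appendix A.3 (which only requires a normal vector through the origin and therefore works in any dimension) and the redistributed trajectory in an array data; the section normal chosen in the reembedded plane is illustrative.

import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from PoincareMapper import PoincareMapper  # appendix A.3 (module name assumed)

# data: (N, 3) redistributed Rossler trajectory
embedding = LocallyLinearEmbedding(n_neighbors=10, n_components=2,
                                   method="modified").fit_transform(data)

# in two dimensions the "section" is a line through the origin
section = np.array(PoincareMapper(np.array([1.0, 0.0]), embedding).getValues())
s = section[:, 1]                              # coordinate along the section line
returnMap = np.column_stack([s[:-1], s[1:]])   # s_{n+1} against s_n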

Figure 4.6: The Rössler attractor reembedded in two dimensions using MLLE with 10 neighbours. As can be seen, the return map for the peak is single-valued.

As seen in figure 4.6, this method provides a single-valued return map for the peak. It does however not provide a single-valued map when the Poincaré section intersects the fold of the attractor, as seen in figure 4.7. This is the same as what happens when regular Poincaré return maps are created from points at the fold of the attractor. In general it seems that overlapping data sets tend to give non-single-valued return maps. It is thereby of interest whether the non-overlapping reembeddings show similar tendencies at the fold. It should be noted here that, compared to regular maps, the part of the fold that provides non-single-valued return maps is very small.

Figure 4.7: The Rössler attractor reembedded in two dimensions using MLLE with 10 neighbours. As can be seen, the return map for the fold is non-single-valued.

In figure 4.8 the results from creating Poincaré return maps of the reembedding from Hessian eigenmapping are presented. As seen, it also provides a single-valued return map for most of the attractor, and the problem with non-single-valued return maps does not seem as prominent at the fold as in the case of MLLE. Instead, the return map is somewhat distorted and scattered but follows the general shape of the other return maps. This might reinforce the idea that inconsistent return maps occur when the Poincaré section is applied to a region where the data overlap. The results are the same when using LTSA.


Figure 4.8: The leftmost plot is the Rössler attractor reembedded in two dimensions using Hessian eigenmapping with 10 neighbours. The centre plot is the return map corresponding to the Poincaré section close to the fold and the rightmost plot is the return map corresponding to the Poincaré section opposite the peak.

The second method that was tried used manifold learning on the points on the Poincaré section, reembedding them into one dimension from which the return map is constructed. In order to compare this with the usual approach in the dynamical systems literature, standard return maps were also constructed from the Poincaré section. For the Rössler attractor the most common way is to choose the radial and vertical coordinates, given that the Poincaré section is chosen as in figure 2.3. We chose this as we want the trajectories to intersect the Poincaré section transversally.

The examples of manifold learning presented in figures 4.9 and 4.10 were created using MLLE with 10 neighbours. Similar results were achieved using other methods.
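A sketch of this second approach, assuming the intersections with the Poincaré section are available in crossing order as an (M × 3) array sectionPoints, for instance from the PoincareMapper class:

import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

# one-dimensional reembedding of the section points
s = LocallyLinearEmbedding(n_neighbors=10, n_components=1,
                           method="modified").fit_transform(sectionPoints).ravel()

# traditional scalar functions for comparison
r = np.linalg.norm(sectionPoints[:, :2], axis=1)   # radial coordinate
z = sectionPoints[:, -1]                           # vertical coordinate

returnMapMLLE = np.column_stack([s[:-1], s[1:]])
returnMapR    = np.column_stack([r[:-1], r[1:]])
returnMapZ    = np.column_stack([z[:-1], z[1:]])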

Figure 4.9: The first image is the attractor with the points on the Poincaré section marked; the second and third are the return maps of the radial and vertical coordinates respectively. The last one is the return map after applying manifold learning to the points on the Poincaré section. We can clearly see that manifold learning has given us a single-valued return map, which was not the case for the radial coordinate.

We can clearly see that using manifold learning does provide single-valued return maps but also, as shown in figure 4.10, multi-valued ones. However, the cases of non-single-valued return maps occur very rarely, and primarily when studying Poincaré sections where the trajectories of the fold return to the circular shape. In general the results formed single-valued return maps, even near the fold of the attractor.

Figure 4.10: The first image is the attractor with the points on the Poincaré section marked; the second and third are the return maps of the radial and vertical coordinates respectively. The last one is the return map after applying manifold learning to the points on the Poincaré section. In this case the return map after manifold learning is not single-valued.


4.2. Lorenz attractor

As the Lorenz attractor consists of two lobes that trajectories alternate between, it is more difficult to analyse using Poincaré sections. To avoid this problem the Lorenz attractor was studied using a doubled-polar angle representation, which folds the lobes on top of each other (see section 3.1.3), thereby avoiding parts of the attractor that have proven problematic in previous implementations of manifold learning [13].

The data set for the Lorenz attractor (before folding) was generated from simulating the differential equation with values x₀ = [26, 10, 8/3] for a time period of t ∈ [0, 400] with a sampling rate of ∆t = 0.04. The first 1000 values were then removed as described in section 3.1.2.
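The data set can be generated with the Lorenz and Simulator classes from appendix A. The folding is sketched below using one common form of the doubled-polar angle transform, (x, y, z) → ((x² − y²)/r, 2xy/r, z) with r = √(x² + y²); this particular formula is an assumption on our part, see section 3.1.3 for the representation actually used.

import numpy as np
from Equation import Lorenz      # appendix A.1
from Simulator import Simulator  # appendix A.2 (module name assumed)

sim = Simulator(Lorenz())
data = sim.simulate(duration=400, samplingRate=0.04)
data = data[1000:]               # discard the transient, section 3.1.2

# doubled-polar angle fold (assumed form): maps the two lobes onto each other
x, y, z = data[:, 0], data[:, 1], data[:, 2]
r = np.hypot(x, y)
folded = np.column_stack([(x**2 - y**2) / r, 2 * x * y / r, z])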

4.2.1. Dimensionality reduction on the attractor

Similar to the Rössler attractor, the different techniques of dimensionality reduction were applied to the Lorenz attractor. Again k, the number of nearest neighbours, was set to 10.

Figure 4.11: The folded and centred Lorenz attractor and two-dimensional reembeddings from using the different dimensionality reduction techniques stated above each plot. The colouring is defined by the radial distance in the original plot; as seen, this is preserved in the reembeddings. Observe that dimensionality reduction techniques do not take orientation into consideration and can therefore give differently rotated reembeddings. The scales of the reembeddings vary due to the way the different techniques work; Isomap and PCA preserve distances, locally linear methods normalise the data set and t-SNE has a fixed scale on its axes.

As can be seen in figure 4.11 the different dimensionality reduction techniques give different reembeddings, where the key difference is how they represent the intersection of the two lobes of the original attractor. Note that the rotation of the reembeddings is of no importance, since the dimensionality reduction techniques do not take it into account. The scaling of the reembeddings varies due to how the different techniques work; Isomap and PCA preserve distances, locally linear methods normalise the data set and t-SNE has a fixed scale on its axes. Hessian eigenmapping and LTSA both preserve most of the structure of the attractor but, similarly to the Rössler case in figure 4.5, they contract at the intersection of the original attractor. Isomap and PCA are similar to a two-dimensional projection of the folded attractor. The similarity between Isomap and PCA indicates that the geodesic and Euclidean distances between the points of the attractor do not differ much. In the case of t-SNE the reembedding seems, similarly to the Rössler case, to be discontinuous and to disregard the topology of the attractor. It is believed that modified and standard LLE give the best representation, as they take the three-dimensional geometry into consideration, represented by the non-circular shape, yet preserve the rest of the structure of the attractor.


4.2.2. Return maps

The Poincaré maps of different Poincaré sections on the Lorenz attractor tend not to be single-valued. With the aim of generating a single-valued return map, the different manifold learning techniques were applied to the points on a Poincaré section of the Lorenz attractor, visualised in figure 4.12.

Figure 4.12: The folded and centred Lorenz attractor with a Poincaré section marked in red. On the right side the Poincaré section is plotted using radial and vertical coordinates.

The plane that yields the Poincaré section in figure 4.12 is defined by the vector [−sin(π/7), cos(π/7), 0]. This plane was chosen as it provides a good representation of the Lorenz attractor.
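Assuming the folded and centred attractor is stored in an array folded, the section in figure 4.12 and the corresponding traditional return map can be obtained roughly as follows (a usage sketch of the PoincareMapper class from appendix A.3):

import numpy as np
from PoincareMapper import PoincareMapper  # appendix A.3 (module name assumed)

normal = np.array([-np.sin(np.pi / 7), np.cos(np.pi / 7), 0.0])
pts = np.array(PoincareMapper(normal, folded).getValues())  # (M, 3), in crossing order

r = np.hypot(pts[:, 0], pts[:, 1])          # radial coordinate on the section
z = pts[:, 2]                               # vertical coordinate
returnMap = np.column_stack([r[:-1], r[1:]])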

Figure 4.13: Return maps of the folded Lorenz attractor, both directly used on the Poincaré section and after using the different dimensionality reduction techniques stated above each plot.


In figure 4.13 the return maps of the Poincaré section, reembedded using the dimensionality reduction techniques, are visualised along with the original Poincaré return map. It is clearly visible that the return map of the original Poincaré section is more scattered than the rest; this is due to its non-single-valued structure. All dimensionality reduction techniques presented, except Hessian eigenmapping, provide a single-valued return map. However only Isomap, modified LLE and LTSA do this while also preserving the larger distances between outliers at the ends of the original Poincaré map, and are consequently assumed to be the best representations of the original data. Note that the return map of the original data has deviating points that cannot be found among the points in the Poincaré section. It was first believed that these occurred because the trajectories had not yet converged to the attractor; however the problem remained after removing a substantial number of points in the beginning of the simulation.

The return map can also be created by taking a Poincaré section after the reembedding. In figure 4.14 the result of this for modified LLE is visualised. The general result for the different techniques was that they generated single-valued return maps for most of the Poincaré sections defined. However, at the fold of the original attractor most techniques generated non-single-valued return maps. This is probably due to the trajectories in this area intersecting each other.

Figure 4.14: Return map of the folded Lorenz attractor, after applying modified LLE and taking a Poincaré map of the Poincaré section shown in the left plot.

4.3. FitzHugh-Nagumo

In this section we will present and discuss results from studying a FitzHugh-Nagumo system consisting of two oscillators. As such a system is four-dimensional, it is problematic to visualise in a comprehensible manner. Furthermore, the Poincaré section is three-dimensional and could have more intrinsic structure than for the Lorenz and Rössler systems; we will therefore discuss manifold learning on the Poincaré section in more detail than before. As the FitzHugh-Nagumo system exhibits extreme events, a portion of the analysis will be dedicated to discussing results directly pertaining to them.

4.3.1. Manifold learning on the attractor

The results presented in figures 4.15 and 4.16 were created using MLLE with 12 neighbours, and reembedded in two dimensions. PCA and t-SNE were not chosen as they differed from the other techniques when applied to the Rössler and Lorenz attractors. Standard LLE, Hessian eigenmapping and LTSA were not stable during testing on the FitzHugh-Nagumo system and failed due to singular values in the matrices. This is probably the same problem as with the Lorenz attractor, although it is surprising that the same error occurred for standard LLE. Due to these issues we did not test them any further. Isomap is slower than MLLE and, due to the large data sets tested, was not used.

Because of the rarity of extreme events, large amounts of data are necessary in order to encounter them. This is problematic since the manifold learning techniques are computationally heavy. The results presented in this section were created from a run with a duration of 100 000, with a sampling rate of ∆t = 0.1 which, although a significantly larger sampling rate than in previous simulations, showed no large impact on the results. The first 1000 values were then removed as described in section 3.1.2. The data were then split into 16 segments to make the computation feasible when applying manifold learning.
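The run can be reproduced with the FitzSimulator class from appendix A.4 (module name assumed); a sketch using two oscillators, the default coupling and scikit-learn's MLLE for the reembedding:

import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from FitzSimulator import FitzSimulator  # appendix A.4

data = FitzSimulator(n=2).simulate(duration=100000, samplingRate=0.1)
data = data[1000:]                       # discard the transient, section 3.1.2

segments = np.array_split(data, 16)      # make manifold learning feasible
mlle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, method="modified")
embedded = [mlle.fit_transform(seg) for seg in segments]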

Figure 4.15: Example of a reembedded data set with an extreme event. The two graphs to the left are the two oscillators, the graph to the right is the two-dimensional reembedding of the system. Reembedded using MLLE and 12 neighbours.

When extreme events are included, as seen in figure 4.15, the reembedding is not representative of the general structure. This is believed to be a consequence of the trajectories of extreme events being separated from the attractor, which leads to each point only neighbouring other points on its own trajectory. The extreme event then behaves as a separate structure. This might be resolved by having more extreme events in each set. However, as the largest number of extreme events observed in any segment was two, this is not reasonable. This limitation, along with the fact that the required memory increases with the number of points, makes analysing sets with many extreme events using these techniques extremely difficult. Another way of possibly solving this is by increasing the number of neighbours, but the time needed for computation would increase drastically due to the increased number of points. Due to the high probability of it not working efficiently, and time constraints, this was not attempted.

Figure 4.16: Example of manifold learning on a set of data that does not include an extreme event. The two graphs to the left are the two oscillators, the graph to the right is the two-dimensional reembedding of the system. Reembedded using MLLE and 12 neighbours. This is a representative example of our results.


When not including extreme events, as seen in figure 4.16, the reembedding still has parts disconnected from the main structure. This reembedding is representative of most results acquired. This shows that even without extreme events the reembedding does not preserve the structure of the data.

4.3.2. Manifold learning on the Poincaré section

As applying manifold learning directly on the attractor proved to be difficult due to the size of the data set needed, Poincaré sections may be a better approach, as they drastically reduce the size of the data set. This also makes it possible to capture more extreme events, since analysing longer simulations becomes feasible. Poincaré sections are also easy to define, as the FitzHugh-Nagumo trajectories rotate around a single equilibrium point at [0, 0, 0, 0]. The Poincaré section presented here is defined as a hyperplane orthogonal to the vector v = [0, 1, 0, 1].

The Poincaré section of a four-dimensional system is three-dimensional. The points on the section can be seen in figure 4.17, where the FitzHugh-Nagumo system was simulated for a duration of 200 000 with a sampling rate of ∆t = 0.1. As seen, the points of the Poincaré section are separated into two groups, one being a one-dimensional curve and the other a cluster of extreme events.
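The section can be computed with the PoincareMapper class and its points reembedded in three dimensions, as in the left image of figure 4.17. The sketch below assumes the simulated trajectory is stored as an (N × 4) array data and uses MLLE for the reembedding; the technique and neighbour count here are illustrative.

import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from PoincareMapper import PoincareMapper  # appendix A.3 (module name assumed)

v = np.array([0.0, 1.0, 0.0, 1.0])          # normal of the hyperplane
sectionPoints = np.array(PoincareMapper(v, data).getValues())  # (M, 4)

# three-dimensional reembedding of the section points
embedded3d = LocallyLinearEmbedding(n_neighbors=10, n_components=3,
                                    method="modified").fit_transform(sectionPoints)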

Figure 4.17: The FitzHugh-Nagumo system with the Poincaré section highlighted. In the left image the Poincaré section, reembedded in three dimensions, is presented.

When applied to the points on the Poincaré section, both Hessian LLE and LTSA yield the same errors as for the Lorenz attractor, and both techniques were dismissed. It is also noted that as t-SNE is computationally expensive, and because of its previous tendency not to preserve geometry, it was also dismissed. In the end only standard LLE, MLLE and Isomap were studied.

4.3.3. Return maps

In figure 4.18, the Poincaré return maps resulting from applying manifold learning on the Poincaré section are presented. As seen, both Isomap and MLLE show no large differences compared to the traditional return map shown in figure 2.6. The only difference seems to be the mapping of extreme events, as they are at the bottom of the figure instead of at the top. This is believed to be a consequence of the k-nearest neighbours algorithm not connecting extreme events to the rest of the data set, which gives this difference in mapping.

By contrast, LLE seems to give a vastly different mapping. It seems to separate trajectories with larger amplitudes from ones with smaller amplitudes, but the mapping still becomes inconsistent as points become clustered at the bottom of the figure. It is therefore not useful to study this map in the context of extreme events, since the extreme events can still be mapped to multiple values.

We also tried to reproduce the results with multiple different approaches, both by varying the number of neighbours and the size of the data set. All these approaches gave results consistent with figure 4.18.

Figure 4.18: Poincaré return maps from performing manifold learning on the Poincaré section in figure 4.17. No large differences can be observed between these and the standard return map of figure 2.6, as both still have the dual curve structure. This was done using 10 neighbours as input, but the results were consistent when changing the input variables. The colouring follows the colour scheme of the Poincaré section in figure 4.17.

Initially it was thought that using manifold learning might reveal and separate structures that cause the return maps to be multi-valued. However, as we can see in figure 4.18, this was not the case. This indicates that the return maps are multi-valued due to the local geometry of the points on the Poincaré section rather than their global structure. As manifold learning cannot separate these local structures, at least when used in the manner that we have used it, other methods are needed. This will be discussed further in section 4.3.4. There are still advantages to using manifold learning on the Poincaré section, as it reduces the number of possible scalar functions, making construction of relevant return maps easier. This advantage might be more apparent when studying systems in higher dimensions and is a possible area for future research.

4.3.4. Extreme Events

As the results obtained from applying manifold learning did not differ from the results obtained using traditional methods, there does not seem to be an advantage to studying extreme events using manifold learning for the FitzHugh-Nagumo system with two coupled oscillators. Instead, other possible approaches of studying these will be discussed. In general, it can be stated that there seems to be a pattern to extreme events because of the structure of the return map.

As the extreme events were mapped far from the rest of the return map, a reasonable first step is to move them closer to the origin. This is visualised in figure 4.19, where the value 0.032 was subtracted from the extreme events. This is done to ease the analysis of the return map.

Figure 4.19: Return map of the FitzHugh-Nagumo system, using the norm as normalising function, before and after moving extreme events closer to the origin. Colouring follows the same schema as in figure 4.17.

As points on the Poincaré return map are iterable, it is possible to determine the trajectories prior to extreme events. Visualised in figure 4.20, the prior iterations are presented; here t represents the iteration and i is any iteration during which an extreme event occurs. As seen, at t = i − 1 the extreme events originate from a segment on the right side of the return map, which is then split on the two curves. For each previous iteration the segment is split into pairs, probably due to the double-valued nature of the return map, making the intervals from which extreme events originate more spread out. It would therefore be of use if the two curves could be separated, giving single-valued return maps.

Figure 4.20: Return map for the FitzHugh-Nagumo system where sequences before extreme events are highlighted. t = i denotes an iteration during which an extreme event occurs; the remaining panels show the iterations leading up to it.
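A minimal sketch of how such a backward trace can be assembled, assuming the return-map values are stored in crossing order in an array r and that the extreme events have already been identified, for example by an (illustrative) threshold on r:

import numpy as np

def preimage_values(r, extreme_idx, steps=3):
    """Collect the return-map values at iterations i-1, ..., i-steps for every
    iteration i at which an extreme event occurs (illustrative helper)."""
    history = {t: [] for t in range(1, steps + 1)}
    for i in extreme_idx:
        for t in range(1, steps + 1):
            if i - t >= 0:
                history[t].append(r[i - t])
    return history

# r: return-map values, one per section crossing
# extreme_idx: e.g. np.where(r > threshold)[0] for an illustrative threshold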

As manifold learning is insufficient for separating the two curves, an algorithm for separating curves was constructed, which is described in section 3.2.5 (a usage sketch is given below). Studying how trajectories move between these curves, by creating maps of these transitions, could provide an approach for understanding the underlying causes of the multi-valued return maps and thereby the extreme events. This could be tried both on the return map as previously shown, and directly on the Poincaré section before creating return maps. As this approach is outside the scope of this thesis it was not studied further; instead we propose it as an area for future research.
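A usage sketch of that algorithm, assuming the return-map points are stored as an (M × 2) array returnMap and that the class is available as a module named after appendix A.5 (class and method names as spelled in the code):

import numpy as np
from CurveSeparator import CurveSeparetor  # appendix A.5 (module name assumed)

labels = CurveSeparetor(returnMap).SeperateGraphs()  # "l" = lower curve, "u" = upper
lower = returnMap[labels == "l"]
upper = returnMap[labels == "u"]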

There is still some value in the results presented. They clearly show that there exists a pattern of where and how extreme events arise. These results could be used to design a monitoring system that could provide warnings of upcoming extreme events, as the points of origin define an area in the attractor. It is also interesting that the results show that extreme events originate from only a small interval on the right part of the return map, as one could assume this would be the case on the left side as well, due to the symmetry of the return map.

5| Conclusions

In this thesis we have used dimensionality reduction techniques in combination with Poincaré sections to analyse chaotic dynamical systems.

Applying manifold learning on the attractors of chaotic dynamical systems is useful but has some limitations. The techniques require large amounts of data in order to create representative reembeddings, as it is necessary to have trajectories covering the entirety of the attractor. This makes the approach computationally heavy. It is also important that the data are evenly distributed along the trajectory, as the reembedding becomes inconsistent otherwise. The techniques tend to provide a good representation of the topology of the attractors, but struggle with self-folding shapes.

When manifold learning was applied to the points on the Poincaré section for Rössler and Lorenz it tended to generate single-valued return maps, even if the traditional return maps were multi-valued. When this approach was applied to the FitzHugh-Nagumo system it failed to generate a single-valued return map. This indicates that the multi-valued return maps are a result of the local geometry of the points on the Poincaré section rather than their global structure. As this approach did not generate single-valued return maps, it does not provide further insight into extreme events compared to traditional approaches. It was therefore concluded that other approaches are needed in order to predict extreme events, and we have outlined such an approach.

Manifold learning might still be helpful when analysing high-dimensional chaotic systems, as in this case, the techniques may extract lower-dimensional behaviour that could be useful in order to understand the system. These techniques can also be used to reduce high-dimensional Poincaré sections, in order to make the construction of return maps easier. This could help when analysing the FitzHugh-Nagumo system in higher dimensions.

Bibliography

[1] W. Denson. “The history of reliability prediction”. In: IEEE Transactions on reliability 47.3 (1998), SP321–SP328. [2] M. Bordo. “Financial crises, banking crises, stock market crashes, and the money supply: some international evidence, 1980-1933”. In: Financial crises and the world banking system, New York: St. Martin’s (1986). [3] B. I. Jacobs. Capital Ideas and Market Realities: Option Replication, Investor Behavior, and Stock Market Crashes. Wiley-Blackwell, 1999. [4] D. M. Newbery et al. “The role of extreme events in the impacts of selective tropical forestry on erosion during harvesting and recovery phases at Danum Valley, Sabah”. In: Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 354.1391 (1999), pp. 1749–1761. doi: 10.1098/rstb.1999.0518. [5] T. C. Peterson et al. “Explaining Extreme Events of 2012 from a Climate Perspective”. In: Bulletin of the American Meteorological Society 94.9 (2013), S1–S74. doi: 10.1175/BAMS-D-13-00085.1. [6] C. Oppenheimer. “Climatic, environmental and human consequences of the largest known historic erup- tion: Tambora volcano (Indonesia) 1815”. In: Progress in Physical Geography: Earth and Environment 27.2 (2003), pp. 230–259. doi: 10.1191/0309133303pp379ra. [7] D. Sornette et al. “Rank-ordering statistics of extreme events: Application to the distribution of large earth- quakes”. In: Journal of Geophysical Research: Solid Earth 101.B6 (1996), pp. 13883–13893. doi: 10.1029/96JB00177. eprint: https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/96JB00177. url: https://agupubs. onlinelibrary.wiley.com/doi/abs/10.1029/96JB00177 (visited on 04/16/2019). [8] G. Ansmann et al. “Extreme events in excitable systems and mechanisms of their generation”. In: Phys. Rev. E 88 (5 Nov. 2013), p. 052911. doi: 10.1103/PhysRevE.88.052911. url: https://link.aps.org/doi/10.1103/ PhysRevE.88.052911 (visited on 05/15/2019). [9] P. Cvitanovi´cet al. Chaos: Classical and Quantum. Copenhagen: Niels Bohr Inst., 2016. url: http://ChaosBook. org/ (visited on 03/27/2019). [10] C. Grebogi, E. Ott, and J. A. Yorke. “Chaos, Strange Attractors, and Fractal Basin Boundaries in Nonlinear Dynamics”. In: Science 238.4827 (1987), pp. 632–638. issn: 0036-8075. doi: 10.1126/science.238.4827.632. eprint: http : / / science . sciencemag . org / content / 238 / 4827 / 632 . full . pdf. url: http : / / science . sciencemag.org/content/238/4827/632. [11] E. Alpaydin. Introduction to Machine Learning. Adaptive Computation and Machine Learning series. MIT Press, 2014. isbn: 9780262325752. url: https://books.google.se/books?id=7f5bBAAAQBAJ (visited on 04/20/2019). [12] J. Lu et al. “Multi-Manifold Deep Metric Learning for Image Set Classification”. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). June 2015. [13] E. Bollt. “Attractor modeling and empirical nonlinear model reduction of dissipative dynamical systems”. In: International Journal of Bifurcation and Chaos 17.04 (2007), pp. 1199–1219. [14] G. Teschl. Ordinary differential equations and dynamical systems. Tech. rep. 2004. [15] H. Poincaré. “Sur le problème des trois corps et les équations de la dynamique”. In: Acta mathematica 13.1 (1890), A3–A270. [16] R. Devaney. An introduction to chaotic dynamical systems. Westview press, 2008. [17] P. H. Richter and H-J. Scholz. “Chaos in classical mechanics: The double pendulum”. In: Stochastic phenomena and chaotic behaviour in complex systems. Springer, 1984, pp. 86–97.


[18] E. N. Lorenz. “: does the flap of a butterfly’s wing in Brazil set off a tornado in Texas?” In: (1972). [19] G. Boeing. “Visual Analysis of Nonlinear Dynamical Systems: Chaos, , Self-Similarity and the Limits of Prediction”. In: Systems 4.4 (2016). issn: 2079-8954. doi: 10.3390/systems4040037. url: http://www.mdpi. com/2079-8954/4/4/37 (visited on 05/01/2019). [20] G. Boeing. “Visual Analysis of Nonlinear Dynamical Systems: Chaos, Fractals, Self-Similarity and the Limits of Prediction”. In: Systems 4 (Nov. 2016), p. 37. doi: 10.3390/systems4040037. [21] D. Ruelle and F. Takens. “On the nature of turbulence”. In: Les rencontres physiciens-mathématiciens de Strasbourg- RCP25 12 (1971), pp. 1–44. [22] J. Gleick. Chaos: Making a new science. Open Road Media, 2011. [23] E. N. Lorenz. “Deterministic Nonperiodic Flow.” In: Journal of Atmospheric Sciences 20 (Mar. 1963), pp. 130–148. doi: 10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2. [24] P. Grassberger and I. Procaccia. “Characterization of strange attractors”. In: Physical review letters 50.5 (1983), p. 346. [25] O. E. Rössler. “An equation for continuous chaos”. In: Physics Letters A 57 (July 1976), pp. 397–398. doi: 10.1016/0375-9601(76)90101-8. [26] P. Érdi and J. Tóth. Mathematical models of chemical reactions: theory and applications of deterministic and stochastic models. Manchester University Press, 1989. [27] S. Resnick. Extreme values, regular variation and point processes. Springer, 2013. [28] W. Schlei et al. “Enhanced visualization and autonomous extraction of Poincare map topology”. In: The Journal of the Astronautical Sciences 61.2 (2014), pp. 170–197. [29] W. Tucker. “Computing accurate Poincaré maps”. In: Physica D: Nonlinear Phenomena 171.3 (2002), pp. 127–137. [30] K. M. Carroll. “A Review of Return Maps for Rössler and the Complex Lorenz”. In: (2012). [31] S. Mukherjee, S. K. Palit, and DK. Bhattacharya. “Is one dimensional return map sufficient to describe the chaotic dynamics of a three dimensional system?” In: arXiv preprint arXiv:1409.6738 (2014). [32] J. Llibre and A. E. Teruel. “Return Maps”. In: Introduction to the Qualitative Theory of Differential Systems: Planar, Symmetric and Continuous Piecewise Linear Systems. Basel: Springer Basel, 2014, pp. 119–187. isbn: 978-3-0348- 0657-2. doi: 10.1007/978-3-0348-0657- 2_4. url: https://doi.org/10.1007/978-3- 0348-0657- 2_4 (visited on 05/03/2019). [33] R. Fitzhugh. “Impulses and Physiological States in Theoretical Models of Nerve Membrane”. In: Biophysical Journal 1 (July 1961), pp. 445–466. doi: 10.1016/S0006-3495(61)86902-6. [34] F. Christiansen, P. Cvitanovic, and V. Putkaradze. “Spatiotemporal chaos in terms of unstable recurrent pat- terns”. In: Nonlinearity 10.1 (Jan. 1997), pp. 55–70. doi: 10.1088/0951-7715/10/1/004. url: https://doi. org/10.1088%2F0951-7715%2F10%2F1%2F004 (visited on 05/15/2019). [35] J. A. Lee and M. Verleysen. Nonlinear dimensionality reduction. Springer, 2007. [36] J. P. Cunningham and Z. Ghahramani. “Linear dimensionality reduction: Survey, insights, and generaliza- tions”. In: The Journal of Machine Learning Research 16.1 (2015), pp. 2859–2900. [37] L. J. Williams H. Abdi. Computational Statistics. Hoboken, New Jersey: John Wiley & Sons, Inc, 2010. url: https://onlinelibrary.wiley.com/doi/pdf/10.1002/wics.101 (visited on 04/07/2019). [38] C. J. F. Ter Braak. “Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis”. In: Ecology 67.5 (1986), pp. 1167–1179. [39] S. T. Roweis and L. K. Saul. 
“Nonlinear dimensionality reduction by locally linear embedding”. In: Science 290.5500 (2000), pp. 2323–2326. [40] Z. Zhang and H. Zha. “Principal manifolds and nonlinear dimensionality reduction via tangent space alignment”. In: SIAM Journal on Scientific Computing (2004), pp. 313–338. [41] J. B. Tenenbaum, V. de Silva, and J. C. Langford. “A Global Geometric Framework for Nonlinear Dimensionality Reduction”. In: Science 290.5500 (2000), pp. 2319–2323. [42] R. Karbauskaitė, G. Dzemyda, and V. Marcinkevičius. “Dependence of locally linear embedding on the regularization parameter”. In: TOP 18.2 (Dec. 2010), pp. 354–376. issn: 1863-8279. doi: 10.1007/s11750-010-0151-y. url: https://doi.org/10.1007/s11750-010-0151-y (visited on 05/16/2019).


[43] Z. Zhang and J. Wang. “MLLE: Modified Locally Linear Embedding Using Multiple Weights.” In: vol. 19. Jan. 2006, pp. 1593–1600. [44] D. Donoho and C. Grimes. “Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proc. National Academy of Science (PNAS), 100, 5591-5596”. In: Proceedings of the National Academy of Sciences of the United States of America 100 (June 2003), pp. 5591–6. doi: 10.1073/pnas.1031596100. [45] T. Zhang et al. “Linear local tangent space alignment and application to face recognition”. In: Neurocomputing 70.7-9 (2007), pp. 1547–1553. [46] L. Maaten and G. Hinton. “Visualizing data using t-SNE”. In: Journal of machine learning research 9.Nov (2008), pp. 2579–2605. [47] F. R. Helmert. “Die Genauigkeit der Formel von Peters zur Berechnung des wahrscheinlichen Beobachtungs- fehlers director Beobachtungen gleicher Genauigkeit”. In: Astronomische Nachrichten 88 (June 1876), p. 113. doi: 10.1002/asna.18760880802. [48] S. Kullback and R. A. Leibler. “On information and sufficiency”. In: Ann. Math. Statistics 22 (1951), pp. 79–86. issn: 0003-4851. doi: 10.1214/aoms/1177729694. url: https://doi.org/10.1214/aoms/1177729694. [49] M. Wattenberg, F. Viégas, and I. Johnson. “How to use t-SNE effectively”. In: Distill 1.10 (2016), e2. [50] A. Esteva et al. “Dermatologist-level classification of skin cancer with deep neural networks”. In: Nature 542.7639 (2017), p. 115. [51] S. H. Strogatz. Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry and Engineering. Westview Press, 2000.

A| Code

A.1. Equation from abc import ABC, abstractmethod class Function(ABC): """Abstract class for defining equations so that they can be simulated using the Simulator class."""# Is ←- this actually abstract @abstractmethod def derivatives(self,t,state): """Abstract method for defining functions.""" pass

@abstractmethod def inCond(self): """Abstract method for defining initial conditions.""" pass class Lorenz(Function): """The Lorenz system defined such that it can be simulated using the Simulator class."""

def __init__(self,values = [26.0,10.0,8.0/3.0]): """Initialises the system. The values defined here are the ones used in the original paper. This will, ←- when simulated using the simulator class, generate the characteristic butterfly shape.""" self.rho = values[0] self.sigma = values[1] self.beta = values[2]

def derivatives(self,t,state): """The derivatives of the Lorenz system.""" x, y, z = state return self.sigma * (y - x), x * (self.rho - z) - y, x * y - self.beta * z

def inCond(self): """Predefined initial conditions that can be used, [1.0,1.0,1.0].""" return [1.0,1.0,1.0] class Rossler(Function): def __init__(self, values = [0.1,0.1,14.0]): """Initialises the system. The values defined here are the ones used in the original paper describing ←- the system. This will, when simulated using the simulator class, generate the characteristic ←- butterfly shape.""" self.a = values[0] self.b = values[1] self.c = values[2]

def derivatives(self, t, state): """The derivatives of the Rossler system.""" x, y, z = state# unpack the state vector return -y-z, x+self.a*y,self.b+z*(x-self.c)

def inCond(self): """Predefined initial conditions that can be used, [1.0,1.0,1.0].""" return [1.0,1.0,1.0]

A.2. Simulator import numpy as np import math from scipy.integrate import solve_ivp


import Equation from scipy.interpolate import splprep as spl from scipy.interpolate import splev class Simulator: """A class that simulates the trajectory ofa chaotic system. Uses instances of the Equation class as an ←- input.""" def __init__(self,equation,init = None): """Initalises the simulator, requires the input of an Equation from the Equation class. This class ←- requires predefined initial conditions.""" self.equation = equation if init is None: # If new initial conditions are not defined, the predefined initial conditions are used self.state0 = equation.inCond() else: self.state0 = init

def simulate(self, duration = 40, samplingRate = 0.01, interpolate = False, method = 'RK45 '):#TODO add ←- tolerances """Simulates the equation using solve_ivp from the scipy.integrate package, if the method is not defined ←- it defaults to Runge-Kutta 45. The duration is the time that the system is simulated for, if it is ←- not defined it defaults to 40. The samplingRate is the time between samples of the equation, if this ←- is not defined it defaults to 0.01. If interpolate is set to true it calls interpolateCurve and ←- returns points separated bya fixed distance rather thana fixed time.""" # Defines set times for sampling the equation. self.t = np.arange(0.0, duration, samplingRate) # Integrates the equation with the given parameters and method. tmp = solve_ivp(self.equation.derivatives,[0, duration],self.state0,method = method,t_eval=self.t) self.data = np.array(tmp.y).transpose() if interpolate: # Interpolates the data directly self.interpolateCurve() return self.data

def storeData(self, filename = None): """Saves the data from the integration in two files, one with the coordinates of the points and one with ←- the sample times.""" np.savetxt(filename+'Values.txt ',self.data) np.savetxt(filename+'Time.txt ',self .t)

def loadData(self, filename = None): """Loads data from two files, one with the coordinates of the points and one with the sample times.""" self.data = np.loadtxt(filename+'Values.txt ') self.t = np.loadtxt(filename+'Time.txt ')

def getSampleTimes(self): """Returns the time the system was sampled.""" return self.t

def getData(self): """Returns the data of the curve.""" return self.data

def interpolateCurve(self): """Interpolates the curve in order to evenly distribute the points. This uses the data generated by the ←- simulate method, if the system has not been simulated the function simulates it using the standard ←- parameters of the simulate method.""" # Get the data and organise it for the method if self.data is None: self.simulate() values = np.array(self.data) np.swapaxes(values,1,0)

# Createsa new variables # this is the geodesic distance along the trajectory of the curve, each value is the cumuluative some of ←- all previous distances s = [] lastPoint = values[:,0] lastS = 0 for i, point in enumerate(values[0]): point = values[:,i] lastS = math.sqrt(np.dot(np.array(point-lastPoint),np.array(point-lastPoint)))+lastS s.append(lastS) lastPoint = point

# the intention of the method is to replace the non.evenly-spaceds with the evenly-spaced s2 s2 = np.linspace(s[0],s[-1],len(s)) interpedData = []

# Interpolates each dimension in turn as there are limits to the number of dimensions python can ←- interpolate at the same time for dim in range(0,len(values[:,1])): # Preparations for interpolating each dimension


data = np.array(values[dim,:]).reshape(len(s),1).squeeze() dimData = [] s2Index = 0 lastIndex = 0

# Loops through the data, selecting six points ata time for i, value in enumerate(data): ifi%5==0 and i+5 <= len(s): # special case for the end of the data as the number of points might not be divisible by6 if i+5 == len(s): tck, u = spl(np.array(data[i-2:i+4]).reshape(1,6),u = np.array(s[i-2:i+4]).reshape(6,1).←- flatten(),k = 5) newValues = splev(s2[lastIndex:-1],tck) for xi in newValues: for xij in xi: dimData.append(xij) else: # Prepares the interpolation in the general case tck, u = spl(np.array(data[i:i+6]).reshape(1,6),u = np.array(s[i:i+6]).reshape(6,1).←- flatten(),k = 5) # Finds the values of s2 that are within the interval that will be interpolated while s2[s2Index] < s[i+5]: s2Index +=1 # Generates the new values and appends them to the output, if there are no new points in ←- the interval nothing is added if s2Index != lastIndex: newValues = splev(s2[lastIndex:s2Index],tck) for xi in newValues: for xij in xi: dimData.append(xij) lastIndex = s2Index interpedData.append(dimData) interpedData = np.array(interpedData) return interpedData.transpose()

A.3. PoincareMapper import numpy as np from math import sin import math from scipy.integrate import odeint from scipy import signal from scipy.interpolate import splprep as spl from scipy.interpolate import splev class PoincareMapper: """Calculates the points alonga given trajectory that intersecta Poincare section, defined bya hyperplane ←- passing through the origin, ina given direction.""" def __init__(self,plane,data, direction = 1): """The required inputs are the data set and the normal of the plane. The direction specifies if ←- intersections are counted if the path of the trajectory is the same as the normal(positive value) ←- or opposite the normal(negative value).""" self.data = data # The normal of the plane, goes through the origin. self.plane = plane/np.linalg.norm(plane) vec = self.plane[np.newaxis, :] self.proMatrix = np.identity(len(plane))-np.matmul(vec.T,vec)#TODO, is this necessary and if not, ←- remove the planePoints method # Specify direction of intersection, positive or negative self.direction = direction self .map()

def crossing(self,x1,x2): """Returns True if two points are on different sides of the Poincare section.""" return np.dot(x1,self.plane)*np.dot(x2,self.plane)<0

def disPlan(self,x): """Calculates the distance betweena point and the plane.""" return abs(np.dot(x,self.plane))

def map(self): """Computes the points on the Poincare section.""" self.values = [] # Lists the indexes preceding the intersections self.intersectindx = [] fori in range(2,len(self.data)-4): x1 = self.data[i] x2 = self.data[i+1]


# If two consecutive points are on different sides of the Poincare section and the direction is as ←- specifieda closer crossing is calculated if(self.crossing(x1,x2) and (np.dot(self.plane,x2)*self.direction > 0)): # The six points surrounding the intersection are chosen and input in the interpolate method nearestPoint = self.interpolate(self.data[i-3:i+3]) self.intersectindx.append(i) self.values.append(nearestPoint) return np.asarray(self.values)

def interpolate(self, crossingPoints): """Interpolates six points in order to finda point closer to the intersection.""" points = crossingPoints.transpose() # Interpolating the curve, finding 1000 points interpedCurve = [] fori in range(0,np.array(points).shape[0]): s = np.linspace(points[i,0],points[i,5],6) tck,u = spl([s,points[i,:]],k=5) curve = splev(np.linspace(0,1,1000),tck) curve = np.array(curve).transpose() interpedCurve.append(curve[:,1].transpose()) interpedCurve = np.array(interpedCurve) min = self.disPlan(interpedCurve[:,0]) minPoint = 0 fori in range(0,len(interpedCurve)): point = interpedCurve[:,i] tmp = self.disPlan(point) if(min>tmp): min = tmp minPoint = point return minPoint

def getValues(self): """Returns the values of the points on the Poincare section.""" return self.values.copy()

def getIntersctIndx(self): """Returns the indexes of the points preceding intersections.""" return self.intersectindx.copy()

def getPlaneNorm(self): """Returns the normal of the plane.""" return self.plane.copy()

A.4. FitzSimulator from scipy.integrate import odeint import matplotlib.pyplot as plt import numpy as np import math from scipy.interpolate import splprep as spl from scipy.interpolate import splev class FitzSimulator: """Simulatesa FitzHugh-Nagumo system withn coupled oscillators."""

def __init__(self,inCond = None, n=2,a = -0.025794,c=0.02,b = None,A = None,k=0.128): """Takes user defined input for the FitzHugh-Nagumo system. If no user input is given the values default to two oscillators and the values used in the paper by ←- Ansmann. And all oscillators are then coupled with each other and not with themselves. The output for each point isa vector where the firstn values are thex-values for each oscillator and ←- the followingn values are they-values.""" self .n = n self .k = k if(isinstance(a,float)): self.a = a*np.ones(n) else: self .a = a, if(isinstance(c,float)): self.c = c*np.ones(n) else: self .c = c if(A is None): self.A = np.ones((n,n))-np.identity(n) if(b is None): # Different values forb are required ifn=2 and if it is greater than2 in order for there to be ←- chaotic behaviour if n == 2: self.b = np.array([0.0065, 0.0135])


else: self.b = np.linspace(0.006,0.014,n) else: self .b = b; if inCond is None: self.inCond = 0.01001*np.ones(2*self.n) else: self.inCond = inCond

def simulate(self, duration = 1000, samplingRate = 0.1, rtol = 1.49012e-8, atol = 1.49012e-8): """Simulates the FitzHugh-Nagumo system using the odeint function from the scipy.integrate package. The ←- duration is the time that the system is simulated for, if it is not defined it defaults to 1000. The ←- samplingRate is the time between samples of the equation, if this is not defined it defaults to ←- 0.01.""" self.t = np.arange(0,duration, samplingRate) self.data = odeint(self.derivatives,self.inCond,self.t, rtol = rtol, atol = atol) return self.data

def derivatives(self,state,t): """Defines the derivatives of FitzHugh-Nagumo system and returns the value of them at every point.""" x = state[0:self.n] y = state[self.n:2*self.n] xdot = np.multiply(x,np.multiply(self.a-x,x-np.ones(self.n)))-y+self.k*(np.matmul(self.A,x)-(self.n-1)*x←- ) y = np.multiply(self.b,x)-np.multiply(self.c,y) state[0:self.n] = xdot state[self.n:2*self.n] = y return state

def interpolateCurve(self): """Interpolates the curve in order to evenly distribute the points. This uses the data generated by the ←- simulate method, if the system has not been simulated the function simulates it using the standard ←- parameters of the simulate method.""" # Get the data and organise it for the method if self.data is None: self.simulate() values = np.array(self.data)

# Createsa new variables that is the geodesic distance along the trajectory of the curve, each value ←- is the cumuluative some of all previous distances s = [] lastPoint = values[0,:] lastS = 0 for i, point in enumerate(values[:,0]): point = values[i,:] lastS = math.sqrt(np.dot(np.array(point-lastPoint),np.array(point-lastPoint)))+lastS s.append(lastS) lastPoint = point

# the intention of the method is to replace the non.evenly-spaceds with the evenly-spaced s2 s2 = np.linspace(s[0],s[-1],len(s)) interpedData = []

# Interpolates each dimension in turn as there are limits to the number of dimensions python can ←- interpolate at the same time for dim in range(0,len(values[1,:])): # Preparations for interpolating each dimension data = np.array(values[:,dim]).reshape(len(s),1).squeeze() dimData = [] s2Index = 0 lastIndex = 0 # Loops through the data, selecting six points ata time for i, value in enumerate(data): ifi%5==0 and i+5 <= len(s): # special case for the end of the data as the number of points might not be divisible by6 if i+5 == len(s): tck, u = spl(np.array(data[i-2:i+4]).reshape(1,6),u = np.array(s[i-2:i+4]).reshape(6,1).←- flatten(),k = 5) newValues = splev(s2[lastIndex:-1],tck) for xi in newValues: for xij in xi: dimData.append(xij) else: # Prepares the interpolation in the general case tck, u = spl(np.array(data[i:i+6]).reshape(1,6),u = np.array(s[i:i+6]).reshape(6,1).←- flatten(),k = 5) # Finds the values of s2 that are within the interval that will be interpolated while s2[s2Index] < s[i+5]: s2Index +=1 # Generates the new values and appends them to the output, if there are no new points in ←- the interval nothing is added if s2Index != lastIndex: newValues = splev(s2[lastIndex:s2Index],tck)


for xi in newValues: for xij in xi: dimData.append(xij) lastIndex = s2Index interpedData.append(dimData) interpedData = np.array(interpedData) return interpedData.transpose()

A.5. CurveSeparator import matplotlib.pyplot as plt import numpy as np from mpl_toolkits.axes_grid1.inset_locator import zoomed_inset_axes class CurveSeparetor: ''' Saves the input data and the other parameters if defined ''' def __init__(self, data): self.data = data self.tol_merged = 0.00018# Tolerance for how close the graphs need to be to be inseparable self.tol_middlePoint = 0.1# Tolerance for how large intervals we look for the max value of the lower ←- graph self.tol_int = 0.0005# Tolerance for how large intervals we look for points when constructing the ←- middle sepectory self.nbr_int = 60# Number of sections that the graphs are devided into

def ConstructGraph(self, sepRight, sepLeft, incRight, incLeft): ''' Return an array were the element on the index of the point is1 if it belongs to the lower curve and ←- 0 otherwise ''' lowerCurve = np.zeros(len(self.data[:,1])) # Right branch fori in range(0,len(sepRight)-1): # Add all points on the right branch in the merged section to the lower curve indexPoints = np.where((self.data[:,1] <= sepRight[i,1]) & (self.data[:,1] >= sepRight[i+1,1]) & (←- self.data[:,0] > sepRight[i,0]-0.005) & (self.data[:,0] > sepRight[0,0])) if self.isMergedRight(incRight, sepRight, i): lowerCurve[indexPoints] = 1 # Add all points below the right separator to the lower curve else: for point in indexPoints[0]: if self.data[point,1] < sepRight[i,1] + incRight[i] * (self.data[point,0] - sepRight[i,0]): lowerCurve[point] = 1 # Left branch fori in range(0,len(sepLeft)-1): # Add all points on the left branchin the merged section to the lower curve indexPoints = np.where((self.data[:,1] <= sepLeft[i,1]) & (self.data[:,1] >= sepLeft[i+1,1]) & (self←- .data[:,0] < sepLeft[i,0]+0.005) & (self.data[:,0] < sepLeft[0,0])) if self.isMergedLeft(incLeft, sepLeft, i): lowerCurve[indexPoints] = 1 # Add all points below the left separator to the lower curve else: for point in indexPoints[0]: if self.data[point,1] < sepLeft[i,1] + incLeft[i] * (self.data[point,0] - sepLeft[i,0]): lowerCurve[point] = 1 below = np.where((self.data[:,1] < sepLeft[i,1]) & (self.data[:,1] < sepLeft[i,1])) lowerCurve[below] = 1 return lowerCurve

def distFromSep(self, inclination, separator, point, i): ''' Calculate distance from point to the closest point on the separator ''' x = (point[1] + point[0]/inclination[i] - separator[i,1] + inclination[i]*separator[i,0])/(inclination[i←- ] + 1/inclination[i]) y = separator[i,1] + inclination[i] * (x - separator[i,0]) return np.sqrt((x - point[0])**2 + (y - point[1])**2)

def pointsAboveSeparator(self, inclination, separator,i): ''' Returns the inidices og all points above the separator ''' return self.data[:,1] >= inclination[i] * (self.data[:,0] - separator[i,0]) + separator[i,1]

def pointsBelowSeparator(self, inclination, separator, i): ''' Returns the indices of all points below the separator ''' return self.data[:,1] <= inclination[i] * (self.data[:,0] - separator[i,0]) + separator[i,1]

def isMergedRight(self, inclination, separator, i): ''' Returns true if the points mean of the points on both side of the right separator segment is closer ←- to the separator than tol_merged ''' top = np.where((self.data[:,0] > separator[i-1,0]) & (self.data[:,1] < separator[i-1,1]) & (self.data←- [:,1] > separator[i,1]) & self.pointsAboveSeparator(inclination, separator, i))


bottom = np.where((self.data[:,0] > separator[i-1,0]) & (self.data[:,1] < separator[i-1,1]) & (self.data←- [:,1] > separator[i,1]) & self.pointsBelowSeparator(inclination, separator, i))

if np.size(top) < 1 or np.size(bottom) < 1: if np.size(top) > 1: bottom = top elif np.size(bottom) > 1: top = bottom else: return False

topMean = [np.mean(self.data[top,0]), np.mean(self.data[top,1])] bottomMean = [np.mean(self.data[bottom,0]), np.mean(self.data[bottom,1])]

return self.distFromSep(inclination, separator, topMean, i-1) < self.tol_merged and self.distFromSep(←- inclination, separator, bottomMean, i-1) < self.tol_merged

def isMergedLeft(self, inclination, separator, i): ''' Returns true if the points mean of the points on both side of the right separator segment is closer ←- to the separator than tol_merged '''

top = np.where((self.data[:,0] < separator[i-1,0]) & (self.data[:,1] < separator[i-1,1]) & (self.data←- [:,1] > separator[i,1]) & self.pointsAboveSeparator(inclination, separator, i)) bottom = np.where((self.data[:,0] < separator[i-1,0]) & (self.data[:,1] < separator[i-1,1]) & (self.data←- [:,1] > separator[i,1]) & self.pointsBelowSeparator(inclination, separator, i))

if np.size(top) < 1 or np.size(bottom) < 1: if np.size(top) > 1: bottom = top elif np.size(bottom) > 1: top = bottom else: return False

topMean = [np.mean(self.data[top,0]), np.mean(self.data[top,1])] bottomMean = [np.mean(self.data[bottom,0]), np.mean(self.data[bottom,1])]

return self.distFromSep(inclination, separator, topMean, i-1) < self.tol_merged and self.distFromSep(←- inclination, separator, bottomMean, i-1) < self.tol_merged

def findTopOfLowerCurve(self, sections): ''' Finds one of the points on the top of the lower curve ''' fori in range(1,len(sections)): sections = sections[::-1] # Find the points furthest to the right an left for each section indexMax = int(np.argmax(np.where((self.data[:,1] > sections[i]) & (self.data[:,1] < sections[i-1]), ←- self.data[:,0], -np.Inf))) indexMin = int(np.argmin(np.where((self.data[:,1] > sections[i]) & (self.data[:,1] < sections[i-1]), ←- self.data[:,0], np.Inf))) right = self.data[indexMax,0] left = self.data[indexMin,0] dist = (right - left) /2 # Checks if there is any points between the upper curve and choses the maximum of them middlePoints = np.argmax(np.where((self.data[:,0] left + dist - self.tol_middlePoint * dist) & (self.data[:,1] > sections[i]) & (←- self.data[:,1] < sections[i-1]), 1, 0)) # Returns the middle points with largesty-value, if there exist any if middlePoints > 0: #plt.scatter(self.data[middlePoints,0], self.data[middlePoints,1],c= 'r ') return middlePoints, i def constructSeparators(self): ''' Constructs one right and one left separator to separate the lower and upp curve from each other '''

indexMaxTopGraph = np.argmax(self.data[:,1])

# Constructa uniformly distributed interval from the highest to the lowest point sections = np.linspace(np.min(self.data[:,1]), np.max(self.data[:,1]), self.nbr_int)[::-1]

[startIndexLowerGraph, startSection] = self.findTopOfLowerCurve(sections)

countRight = 1 countLeft = 1

# Separators for the right and left branch sepRight = np.zeros((self.nbr_int-startSection+1,2)) sepLeft = np.zeros((self.nbr_int-startSection+1,2))

# The inclination in each section incRight = np.zeros(self.nbr_int-startSection) incLeft = np.zeros(self.nbr_int-startSection)

# Seta value between the max values of the two graphs as start values


startPointX = self.data[startIndexLowerGraph,0] + (self.data[indexMaxTopGraph,0] - self.data[←- startIndexLowerGraph,0]) / 20 startPointY = self.data[startIndexLowerGraph,1] + (self.data[indexMaxTopGraph,1] - self.data[←- startIndexLowerGraph,1]) / 20 xPrevRight = startPointX xPrevLeft = startPointX sepRight[0,:] = [startPointX, startPointY] sepLeft[0,:] = [startPointX, startPointY]

fori in range(startSection, self.nbr_int-1): # Finda all points right = np.where((self.data[:,0]>xPrevRight) & (self.data[:,1] > sections[i] - self.tol_int) & (self←- .data[:,1] < sections[i] + self.tol_int)) left = np.where((self.data[:,0] sections[i] - self.tol_int) & (self.←- data[:,1] < sections[i] + self.tol_int))

if np.size(right) > 0: right = right[0] indexTopRight = np.argmax(self.data[right,0]) indexBottomRight = np.argmin(self.data[right,0]) if indexTopRight != indexBottomRight: sepRight[countRight,0] = (self.data[right[indexTopRight],0] +self.data[right[←- indexBottomRight],0])/2 sepRight[countRight,1] = (self.data[right[indexTopRight],1] + self.data[right[←- indexBottomRight],1])/2

incRight[countRight-1] = (sepRight[countRight,1] - sepRight[countRight -1,1]) / (sepRight[←- countRight ,0] - sepRight[countRight -1,0]) xPrevRight = self.data[right[indexBottomRight],0] countRight = countRight + 1

if np.size(left) > 0:

left = left[0] indexTopLeft = np.argmax(self.data[left,0]) indexBottomLeft = np.argmin(self.data[left,0]) if indexTopRight != indexBottomRight: sepLeft[countLeft,0] = (self.data[left[indexTopLeft],0] + self.data[left[indexBottomLeft←- ] ,0]) /2 sepLeft[countLeft,1] = (self.data[left[indexTopLeft],1] + self.data[left[indexBottomLeft←- ] ,1]) /2

incLeft[countLeft-1] = (sepLeft[countLeft,1] - sepLeft[countLeft-1,1]) / (sepLeft[countLeft ←- ,0] - sepLeft[countLeft-1,0]) xPrevLeft = self.data[left[indexTopLeft],0] countLeft = countLeft + 1

sepRight[countRight] = self.data[np.argmax(self.data[:,0])] sepLeft[countLeft] = self.data[np.argmin(self.data[:,0])] incRight[countRight-1] = (sepRight[countRight,1] - sepRight[countRight -1,1]) / (sepRight[countRight ,0]←- - sepRight[countRight -1,0]) incLeft[countLeft-1] = (sepLeft[countLeft,1] - sepLeft[countLeft-1,1]) / (sepLeft[countLeft,0] - sepLeft←- [countLeft -1,0])

return sepRight[:countRight+1], sepLeft[:countLeft+1], incRight[:countRight+1], incLeft[:countLeft+1]

def SeperateGraphs(self): ''' Separets the different curves, returns an array where the symbol on the index ofa point is"l" if ←- the point belongs to the lower curve and"u" if it belongs to the upper curve ''' [sepRight, sepLeft, incRight, incLeft] = self.constructSeparators()

#1 if data pointj belongs to graphi, zero otherwise.i in nbr graphs,j in nbr datapoints graphMatrix = np.zeros((2,len(self.data)))

# Construct lower graph lowerGraph = self.ConstructGraph(sepRight, sepLeft, incRight, incLeft) symbols = np.where(lowerGraph>0,"l","u")

return symbols
