Comparison of Systems Using Diffusion Maps

Proceedings of the 44th IEEE Conference on Decision and Control, and ThC07.2 the European Control Conference 2005 Seville, Spain, December 12-15, 2005 Comparison of Systems using Diffusion Maps Umesh Vaidya, Gregory Hagen, Stephane´ Lafon, Andrzej Banaszuk, Igor Mezic,´ Ronald R. Coifman Abstract— In this paper we propose an efficient method of data. The idea in [13] is to use the eigenvectors of Frobenius- comparing data sets obtained from either an experiment or Perron operator to analyze the spectral characteristics of the simulation of dynamical system model for the purpose of model data. These approach captures both the geometry and the validation. The proposed approach is based on comparing the intrinsic geometry and the associated dynamics linked with dynamics linked with the data set. In [14] [13] the method the data sets, and requires no a priori knowledge of the was applied to the comparison of a noise-driven combustion qualitative behavior or the dimension of the phase space. The process with experimental data [8]. The time-series data approach to data analysis is based on constructing a diffusion from both the model and experiment was obtained from a map defined on the graph of the data set as established in single pressure measurement. The system observable was a the work of Coifman, Lafon, et al.[5], [10]. Low dimensional embedding is done via a singular value decomposition of the collection of indicator functions covering the phase space. approximate diffusion map. We propose some simple metrics The essential ingredients of the metric employed in [14] constructed from the eigenvectors of the diffusion map that were a component that captured the spatial, or geometric, describe the geometric and spectral properties of the data. properties of the system, and a component that captured the The approach is illustrated by comparison of candidate models temporal, or spectral, properties of the system. to data of a combustion experiment that shows limit-cycling acoustic oscillations. The theoretical basis that we apply in this paper is established in [5], [10]. In these works, a framework for structural I. INTRODUCTION multiscale geometry, organization of graphs, and subsets of RN is provided. They have shown that by appropriately This paper is concerned with the comparison of dynamical selecting the eigenfunctions of the diffusion maps, which de- systems data for the purpose of model validation. In many scribe local transitions, can lead to macroscopic descriptions cases, the minimum number of independent variables that at different scales. We employ the algorithm developed in are required to describe the approximate behavior of the [10] to obtain the intrinsic geometry of the data set sampled underlying dynamical system is often much smaller than from the dynamical system. the dimensionality of the data itself. This is indeed true This paper is organized as follows. In section II, we in many cases of data obtained from fluid flows (see e.g. present the general background of diffusion maps defined on [6]) and dimensionality reduction in dynamical systems by sets of data, as established in [10]. In section III we describe proper orthogonal decomposition [7] has been extensively how the eigenfunctions of the diffusion map are used in studied. Other methods of dimensionality reduction of data constructing metrics that capture the spatial and temporal include principal component analysis, or PCA, (see e.g. [9]), properties of the data sets. In section IV we apply the metrics kernel PCA [17], PCA with multidimensional scaling [18], to compare candidate model data with experimental data. Laplacian eigenmaps [3], and locally linear embedding [16]. These examples illustrate the necessity of including both In this work we present an efficient means of comparing spatial and temporal components of the system observable dynamical systems data for the purpose of model validation. in accurately comparing data. This work on comparing dynamical system data is inspired by [14]. In this work, a formalism for comparing asymptotic II. DIFFUSION MAPS AND DIFFUSION METRIC dynamics resulting from different dynamical system models In this section we describe in brief the construction of is provided. The formalism is based on the spectral properties the diffusion map defined on the data set. This material in of the Koopman operator. A similar approach based on this section is from [5],[10]. Given a set of data points Γ Frobenius-Perron operator (adjoint to Koopman operator construct a weighted function k(x, y) for x, y ∈ Γ. k(x, y) [12]) formulism is developed in [13]. In [14], the idea is to is also called as kernel and satisfies the following properties: construct harmonic averages (in addition to time averages) • k is symmetric: k(x, y)=k(y, x) and use these to analyze the spectral characteristics of the • k is positivity preserving: for all x and y in data set X, k(x, y) ≥ 0 U. Vaidya, G. Hagen, and A. Banaszuk are with United Technologies Research Center, 411 Silver Lane, East Hartford, CT. 06108 • k is positive semi-definite: for all real-valued bounded [email protected], [email protected], functions f defined on X. [email protected] S. Lafon, research associate, and R.R. Coifman, Professor, are with the Department of Mathematics and Computer Science, Yale University, 51 k(x, y)f(x)f(y)dµ(x)dµ(y) ≥ 0 Prospect St., New Haven, CT 06520. [email protected] X X I. Mezic,´ Professor, Department of Mechanical Engineering, µ X University of California Santa Barbara, Santa Barbara CA 93106 where is the probability measure on . The kernel [email protected] k(x, y) measures the local connectivity of the data points and 0-7803-9568-9/05/$20.00 ©2005 IEEE 7931 hence captures the local geometry of the data points. Several III. COMPARISON OF TWO DATA SETS choices for the kernel k are possible all leading to different analyses of data. The idea behind the diffusion map is to In this section we explain how the theory from the previous construct the global geometry of the data set from the local section can be used to compare the intrinsic geometry of two X = {x ,x , ..., x } Y = information contained in the kernel k(x, y). The construction data sets. Let us denote by 1 2 N and {y ,y , ..., y } of the diffusion map involves the following steps. First we 1 2 M the time series obtained from the experiment normalize the kernel k(x, y) in the graph Laplacian fashion and the model simulation respectively. Using time-delayed Rn n [4] . For all x ∈ Γ coordinates we embed the time series data in , where is sufficiently large. Now we have N − n and M − n data let v2(x)= k(x, y)dµ(y) points from experimental time series and model simulation Γ time series, respectively, denoted by k(x, y) seta ˜(x, y)= ¯ v2(x) X := {x¯1, x¯2, ..., x¯N−n} ¯ Y := {y¯1, y¯2, ..., y¯ − }, (1) and a˜ satisfies a˜(x, y)dµ(y)=1.Toa˜ we can associate a M n random walk operator on the data set Γ as where x¯k =(xk,xk+1, ..., xk+n−1) and y¯k = (y ,y , .., y ) Af˜ (x)= a˜(x, y)f(y)dµ(y). k k+1 k+n−1 . We denote the union of these two data sets by Z = {X,¯ Y¯ }. In this paper we use the Since we are interested in the spectral properties of the following Gaussian kernel, operator it is preferable to work with a symmetric conjugate z − z 2 of A˜. We conjugate a˜ by v in order to obtain a symmetric k(z ,z )=exp − k j , k j (2) form and we consider k(x, y) a(x, y)= properly normalized in such a way that accounts for the den- v(x)v(y) sity of the data points [5]. In particular, we are able to handle and operator the case of model and experimental data points with the same geometric structures but different sampling densities. The Af(x)= a(x, y)f(y)dµ(y). parameter specifies the size of the neighborhoods defining the local geometry of the data. The smaller the parameter The operator A is referred to as diffusion operator. Under the faster the exponential decreases and hence the weight very general hypotheses the operator A is compact and self- function in (2) becomes numerically insignificant as we move adjoint so by spectral theory we have away from the center. It is easy to check that the Gaussian a(x, y)= λ ϕ (x)ϕ (y),Aϕ(x)=λ ϕ (x). kernel satisfies all the properties of the kernel specified in j j j j j j the previous section. j≥0 From this kernel we construct the diffusion operator or the Let am(x, y) be the kernel of Am, then at the level of data diffusion matrix using the procedure outlined in the previous am(x, y) points the kernel has a probabilistic interpretation section. Let {ϕ1,ϕ2, ..., ϕN+M−2n} be the eigenvectors of a y as a Markov chain with transition matrix to reach from the diffusion matrix and {λ1,λ2, ..., λN+M−2n} be the corre- x in m steps. The mapping sponding eigenvalues. Retaining only the first p eigenvectors we can embed the data set Z in a p-dimensional Euclidean Φ(x)=(ϕ0(x),ϕ1(x), ..., ϕp(x), ...) diffusion space, where {ϕ1, ..., ϕp} are the coordinates of (where ϕi are the eigenfunctions of diffusion operator A) the data points in the Euclidean space. Note that typically 2 maps the data set x ∈ Γ into the Euclidean space ( (N)), p<<nand hence we obtain the dimensionality reduction which we will call the diffusion space.

Comparison of Systems Using Diffusion Maps

DIFFUSION MAPS 43 5.1 Clustering

Visualization and Analysis of Single-Cell Data Using Diffusion Maps

10 Diffusion Maps

Diffusion Maps and Transfer Subspace Learning

Diffusion Maps and Transfer Subspace Learning

Arxiv:1506.07840V1 [Stat.ML] 25 Jun 2015 Are Nonlinear, Which Is Essential As Real-World Data Typically Does Not Lie on a Hyperplane

Variational Diffusion Autoencoders with Random

Diffusion Maps Meet Nystr\" Om

Diffusion Maps

An Introduction to Diffusion Maps

Diffusion Maps, Reduction Coordinates and Low Dimensional Representation of Stochastic Systems

Multivariate Time-Series Analysis and Diffusion Maps