
Local Correlation Tracking in Time Series

Spiros Papadimitriou§   Jimeng Sun‡   Philip S. Yu§
§ IBM T.J. Watson Research Center, Hawthorne, NY, USA
‡ Carnegie Mellon University, Pittsburgh, PA, USA

Abstract

We address the problem of capturing and tracking local correlations among time-evolving time series. Our approach is based on comparing the local autocovariance matrices (via their spectral decompositions) of each series and generalizes the notion of linear cross-correlation. In this way, it is possible to concisely capture a wide variety of local patterns or trends. Our method produces a general similarity score, which evolves over time and accurately reflects the changing relationships. Finally, it can also be estimated incrementally, in a streaming setting. We demonstrate its usefulness, robustness and efficiency on a wide range of real datasets.

1 Introduction

The notion of correlation (or, similarity) is important, since it allows us to discover groups of objects with similar behavior and, consequently, discover potential anomalies which may be revealed by a change in correlation. In this paper we consider correlation among time series, which often exhibit two important properties.

First, their characteristics may change over time. In fact, this is a key property of semi-infinite streams, where data arrive continuously. The term time-evolving is often used in this context to imply the presence of non-stationarity. In this case, a single, static correlation score for the entire time series is less useful. Instead, it is desirable to have a notion of correlation that also evolves with time and tracks the changing relationships. On the other hand, a time-evolving correlation score should not be overly sensitive to transients; if the score changes wildly, then its usefulness is limited.

The second property is that many time series exhibit strong but fairly complex, non-linear correlations. Traditional measures, such as the widely used cross-correlation coefficient (or, Pearson coefficient), are less effective in capturing these complex relationships. From a general point of view, the estimation of a correlation score relies on an assumed joint model of the two sequences. For example, the cross-correlation coefficient assumes that pairs of values from each series follow a simple linear relationship. Consequently, we seek a concise but powerful model that can capture various trend or pattern types. Data with such features arise in several application domains, such as:

• Monitoring of network traffic flows or of system performance metrics (e.g., CPU and memory utilization, I/O throughput, etc.), where changing workload characteristics may introduce non-stationarity.

• Financial applications, where prices may exhibit linear or seasonal trends, as well as time-varying volatility.

• Medical applications, such as EEGs (electroencephalograms) [4].

Figure 1 shows the exchange rates for the French Franc (blue) and the Spanish Peseta (red) versus the US Dollar, over a period of about 10 years. An approximate timeline of major events in the European Monetary Union (EMU) is also included, which may help explain the behavior of each currency. The global cross-correlation coefficient of the two series is 0.30, which is statistically significant (exceeding the 95% confidence interval of ±0.04). The next local extremum of the cross-correlation function is 0.34, at a lag of 323 working days, meaning that the overall behavior of the Franc is similar to that of the Peseta 15 months earlier, when compared over the entire decade of daily data.

[Figure 1. Illustration of tracking time-evolving local correlations (see also Figure 6): Franc/Peseta exchange rates and the LoCo score over time, annotated with a timeline of major EMU events (Delors report requested and published, Peseta joins ERM, EMU Stage 1, Maastricht treaty, Peseta devaluations and Franc under siege, "Single Market" begins, Bundesbank buys Francs, EMU Stage 2).]

This information makes sense and is useful in its own right. However, it is not particularly enlightening about the relationship of the two currencies as they evolve over time. Similar techniques can be employed to characterize correlations or similarities over a period of, say, a few years. But what if we wish to track the evolving relationships over shorter periods, say a few weeks? The bottom part of Figure 1 shows our local correlation score, computed over a window of four weeks (or 20 values). It is worth noting that most major EMU events are closely accompanied by a correlation drop, and vice versa. Also, events related to anticipated regulatory changes are typically preceded, but not followed, by correlation breaks. Overall, our correlation score smoothly tracks the evolving correlations among the two currencies (cf. Figure 6).

To summarize, our goal is to define a powerful and concise model that can capture complex correlations between time series. Furthermore, the model should allow tracking the time-evolving nature of these correlations in a robust way, which is not susceptible to transients. In other words, the score should accurately reflect the time-varying relationships among the series.

Contributions. Our main contributions are the following:

• We introduce LoCo (LOcal COrrelation), a time-evolving, local similarity score for time series, by generalizing the notion of the cross-correlation coefficient.

• The model upon which our score is based can capture fairly complex relationships and track their evolution. The linear cross-correlation coefficient is included as a special case.

• Our approach is also amenable to robust streaming estimation.

We illustrate our proposed method on real data, discussing its qualitative interpretation, comparing it against natural alternatives and demonstrating its robustness and efficiency.

The rest of the paper is organized as follows: In Section 2 we briefly describe some of the necessary background and notation. In Section 3 we define some basic notions. Section 4 describes our proposed approach and Section 5 presents our experimental evaluation on real data. Finally, in Section 6 we describe some of the related work and Section 7 concludes.

2 Background

In the following, we use lowercase bold letters for column vectors (u, v, ...) and uppercase bold for matrices (U, V, ...). The inner product of two vectors is denoted by xᵀy and the outer product by x ⊗ y ≡ xyᵀ. The Euclidean norm of x is ‖x‖. We denote a time series process as an indexed collection X of random variables X_t, t ∈ ℕ, i.e., X = {X_1, X_2, ..., X_t, ...} ≡ {X_t}_{t∈ℕ}. Without loss of generality, we will assume zero-mean time series, i.e., E[X_t] = 0 for all t ∈ ℕ. The values of a particular realization of X are denoted by lowercase letters, x_t ∈ ℝ, at time t ∈ ℕ.

Covariance and autocovariance. The covariance of two random variables X, Y is defined as Cov[X, Y] := E[(X − E[X])(Y − E[Y])]. If X_1, X_2, ..., X_m is a group of m random variables, their covariance matrix C ∈ ℝ^{m×m} is a symmetric matrix defined by c_{ij} := Cov[X_i, X_j], for 1 ≤ i, j ≤ m. If x_1, x_2, ..., x_n is a collection of n observations x_i ≡ [x_{i,1}, x_{i,2}, ..., x_{i,m}]ᵀ of all m variables, the sample covariance estimate¹ is defined as

    Ĉ := (1/n) ∑_{i=1}^{n} x_i ⊗ x_i.

¹ The unbiased estimator uses n − 1 instead of n, but this constant factor does not affect the eigen-decomposition.

In the context of a time series process {X_t}_{t∈ℕ}, we are interested in the relationship between values at different times. To that end, the autocovariance is defined as γ_{t,t′} := Cov[X_t, X_{t′}] = E[X_t X_{t′}], where the last equality follows from the zero-mean assumption. By definition, γ_{t,t′} = γ_{t′,t}.

Spectral decomposition. Any real symmetric matrix is equivalent to a diagonal matrix, in the following sense.

Theorem 1. If A ∈ ℝ^{n×n} is a symmetric, real matrix, then it is always possible to find a column-orthonormal matrix U ∈ ℝ^{n×n} and a diagonal matrix Λ ∈ ℝ^{n×n} such that A = UΛUᵀ.

Thus, given any vector x, we can write Uᵀ(Ax) = Λ(Uᵀx), where pre-multiplication by Uᵀ amounts to a change of coordinates. Intuitively, if we use the coordinate system defined by U, then Ax can be calculated by simply scaling each coordinate independently of all the rest (i.e., multiplying by the diagonal matrix Λ).

Given any symmetric matrix A ∈ ℝ^{n×n}, we will denote its eigenvectors by u_i(A) and the corresponding eigenvalues by λ_i(A), in order of decreasing magnitude, where 1 ≤ i ≤ n. The matrix U_k(A) has the first k eigenvectors as its columns, where 1 ≤ k ≤ n.

The covariance matrix C of m variables is symmetric by definition. Its spectral decomposition provides the directions in ℝ^m that "explain" most of the variance: if we project [X_1, X_2, ..., X_m]ᵀ onto the subspace spanned by U_k(C), we retain the largest fraction of variance among any k-dimensional subspace [11]. Finally, the autocovariance matrix of a finite-length time series is also symmetric, and its eigenvectors typically capture both the key oscillatory (e.g., sinusoidal) as well as aperiodic (e.g., increasing or decreasing) trends that are present [6, 7].
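As a quick numerical check of Theorem 1, consider the following minimal sketch (in Python/NumPy; the prototype of Section 5 was written in Matlab, so this is purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5))
    A = (B + B.T) / 2                 # a symmetric, real matrix

    # Spectral decomposition A = U diag(lam) U^T; eigh handles symmetric input.
    lam, U = np.linalg.eigh(A)
    order = np.argsort(-np.abs(lam))  # decreasing eigenvalue magnitude, as in the text
    lam, U = lam[order], U[:, order]

    # Change of coordinates: U^T (A x) equals the componentwise scaling lam * (U^T x).
    x = rng.standard_normal(5)
    assert np.allclose(U.T @ (A @ x), lam * (U.T @ x))

    Uk = U[:, :2]                     # U_k(A): the k largest eigenvectors as columns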

Table 1. Description of main symbols.

    Symbol            Description
    U, V              Matrix (uppercase bold).
    u, v              Column vector (lowercase bold).
    x_t               Time series, t ∈ ℕ.
    w                 Window size.
    x_{t,w}           Window starting at t, x_{t,w} ∈ ℝ^w.
    m                 Number of windows (typically, m = w).
    β                 Exponential decay weight, 0 ≤ β ≤ 1.
    Γ̂_t               Local autocovariance matrix estimate.
    u_i(A), λ_i(A)    Eigenvectors and corresponding eigenvalues of A.
    U_k(A)            Matrix of the k largest eigenvectors of A.
    ℓ_t               LoCo score.
    ρ_t               Local Pearson correlation score.

3 Localizing correlation estimates

Our goal is to derive a time-evolving correlation score that tracks the similarity of time-evolving time series. Thus, our method should have the following properties:

(P1) Adapt to the time-varying nature of the data,
(P2) Employ a simple, yet powerful and expressive joint model to capture correlations,
(P3) The derived score should be robust, reflecting the evolving correlations accurately, and
(P4) It should be possible to estimate it efficiently.

We will address most of these issues in Section 4, which describes our proposed method. In this section, we introduce some basic definitions to facilitate our discussion. We also introduce localized versions of popular similarity measures for time series.

As a first step to deal with (P1), any correlation score at time instant t ∈ ℕ should be based on observations in the "neighborhood" of that instant. Therefore, we introduce the notation x_{t,w} ∈ ℝ^w for the subsequence of the series starting at t and having length w,

    x_{t,w} := [x_t, x_{t+1}, ..., x_{t+w−1}]ᵀ.

Furthermore, any correlation score should satisfy two elementary and intuitive properties.

Definition 1 (Local correlation score). Given a pair of time series X and Y, a local correlation score is a sequence c_t(X, Y) of real numbers that satisfies the following properties, for all t ∈ ℕ:

    0 ≤ c_t(X, Y) ≤ 1   and   c_t(X, Y) = c_t(Y, X).

3.1 Local Pearson

Before proceeding to describe our approach, we formally define a natural extension of a method that has been widely used for global correlation or similarity among "static" time series.

Pearson coefficient. A natural local adaptation of cross-correlation is the following:

Definition 2 (Local Pearson correlation). The local Pearson correlation is the linear cross-correlation coefficient

    ρ_t(X, Y) := |Cov[x_{t,w}, y_{t,w}]| / √(Var[x_{t,w}] Var[y_{t,w}]) = |x_{t,w}ᵀ y_{t,w}| / (‖x_{t,w}‖ · ‖y_{t,w}‖),

where the last equality follows from E[X_t] = E[Y_t] = 0. It follows directly from the definition that ρ_t satisfies the two requirements, 0 ≤ ρ_t(X, Y) ≤ 1 and ρ_t(X, Y) = ρ_t(Y, X).
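In code, ρ_t is only a few lines (an illustrative NumPy sketch, assuming zero-mean series as in Definition 2):

    import numpy as np

    def local_pearson(x, y, t, w):
        """Local Pearson score rho_t of Definition 2, for zero-mean series x, y."""
        xw, yw = x[t:t + w], y[t:t + w]
        denom = np.linalg.norm(xw) * np.linalg.norm(yw)
        return np.abs(xw @ yw) / denom if denom > 0 else 0.0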
4 Correlation tracking through local autocovariance

In this section we develop our proposed approach, the Local Correlation (LoCo) score. Returning to properties (P1)–(P4) listed at the beginning of Section 3, the next section addresses primarily (P1) and Section 4.2 continues to address (P2) and (P3). Next, Section 4.3 shows how (P4) can also be satisfied and, finally, Section 4.4 discusses the time and space complexity of the various alternatives.

4.1 Local autocovariance

The first step towards tracking local correlations at time t ∈ ℕ is restricting, in some way, the comparison to the "neighborhood" of t, which is the reason for introducing the notion of a window x_{t,w}.

If we stop there, we can compare the two windows x_{t,w} and y_{t,w} directly. If, in addition, the comparison involves capturing any linear relationships between localized values of X and Y, this leads to the local Pearson correlation score ρ_t. However, this joint model of the series is too simple, leading to two problems: (i) it cannot capture more complex relationships, and (ii) it is too sensitive to transient changes, often leading to widely fluctuating scores.

Intuitively, we address the first issue by estimating the full autocovariance matrix of values "near" t, avoiding any assumptions about stationarity (as will be explained later). Any estimate of the local autocovariance at time t needs to be based on a "localized" sample set of windows of length w. We will consider two possibilities:

• Sliding (a.k.a. boxcar) window (see Figure 2a): We use exactly m windows around t, specifically x_{τ,w} for t − m + 1 ≤ τ ≤ t, and we weigh them equally. This takes into account w + m − 1 values in total, around time t.

• Exponential window (see Figure 2b): We use all windows x_{τ,w} for 1 ≤ τ ≤ t, but we weigh those close to t more, by multiplying each window by a factor of β^{t−τ}.

These two alternatives are illustrated in Figure 2, where the shading corresponds to the weight. We will explain how to "compare" the local autocovariance matrices of two series in Section 4.2. Next, we formally define these estimators.

[Figure 2. Local autocovariance windows: (a) sliding window over x_{τ,w}, t − m + 1 ≤ τ ≤ t; (b) exponential window; shading corresponds to weight.]

Definition 3 (Local autocovariance, sliding window). Given a time series X, the local autocovariance matrix estimator Γ̂_t using a sliding window is defined at time t ∈ ℕ as

    Γ̂_t(X, w, m) := ∑_{τ=t−m+1}^{t} x_{τ,w} ⊗ x_{τ,w}.

The sample set of m windows is "centered" around time t. We typically fix the number of windows to m = w, so that Γ̂_t(X, w, m) = ∑_{τ=t−w+1}^{t} x_{τ,w} ⊗ x_{τ,w}. A normalization factor of 1/m is ignored, since it is irrelevant for the eigenvectors of Γ̂_t.

Definition 4 (Local autocovariance, exponential window). Given a time series X, the local autocovariance matrix estimator Γ̂_t at time t ∈ ℕ using an exponential window is

    Γ̂_t(X, w, β) := ∑_{τ=1}^{t} β^{t−τ} x_{τ,w} ⊗ x_{τ,w}.

Similar to the previous definition, we ignore the normalization factor (1 − β)/(1 − β^{t+1}). In both cases, we may omit some or all of the arguments X, w, m, β when they are clear from the context.

Under certain assumptions, the equivalent window corresponding to an exponential decay factor β is given by m = (1 − β)^{−1} [22]. However, one of the main benefits of the exponential window is based on the following simple observation.

Property 1. The sliding window local autocovariance follows the equation

    Γ̂_t = Γ̂_{t−1} − x_{t−w,w} ⊗ x_{t−w,w} + x_{t,w} ⊗ x_{t,w},

whereas for the exponential window it follows the equation

    Γ̂_t = βΓ̂_{t−1} + x_{t,w} ⊗ x_{t,w}.

An incremental update to the sliding window estimator has rank 2, whereas an update to the exponential window estimator has rank 1, which can be handled more efficiently. Also, updating the sliding window estimator requires subtraction of x_{t−w,w} ⊗ x_{t−w,w}, which means that, by necessity, the past w values of X need to be stored (or, in general, the past m values), in addition to the "future" w values of x_{t,w} that need to be buffered. Since, as we will see, the local correlation scores derived from these estimators are very close, using an exponential window is more desirable.
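Both estimators are easy to express directly; the sketch below (an illustrative NumPy fragment, not the authors' implementation) materializes the windows for clarity and implements the rank-1 recurrence of Property 1 for the exponential window:

    import numpy as np

    def windows(x, w):
        """All length-w windows x_{tau,w} of a series, one per row (0-based tau)."""
        return np.lib.stride_tricks.sliding_window_view(x, w)

    def gamma_sliding(x, t, w, m):
        """Sliding-window estimate: sum of x_{tau,w} (x) x_{tau,w} for tau in [t-m+1, t]."""
        W = windows(x, w)[t - m + 1:t + 1]
        return W.T @ W

    def gamma_exp_step(G_prev, x_tw, beta):
        """Exponential-window recurrence of Property 1 (a single rank-1 update)."""
        return beta * G_prev + np.outer(x_tw, x_tw)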

The next simple lemma will be useful later, to show that ρ_t is included as a special case of the LoCo score. Intuitively, if we use an instantaneous estimate of the local autocovariance Γ̂_t, based on just the latest sample window x_{t,w}, its eigenvector is the window itself.

Lemma 1. If m = 1 or, equivalently, β = 0, then

    u_1(Γ̂_t) = x_{t,w}/‖x_{t,w}‖   and   λ_1(Γ̂_t) = ‖x_{t,w}‖².

Proof. In this case, Γ̂_t = x_{t,w} ⊗ x_{t,w}, which has rank 1. Its row and column spaces are span{x_{t,w}}, whose orthonormal basis is, trivially, x_{t,w}/‖x_{t,w}‖ ≡ u_1(Γ̂_t). The fact that λ_1(Γ̂_t) = ‖x_{t,w}‖² then follows by straightforward computation, since u_1 ⊗ u_1 = x_{t,w} ⊗ x_{t,w}/‖x_{t,w}‖², thus (x_{t,w} ⊗ x_{t,w})u_1 = ‖x_{t,w}‖² u_1. □

4.2 Pattern similarity

Given the estimates Γ̂_t(X) and Γ̂_t(Y) for the two series, the next step is how to "compare" them and extract a correlation score. Intuitively, we want to extract the "key information" contained in the autocovariance matrices and measure how close they are. This is precisely where the spectral decomposition helps. The eigenvectors capture the key aperiodic and oscillatory trends, even in short, non-stationary series [6, 7]. These trends explain the largest fraction of the variance. Thus, we will use the subspaces spanned by the first few (k) eigenvectors of each local autocovariance matrix to locally characterize the behavior of each series. The following definition formalizes this notion.

Definition 5 (LoCo score). Given two series X and Y, their LoCo score is defined by

    ℓ_t(X, Y) := ½ (‖U_Xᵀ u_Y‖ + ‖U_Yᵀ u_X‖),

where U_X ≡ U_k(Γ̂_t(X)) and U_Y ≡ U_k(Γ̂_t(Y)) are the eigenvector matrices of the local autocovariance matrices of X and Y, respectively, and u_X ≡ u_1(Γ̂_t(X)) and u_Y ≡ u_1(Γ̂_t(Y)) are the corresponding eigenvectors with the largest eigenvalue.

[Figure 3. Illustration of the LoCo definition: U_Xᵀ u_Y is the projection of u_Y onto span U_X, and |cos θ| = ‖U_Xᵀ u_Y‖.]

In the above equation, U_Xᵀ u_Y is the projection of u_Y onto the subspace spanned by the columns of the orthonormal matrix U_X. The absolute cosine of the angle θ ≡ ∠(u_Y, span U_X) is |cos θ| = ‖U_Xᵀ u_Y‖/‖u_Y‖ = ‖U_Xᵀ u_Y‖, since ‖u_Y‖ = 1 (see Figure 3). Thus, ℓ_t is the average of the cosines |cos ∠(u_Y, span U_X)| and |cos ∠(u_X, span U_Y)|. From this definition, it follows that 0 ≤ ℓ_t(X, Y) ≤ 1 and ℓ_t(X, Y) = ℓ_t(Y, X). Furthermore, ℓ_t(X, Y) = ℓ_t(−X, Y) = ℓ_t(Y, −X) = ℓ_t(−X, −Y), as is also the case with ρ_t(X, Y).

Intuitively, if the two series X, Y are locally similar, then the principal eigenvector of each series should lie within the subspace spanned by the principal eigenvectors of the other series. Hence, the angles will be close to zero and the cosines will be close to one.

The next simple lemma reveals the relationship between ρ_t and ℓ_t.

Lemma 2. If m = 1 (whence, k = 1 necessarily), then ℓ_t = ρ_t.

Proof. From Lemma 1 it follows that U_X = u_X = x_{t,w}/‖x_{t,w}‖ and U_Y = u_Y = y_{t,w}/‖y_{t,w}‖. From the definitions of ℓ_t and ρ_t, we have

    ℓ_t = ½ (|x_{t,w}ᵀ y_{t,w}|/(‖x_{t,w}‖·‖y_{t,w}‖) + |y_{t,w}ᵀ x_{t,w}|/(‖y_{t,w}‖·‖x_{t,w}‖)) = |x_{t,w}ᵀ y_{t,w}|/(‖x_{t,w}‖·‖y_{t,w}‖) = ρ_t. □

Choosing k. As we shall also see in Section 5, the directions of x_{t,w} and y_{t,w} may vary significantly, even at neighboring time instants. As a consequence, the Pearson score ρ_t (which is essentially based on the instantaneous estimate of the local autocovariance) is overly sensitive. However, the low-dimensional subspace that is (mostly) occupied by the windows during a short period of time, which is what LoCo uses, is much more stable and less susceptible to transients, while still able to track changes in local correlation.

One approach is to set k based on the fraction of variance to retain (similar to criteria used in PCA [11], as well as in spectral estimation [19]). A simpler practical choice is to fix k to a small value; we use k = 4 throughout all experiments. From another point of view, key aperiodic trends are captured by one eigenvector, whereas key oscillatory trends manifest themselves in a pair of eigenvectors with similar eigenvalues [6, 7]. The former (aperiodic trends) are mostly present during "unstable" periods of time, while the latter (periodic, or oscillatory, trends) are mostly present during "stable" periods. The eigen-decomposition can capture both, and fixing k amounts to selecting a number of trends for our comparison. The fraction of variance captured in the real series of our experiments with k = 4 is typically between 90% and 95%.

Choosing w. Windows are commonly used in stream and signal processing applications. The size w of each window x_{t,w} (and, consequently, the size w × w of the autocovariance matrix Γ̂_t) essentially corresponds to the time scale we are interested in.

As we shall also see in Section 5, the LoCo score ℓ_t derived from the local autocovariances changes gradually and smoothly with respect to w. Thus, if we set the window size to any of, say, 55, 60 or 65 seconds, we will qualitatively get the same results, corresponding approximately to patterns at the minute scale. Of course, at widely different time scales, the correlation scores will be different. If desirable, it is possible to track the correlation score at multiple scales, e.g., hour, day, month and year. If buffer space and processing time are a concern, either a simple decimated moving average filtering scheme or a more elaborate hierarchical SVD scheme (such as in [16]) can be employed; these considerations are beyond the scope of this paper.
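Combining Definitions 3 and 5 yields the batch (sliding-window) variant of the score directly; the following is an illustrative NumPy sketch, with the eigenvector extraction and the default k = 4 as assumptions rather than the authors' exact code:

    import numpy as np

    def top_eigvecs(G, k):
        """U_k(G) and u_1(G) for a symmetric matrix G (Section 2 notation)."""
        lam, U = np.linalg.eigh(G)
        U = U[:, np.argsort(-np.abs(lam))]  # columns in decreasing |eigenvalue|
        return U[:, :k], U[:, 0]

    def loco(Gx, Gy, k=4):
        """LoCo score of Definition 5, from two local autocovariance estimates."""
        Ux, ux = top_eigvecs(Gx, k)
        Uy, uy = top_eigvecs(Gy, k)
        return 0.5 * (np.linalg.norm(Ux.T @ uy) + np.linalg.norm(Uy.T @ ux))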

Types of patterns. We next consider two characteristic special cases, which illustrate how the eigenvectors of the autocovariance matrix capture both aperiodic and oscillatory trends [7].

We first consider the case of a weakly stationary series. In this case, it follows from the definition of stationarity that the autocovariance depends only on the time distance, i.e., γ_{t,t′} ≡ γ_{|t−t′|}. Consequently, its local autocovariance matrix is circulant, i.e., it is symmetric with constant diagonals. Its estimate Γ̂_t will have the same property, provided that the sample size m (i.e., the number of windows used by the estimator) is sufficiently large. However, the eigenvectors of a circulant matrix are the Fourier basis vectors. If we additionally consider real-valued series, these observations lead to the following lemma.

Lemma 3 (Stationary series). If X is weakly stationary, then the eigenvectors of the local autocovariance matrix (as m → ∞) are sinusoids. The number of non-zero eigenvalues is twice the number of frequencies present in X.

[Figure 4. First four eigenvectors (w = 40) for (a) the periodic series x_t = 2 sin(2πt/40) + sin(2πt/20) and (b) the polynomial trend x_t = t³.]

Figure 4a illustrates the four eigenvectors of the autocovariance matrix for a series consisting of two frequencies. The eigenvectors are pairs of sinusoids with the same frequencies and phases differing by π/2. In practice, the estimates derived using the singular value decomposition (SVD) on a finite sample of m = w windows have similar properties [19].

Next, we consider simple polynomial trends, x_t = t^k for a fixed k ∈ ℕ. In this case, the window vectors are always polynomials of degree k, x_{t,w} = [t^k, (t + 1)^k, ..., (t + w − 1)^k]ᵀ. In other words, they belong to the span of {1, t, t², ..., t^k}, leading to the next simple lemma.

Lemma 4 (Trends). If X is a polynomial of degree k, then the eigenvectors of Γ̂_t are polynomials of the same degree. The number of non-zero eigenvalues is k + 1.

Figure 4b illustrates the four eigenvectors of the autocovariance matrix for a cubic monomial. The eigenvectors are polynomials of degrees zero to three, which are similar to Chebyshev polynomials [3].

In practice, if a series consists locally of a mix of oscillatory and aperiodic patterns, then the eigenvectors of the local autocovariance matrix will be linear combinations of the above types of functions (sinusoids at a few frequencies and low-degree polynomials). By construction, these mixtures locally capture the maximum variance.
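These two lemmas are easy to check numerically; the sketch below (illustrative NumPy, using the series from the caption of Figure 4) prints the eigenvalue spectra, where roughly four non-negligible eigenvalues appear in both cases (two per frequency for the sinusoid, and k + 1 = 4 for the cubic):

    import numpy as np

    w, n = 40, 400
    t = np.arange(n)
    x_periodic = 2 * np.sin(2 * np.pi * t / 40) + np.sin(2 * np.pi * t / 20)
    x_cubic = (t / n) ** 3   # scaled cubic; its windows still span {1, t, t^2, t^3}

    for x in (x_periodic, x_cubic):
        W = np.lib.stride_tricks.sliding_window_view(x, w)  # all windows x_{tau,w}
        lam = np.linalg.eigvalsh(W.T @ W)[::-1]             # decreasing eigenvalues
        print(np.round(lam[:6] / lam.sum(), 4))             # variance fraction per direction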

4.3 Online estimation

In this section we show how ℓ_t can be incrementally updated in a streaming setting. We also briefly discuss how to update ρ_t.

LoCo score. The eigenvector estimates of the exponential window local autocovariance matrix can be updated incrementally, by employing eigenspace tracking algorithms. For completeness, we show one such algorithm [22], which, among several alternatives, has very good accuracy with limited resource requirements.

Procedure 1 EIGENUPDATE(V_{t−1}, C_{t−1}, x_{t,w}, β)
    V_t ∈ ℝ^{w×k}: basis for the k-dim. principal eigenspace of Γ̂_t
    C_t ∈ ℝ^{k×k}: covariance w.r.t. the columns of V_t
    x_{t,w} ∈ ℝ^w: new window, completed by the arriving value
    0 < β ≤ 1: exponential decay factor

    y := V_{t−1}ᵀ x_{t,w}
    h := C_{t−1} y
    g := h/(β + yᵀh)
    ε := x_{t,w} − V_{t−1} y
    V_t ← V_{t−1} + ε ⊗ g
    C_t ← (C_{t−1} − g ⊗ h)/β
    return V_t, C_t

This simple procedure tracks the k-dimensional eigenspace of Γ̂_t(X, w, β). More specifically, the matrix V_t ∈ ℝ^{w×k} will span the same k-dimensional subspace as U_k(Γ̂_t). Its columns may not be orthonormal, but that can be easily addressed by performing an orthonormalization step. The matrix C_t is the covariance in the coordinate system defined by V_t, which is not necessarily diagonal, since the columns of V_t do not have to be the individual eigenvectors. The first eigenvector is simply the one-dimensional eigenspace and can also be estimated using EIGENUPDATE. The detailed pseudocode is shown in Algorithm 1.

Algorithm 1 STREAMLOCO
    Eigenvector estimates ũ_X, ũ_Y ∈ ℝ^w
    Eigenvalue estimates λ̃_X, λ̃_Y ∈ ℝ
    Eigenspace estimates Ũ_X, Ũ_Y ∈ ℝ^{w×k}
    Covariance (eigen-)estimates C̃_X, C̃_Y ∈ ℝ^{k×k}

    Initialize Ũ_X, Ũ_Y, C̃_X, C̃_Y to unit matrices
    Initialize ũ_X, ũ_Y, λ̃_X, λ̃_Y to unit vectors and values
    for each arriving pair x_{t+w−1}, y_{t+w−1} do
        x_{t,w} := [x_t ··· x_{t+w−1}]ᵀ
        y_{t,w} := [y_t ··· y_{t+w−1}]ᵀ
        Ũ_X, C̃_X ← EIGENUPDATE(Ũ_X, C̃_X, x_{t,w}, β)
        Ũ_Y, C̃_Y ← EIGENUPDATE(Ũ_Y, C̃_Y, y_{t,w}, β)
        ũ_X, λ̃_X ← EIGENUPDATE(ũ_X, λ̃_X, x_{t,w}, β)
        ũ_Y, λ̃_Y ← EIGENUPDATE(ũ_Y, λ̃_Y, y_{t,w}, β)
        ℓ_t := ½ (‖orth(Ũ_X)ᵀ ũ_Y‖ + ‖orth(Ũ_Y)ᵀ ũ_X‖)
    end for

Local Pearson score. Updating the Pearson score ρ_t requires an update of the inner product and the norms. For the former, this can be done using the simple relationship x_{t,w}ᵀ y_{t,w} = x_{t−1,w}ᵀ y_{t−1,w} − x_{t−1}y_{t−1} + x_{t+w−1}y_{t+w−1}. Similar simple relationships hold for ‖x_{t,w}‖ and ‖y_{t,w}‖.
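A direct transcription of EIGENUPDATE follows (an illustrative NumPy sketch of the projection-approximation update of [22], not the authors' Matlab prototype; the occasional re-orthonormalization mentioned above is left out):

    import numpy as np

    def eigen_update(V, C, x, beta):
        """One rank-1 step of Procedure 1.

        V: (w, k) approximately orthonormal basis estimate for the eigenspace
        C: (k, k) covariance w.r.t. the columns of V
        x: (w,)   newly completed window x_{t,w}
        """
        y = V.T @ x
        h = C @ y
        g = h / (beta + y @ h)
        e = x - V @ y                   # component of x outside span(V)
        V = V + np.outer(e, g)
        C = (C - np.outer(g, h)) / beta
        return V, C

In Algorithm 1, orth(Ũ_X) can then be obtained as np.linalg.qr(V)[0] before evaluating the score.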

4.4 Complexity

The time and space complexity of each method is summarized in Table 2. Updating ρ_t requires O(1) time (adding x_{t+w−1}y_{t+w−1} and subtracting x_{t−1}y_{t−1}) and also requires buffering w values. Estimating the LoCo score ℓ_t using a sliding window requires O(wmk) = O(w²k) time (since we set m = w) to compute the largest k eigenvectors of the covariance matrix for m windows of size w. We also need O(wk) space for these k eigenvectors and O(w + m) space for the series values, for a total of O(wk + m) = O(wk). Using an exponential window still requires storing the w × k matrix V, so the space is again O(wk). However, the eigenspace estimate V can be updated in O(wk) time (the most expensive operation in EIGENUPDATE is V_{t−1}ᵀ x_{t,w}), instead of O(w²k) for the sliding window.

Table 2. Time and space complexity.

    Method             Time (per point)   Space (total)
    Pearson            O(1)               O(w)
    LoCo, sliding      O(wmk)             O(wk + m)
    LoCo, exponential  O(wk)              O(wk)

5 Experimental evaluation

This section presents our experimental evaluation, with the following main goals:

1. Illustration of LoCo on real time series.
2. Comparison to the local Pearson score.
3. Demonstration of LoCo's robustness.
4. Comparison of exponential and sliding windows for LoCo score estimation.
5. Evaluation of LoCo's efficiency in a streaming setting.

Datasets. The first two datasets, MemCPU1 and MemCPU2, were collected from a set of Linux machines. They measure free memory and idle CPU percentages, at 16-second intervals. Each pair comes from different machines, running different applications, but the series within each pair are from the same machine. The last dataset, ExRates, was obtained from the UCR TSDMA [13] and consists of daily foreign currency exchange rates, measured on working days (5 measurements per week) for a total period of about 10 years. Although the order is irrelevant for the scores, since they are symmetric, the first series is always shown in blue and the second in red. For LoCo with a sliding window we use exact, batch SVD on the sample set of windows—we do not explicitly construct Γ̂_t. For exponential window LoCo, we use the incremental eigenspace tracking procedure. The raw scores are shown, without any smoothing, scaling or post-processing of any kind.

[Figure 5. Local correlation scores, machine cluster: (a) MemCPU1, (b) MemCPU2. Panels show, from top to bottom, the CPU/memory series, LoCo (sliding), LoCo (exponential) and Pearson scores over time.]

1. Qualitative interpretation. We should first point out that, although each score has one value per time instant t ∈ ℕ, these values should be interpreted as the similarity of a "neighborhood" or window around t (Figures 5 and 6). All scores are plotted so that each neighborhood is centered

around t. The window size for MemCPU1 and MemCPU2 is w = 11 (about 3 minutes) and for ExRates it is w = 20 (4 weeks). Next, we discuss the LoCo scores for each dataset.

Machine data. Figure 5a shows the first set of machine measurements, MemCPU1. At time t ≈ 20–50 one series fluctuates (oscillatory patterns for CPU), while the other remains constant after a sharp linear drop (aperiodic patterns for memory). This discrepancy is captured by ℓ_t, which gradually returns to one as both series approach constant-valued intervals. The situation at t ≈ 185–195 is similar. At t ≈ 100–110, both resources exhibit large changes (aperiodic trends) that are not perfectly synchronized. This is reflected by ℓ_t, which exhibits three dips, corresponding to the first drop in CPU, followed by a jump in memory and then a jump in CPU. Toward the end of the series, both resources are fairly constant (but, at times, CPU utilization fluctuates slightly, which affects ρ_t). In summary, ℓ_t behaves well across a wide range of joint patterns.

The second set of machine measurements, MemCPU2, is shown in Figure 5b. Unlike MemCPU1, memory and CPU utilization follow each other, exhibiting a very similar periodic pattern, with a period of about 30 values or 8 minutes. This is reflected by the LoCo score, which is mostly one. However, about in the middle of each period, CPU utilization drops for about 45 seconds, without a corresponding change in memory. At precisely those instants, the LoCo score also drops (in proportion to the discrepancy), clearly indicating the break of the otherwise strong correlation.

Exchange rate data. Figure 6 shows the exchange rate (ExRates) data. The blue line is the French Franc and the red line is the Spanish Peseta. The plot is annotated with an approximate timeline of major events in the European Monetary Union (EMU). Even though one should always be very careful in suggesting any causality, it is still remarkable that most major EMU events are closely accompanied by a break in the correlation as measured by LoCo, and vice versa. Even in the cases when an accompanying break is absent, it often turns out that at those events both currencies received similar pressures (thus leading to similar trends, such as, e.g., in the October 1992 events). It is also interesting to point out that events related to anticipated regulatory changes are typically preceded by correlation breaks. After regulations are in effect, ℓ_t returns to one. Furthermore, after the second stage of the EMU, both currencies proceed in lockstep, with negligible discrepancies.

[Figure 6. Local correlation scores, ExRates: Franc/Peseta series, LoCo (sliding), LoCo (exponential) and Pearson scores, annotated with the EMU timeline of Figure 1 (Delors report requested and published, Peseta joins ERM, EMU Stages 1 and 2, Maastricht treaty, "Single Market" begins, Peseta devaluations and Franc under siege, Bundesbank buys Francs).]

In summary, the LoCo score successfully and accurately tracks evolving local correlations, even when the series are widely different in nature.

2. LoCo versus Pearson. Figures 5 and 6 also show the local Pearson score (fourth row), along with the LoCo score. It is clear that it either fails to capture changes in the joint patterns among the two series, or exhibits high sensitivity to small transients. We also tried using a window size of 2w − 1 instead of w for ρ_t (so as to include the same number of points as ℓ_t in the "comparison" of the two series). The results thus obtained were slightly different but similar, especially in terms of sensitivity and lack of accurate tracking of the evolving relationships among the series.

3. Robustness. This brings us to the next point in our discussion, the robustness of LoCo. We measure the "stability" of any score c_t, t ∈ ℕ, by its smoothness. We employ a common measure of smoothness, the (discrete) total variation V of c_t, defined as V(c_t) := ∑_τ |c_{τ+1} − c_τ|, which is the total "vertical length" of the score curve. Table 3 (top) shows the relative total variation with respect to the baseline of the LoCo score, V(ρ_t)/V(ℓ_t). If we scale the total variations with respect to the range (i.e., use V(c_t)/R(c_t) instead of just V(c_t)), which reflects how many times the vertical length "wraps around" its full vertical range, then Pearson's variation is consistently about 3 times larger, over all data sets.

Table 3. Relative stability (total variation).

    Method    MemCPU1   MemCPU2   ExRates
    Pearson   4.16×     3.36×     6.21×
    LoCo      5.71      10.53     6.37
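The smoothness measure itself is a one-liner (an illustrative NumPy sketch of the quantities used in Table 3):

    import numpy as np

    def total_variation(c):
        """V(c) = sum_tau |c_{tau+1} - c_tau|: the 'vertical length' of a score curve."""
        return np.abs(np.diff(c)).sum()

    def relative_variation(rho, loco):
        """Pearson's total variation relative to the LoCo baseline, V(rho)/V(loco)."""
        return total_variation(rho) / total_variation(loco)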

4. Exponential vs. sliding window. Figures 5 and 6 show the LoCo scores based upon both sliding (second row) and exponential (third row) windows, computed using appropriately chosen equivalent window sizes. Upon inspection, it is clear that both LoCo score estimates are remarkably close. In order to further quantify this similarity, we show the average variation V̂ of the two scores, which is defined as

    V̂(ℓ_t, ℓ′_t) := (1/t) ∑_{τ=1}^{t} |ℓ_τ − ℓ′_τ|,

where ℓ_t uses exact, batch SVD on sliding windows and ℓ′_t uses eigenspace tracking on exponential windows. Table 4 shows the average score variations for each dataset, which are remarkably small, even when compared to the mean score ℓ̂ := (1/t) ∑_{τ=1}^{t} ℓ_τ (the bottom line in the table is V̂/ℓ̂).

Table 4. Sliding vs. exponential score.

    Dataset    MemCPU1   MemCPU2   ExRates
    Avg. var.  0.051     0.071     0.013
    Rel. var.  5.6%      7.8%      1.6%

Window size. Figure 7a shows the LoCo scores of MemCPU2 (see Figure 5b) for various window sizes w, in the range of 8–20 values (2–5 minutes). We chose the dataset with the highest total score variation and, for visual clarity, Figure 7 shows 1 − ℓ_t instead of ℓ_t. As expected, ℓ_t varies smoothly with respect to w. Furthermore, it is worth pointing out that at about a 35-value (10-minute) resolution or coarser, both series clearly exhibit the same behavior (a periodic increase then decrease, with a period of about 10 minutes—see Figure 5b), hence they are perfectly correlated and their LoCo score is almost constantly one (but not their Pearson score, which gets closer to one while still fluctuating noticeably). Only at much coarser resolutions (e.g., an hour or more) do both scores become one. This convergence to one is not generally the case and some time series may exhibit interesting relationships at all time scales. However, the LoCo score is robust and changes gracefully with respect to resolution/scale as well, while accurately capturing any interesting relationship changes that may be present at any scale.

[Figure 7. Score (1 − Corr) vs. time and window size for (a) LoCo and (b) Pearson; LoCo is robust with respect to both time and scale, accurately tracking correlations at any scale, while Pearson performs poorly at all scales.]

5. Performance. Figure 8 shows wall-clock times per incoming measurement for our prototype implementations in Matlab 7, running on a Pentium M 2GHz. Using k = 4 and w = 10, LoCo is in practice slightly less than 4× slower than the simplest alternative, i.e., the Pearson correlation. The additional processing time spent on updating the eigenvector estimates using an exponential window is small, while providing much more meaningful and robust scores. Finally, it is worth pointing out that, even using an interpreted language, the processing time required per pair of incoming measurements is merely 0.33 milliseconds or, equivalently, about 2 × 3000 values per second.

[Figure 8. Processing wall-clock time per measurement vs. stream size, for Pearson and exponential-window LoCo.]

6 Related work

Even though, to the best of our knowledge, the problem of local correlation tracking has not been explicitly addressed, time series and streams have received much attention, and more broadly related previous work addresses other aspects of either "global" similarity among a collection of streams (e.g., [5]) or mining on time-evolving streams (e.g., CluStream [1], StreamCube [8], and [2]). Change detection in discrete-valued streams has also been addressed [10, 23].

BRAID [18] addresses the problem of finding lag correlations on streams, i.e., of finding the first local maximum of the global cross-correlation (Pearson) coefficient with respect to an arbitrary lag. StatStream [24] addresses the problem of efficiently finding the largest cross-correlation coefficients (at zero lag) among all pairs from a collection of time series streams. EDS [12] addresses the problem of separating out the noise from the covariance matrix of a stream collection (or, equivalently, a multidimensional stream), but does not explicitly consider trends across time. Quantized representations have also been employed for dimensionality reduction, indexing and similarity search on static time series, such as the Multiresolution Vector Quantized (MVQ)

representation [15] and the Symbolic Aggregate approXimation (SAX) [14, 17].

The work in [20] addresses the problem of finding specifically burst correlations, by preprocessing the time series to extract a list of burst intervals, which are subsequently indexed using an interval tree. This is used to find all intersections of bursty intervals of a given query time series versus another collection of time series. The work in [21] proposes a similarity metric for time series that is based on comparison of the Fourier coefficient magnitudes, but allows for phase shifts in each frequency independently.

In the field of signal processing, the eigen-decomposition of the autocovariance matrix is employed in the widely used MUSIC (MUltiple SIgnal Classification) algorithm for spectrum estimation [19], as well as in Singular Spectrum Analysis (SSA) [6, 7]. Applications and extensions of SSA have recently appeared in the field of data mining. The work in [9] employs similar ideas, but for a different problem: in particular, it estimates a changepoint score which can subsequently be used to visualize relationships with respect to changepoints via multi-dimensional scaling (MDS). Finally, the work in [16] proposes a way to efficiently estimate a family of optimal orthonormal transforms for a single series at multiple scales (similar to wavelets). These transforms can capture arbitrary periodic patterns that may be present.

7 Conclusion

Time series correlation or similarity scores are useful in several applications. Beyond global scores, in the context of time-evolving time series it is desirable to track a time-evolving correlation score that captures their changing similarity. We propose such a measure, the Local Correlation (LoCo) score. It is based on a joint model of the series which, naturally, does not make any assumptions about stationarity. The model may be viewed as a generalization of simple linear cross-correlation (which it includes as a special case), as well as of traditional frequency analysis [6, 7, 19]. The score is robust to transients, while accurately tracking the time-varying relationships among the series. Furthermore, it lends itself to efficient estimation in a streaming setting. We demonstrate its qualitative interpretation on real datasets, as well as its robustness and efficiency.

References

[1] C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu. A framework for clustering evolving data streams. In VLDB, 2003.
[2] E. Bingham, A. Gionis, N. Haiminen, H. Hiisilä, H. Mannila, and E. Terzi. Segmentation and dimensionality reduction. In SDM, 2006.
[3] Y. Cai and R. Ng. Indexing spatio-temporal trajectories with Chebyshev polynomials. In SIGMOD, 2004.
[4] P. Celka and P. Colditz. A computer-aided detection of EEG seizures in infants: A singular-spectrum approach and performance comparison. IEEE TBE, 49(5), 2002.
[5] G. Cormode, M. Datar, P. Indyk, and S. Muthukrishnan. Comparing data streams using Hamming norms (how to zero in). In VLDB, 2002.
[6] M. Ghil, M. Allen, M. Dettinger, K. Ide, D. Kondrashov, M. Mann, A. Robertson, A. Saunders, Y. Tian, F. Varadi, and P. Yiou. Advanced spectral methods for climatic time series. Rev. Geophys., 40(1), 2002.
[7] N. Golyandina, V. Nekrutkin, and A. Zhigljavsky. Analysis of Time Series Structure: SSA and Related Techniques. CRC Press, 2001.
[8] J. Han, Y. Chen, G. Dong, J. Pei, B. W. Wah, J. Wang, and Y. D. Cai. StreamCube: An architecture for multi-dimensional analysis of data streams. Dist. Par. Databases, 18(2):173–197, 2005.
[9] T. Idé and K. Inoue. Knowledge discovery from heterogeneous dynamic systems using change-point correlations. In SDM, 2005.
[10] D. Kifer, S. Ben-David, and J. Gehrke. Detecting change in data streams. In VLDB, 2004.
[11] I. T. Jolliffe. Principal Component Analysis. Springer, 2nd edition, 2002.
[12] H. Kargupta, K. Sivakumar, and S. Ghosh. Dependency detection in MobiMine and random matrices. In PKDD, 2002.
[13] E. Keogh and T. Folias. UCR time series data mining archive. http://www.cs.ucr.edu/~eamonn/TSDMA/.
[14] J. Lin, E. J. Keogh, S. Lonardi, and B. Y.-C. Chiu. A symbolic representation of time series, with implications for streaming algorithms. In DMKD, 2003.
[15] V. Megalooikonomou, Q. Wang, G. Li, and C. Faloutsos. A multiresolution symbolic representation of time series. In ICDE, 2005.
[16] S. Papadimitriou and P. S. Yu. Optimal multi-scale patterns in time series streams. In SIGMOD, 2006.
[17] C. A. Ratanamahatana, E. J. Keogh, A. J. Bagnall, and S. Lonardi. A novel bit level time series representation with implication of similarity search and clustering. In PAKDD, 2005.
[18] Y. Sakurai, S. Papadimitriou, and C. Faloutsos. BRAID: Stream mining through group lag correlations. In SIGMOD, 2005.
[19] R. O. Schmidt. Multiple emitter location and signal parameter estimation. IEEE Trans. Ant. Prop., 34(3), 1986.
[20] M. Vlachos, K.-L. Wu, S.-K. Chen, and P. S. Yu. Fast burst correlation of financial data. In PKDD, 2005.
[21] M. Vlachos, P. S. Yu, and V. Castelli. On periodicity detection and structural periodic similarity. In SDM, 2005.
[22] B. Yang. Projection approximation subspace tracking. IEEE Trans. Sig. Proc., 43(1), 1995.
[23] J. Yang and W. Wang. AGILE: A general approach to detect transitions in evolving data streams. In ICDM, 2004.
[24] Y. Zhu and D. Shasha. StatStream: Statistical monitoring of thousands of data streams in real time. In VLDB, 2002.
