Multilevel Wavelet Decomposition Network for Interpretable Time Series Analysis


Multilevel Wavelet Decomposition Network for Interpretable Time Series Analysis

Jingyuan Wang, Ze Wang, Jianfeng Li, Junjie Wu
Beihang University, Beijing, China
{jywang,ze.w,leejianfeng,wujj}@buaa.edu.cn

arXiv:1806.08946v1 [cs.LG] 23 Jun 2018

ABSTRACT

Recent years have witnessed an unprecedented rise of time series data from almost all kinds of academic and industrial fields. Various types of deep neural network models have been introduced to time series analysis, but the important frequency information is not yet effectively modeled. In light of this, in this paper we propose a wavelet-based neural network structure called multilevel Wavelet Decomposition Network (mWDN) for building frequency-aware deep learning models for time series analysis. mWDN preserves the advantage of multilevel discrete wavelet decomposition in frequency learning while enabling the fine-tuning of all parameters under a deep neural network framework. Based on mWDN, we further propose two deep learning models called Residual Classification Flow (RCF) and multi-frequency Long Short-Term Memory (mLSTM) for time series classification and forecasting, respectively. The two models take all or part of the mWDN decomposed sub-series of different frequencies as input, and resort to the back propagation algorithm to learn all the parameters globally, which enables seamless embedding of wavelet-based frequency analysis into deep learning frameworks. Extensive experiments on 40 UCR datasets and a real-world user volume dataset demonstrate the excellent performance of our mWDN-based time series models. In particular, we propose an importance analysis method for mWDN-based models, which successfully identifies the time-series elements and mWDN layers that are crucially important to time series analysis. This indicates the interpretability advantage of mWDN, and can be viewed as an in-depth exploration of interpretable deep learning.

CCS CONCEPTS

• Computing methodologies → Neural networks; Supervised learning by classification; Supervised learning by regression.

KEYWORDS

Time series analysis, Multilevel wavelet decomposition network, Deep learning, Importance analysis

ACM Reference Format:
Jingyuan Wang, Ze Wang, Jianfeng Li, Junjie Wu. 2018. Multilevel Wavelet Decomposition Network for Interpretable Time Series Analysis. In KDD 2018: 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, August 19–23, 2018, London, United Kingdom. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3219819.3220060

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. KDD 2018, August 19–23, 2018, London, United Kingdom. © 2018 Association for Computing Machinery. ACM ISBN 978-1-4503-5552-0/18/08...$15.00. https://doi.org/10.1145/3219819.3220060

1 INTRODUCTION

A time series is a series of data points indexed in time order. Methods for time series analysis can be classified into two types: time-domain methods and frequency-domain methods.¹ Time-domain methods consider a time series as a sequence of ordered points and analyze correlations among them. Frequency-domain methods use transform algorithms, such as the discrete Fourier transform and the Z-transform, to transform a time series into a frequency spectrum, which can be used as features to analyze the original series.

¹ https://en.wikipedia.org/wiki/Time_series

In recent years, with the booming of deep learning, various types of deep neural network models have been introduced to time series analysis and have achieved state-of-the-art performance in many real-life applications [28, 38]. Some well-known models include Recurrent Neural Networks (RNN) [40] and Long Short-Term Memory (LSTM) [14], which use memory nodes to model correlations of series points, and the Convolutional Neural Network (CNN), which uses trainable convolution kernels to model local shape patterns [42]. Most of these models fall into the category of time-domain methods and do not leverage the frequency information of a time series, although some begin to consider it in indirect ways [6, 19].

Wavelet decompositions [7] are well-known methods for capturing features of time series in both the time and frequency domains. Intuitively, we can employ them as feature engineering tools for data preprocessing before deep modeling. While this loose coupling might improve the performance of raw neural network models [24], the wavelet transform and the network are not globally optimized, since their parameter inference processes are independent. How to integrate wavelet transforms into the framework of deep learning models remains a great challenge.

In this paper, we propose a wavelet-based neural network structure, named multilevel Wavelet Decomposition Network (mWDN), to build frequency-aware deep learning models for time series analysis. Similar to the standard Multilevel Discrete Wavelet Decomposition (MDWD) model [26], mWDN can decompose a time series into a group of sub-series with frequencies ranked from high to low, which is crucial for capturing frequency factors for deep learning. Different from MDWD with its fixed parameters, however, all parameters in mWDN can be fine-tuned to fit the training data of different learning tasks. In other words, mWDN takes advantage of both wavelet-based time series decomposition and the learning ability of deep neural networks.

Based on mWDN, two deep learning models, i.e., Residual Classification Flow (RCF) and multi-frequency Long Short-Term Memory (mLSTM), are designed for time series classification (TSC) and forecasting (TSF), respectively. The key issue in TSC is to extract as many representative features as possible from a time series. The RCF model therefore adopts the mWDN decomposition results of different levels as inputs, and employs a pipelined classifier stack to exploit the features hidden in the sub-series through residual learning. For the TSF problem, the key issue turns to inferring the future states of a time series from the hidden trends in different frequencies. The mLSTM model therefore feeds all high-frequency mWDN sub-series into independent LSTM models, and ensembles all LSTM outputs for the final forecast. Note that all parameters of RCF and mLSTM, including those in mWDN, are trained with the back propagation algorithm in an end-to-end manner. In this way, wavelet-based frequency analysis is seamlessly embedded into deep learning frameworks.

2 MODEL

Throughout the paper, we use lowercase symbols such as a, b to denote scalars, bold lowercase symbols such as a, b to denote vectors, bold uppercase symbols such as A, B to denote matrices, and uppercase symbols such as A, B to denote constants.

[Figure 1: The mWDN framework. (a) Illustration of the mWDN framework, showing the third-level decomposition results of x; (b) the approximative discrete wavelet transform.]

2.1 Multilevel Discrete Wavelet Decomposition

Multilevel Discrete Wavelet Decomposition (MDWD) [26] is a wavelet-based discrete signal analysis method, which can extract multilevel time-frequency features from a time series by decomposing the series into low- and high-frequency sub-series, level by level.

We denote the input time series as x = {x_1, ..., x_t, ..., x_T}, and the low- and high-frequency sub-series generated at the i-th level as x^l(i) and x^h(i). At the (i+1)-th level, MDWD uses a low-pass filter l = {l_1, ..., l_k, ..., l_K} and a high-pass filter h = {h_1, ..., h_k, ..., h_K}, K ≪ T, to convolve the low-frequency sub-series of the upper level:

    a_n^l(i+1) = Σ_{k=1}^{K} x_{n+k-1}^l(i) · l_k,
    a_n^h(i+1) = Σ_{k=1}^{K} x_{n+k-1}^l(i) · h_k,    (1)

where x_n^l(i) is the n-th element of the low-frequency sub-series at the i-th level, and x^l(0) is set as the input series. The low- and high-frequency sub-series x^l(i) and x^h(i) at level i are generated by 1/2 down-sampling of the intermediate sequences a^l(i) = {a_1^l(i), a_2^l(i), ...} and a^h(i) = {a_1^h(i), a_2^h(i), ...}.

The sub-series set X(i) = {x^h(1), x^h(2), ..., x^h(i), x^l(i)} is called the i-th level decomposition result of x. Specifically, X(i) satisfies: 1) x can be fully reconstructed from X(i); 2) the frequencies from x^h(1) to x^l(i) are ranked from high to low; 3) different levels of X(i) have different time and frequency resolutions: as i increases, the frequency resolution increases while the time resolution, especially for low-frequency sub-series, decreases. Because the sub-series of different frequencies in X(i) keep the same order information as the original series x, MDWD is regarded as a time-frequency decomposition.

2.2 Multilevel Wavelet Decomposition Network

In this section, we propose the multilevel Wavelet Decomposition Network (mWDN), which approximately implements MDWD
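As a concrete sketch of the recursion in Eq. (1), the following NumPy function convolves the previous level's low-frequency sub-series with the two filters and 1/2 down-samples the results. This is an illustrative re-implementation, not the authors' code; the Haar filter pair used below is an assumed example, whereas mWDN learns its filter parameters during training.

```python
import numpy as np

def mdwd(x, low_pass, high_pass, levels):
    """Multilevel discrete wavelet decomposition following Eq. (1):
    at each level, convolve the previous low-frequency sub-series with
    the low/high pass filters, then 1/2 down-sample."""
    x_low = np.asarray(x, dtype=float)
    sub_series = []
    for _ in range(levels):
        # a_n = sum_k x_{n+k-1} * filter_k  (a sliding correlation,
        # i.e. 'valid' convolution with the reversed filter)
        a_low = np.convolve(x_low, low_pass[::-1], mode="valid")
        a_high = np.convolve(x_low, high_pass[::-1], mode="valid")
        sub_series.append(a_high[::2])  # x^h(i): high-frequency sub-series
        x_low = a_low[::2]              # x^l(i): fed to the next level
    sub_series.append(x_low)
    return sub_series  # X(i) = {x^h(1), ..., x^h(i), x^l(i)}
```

The returned list matches the decomposition-set notation X(i) of Section 2.1: the high-frequency sub-series of each level, followed by the final low-frequency sub-series.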
Recommended publications
  • Kalman and Particle Filtering
Abstract: The Kalman and Particle filters are algorithms that recursively update an estimate of the state and find the innovations driving a stochastic process given a sequence of observations. The Kalman filter accomplishes this goal by linear projections, while the Particle filter does so by a sequential Monte Carlo method. With the state estimates, we can forecast and smooth the stochastic process. With the innovations, we can estimate the parameters of the model. The article discusses how to set a dynamic model in a state-space form, derives the Kalman and Particle filters, and explains how to use them for estimation.

Since both filters start with a state-space representation of the stochastic processes of interest, section 1 presents the state-space form of a dynamic model. Then, section 2 introduces the Kalman filter and section 3 develops the Particle filter. For extended expositions of this material, see Doucet, de Freitas, and Gordon (2001), Durbin and Koopman (2001), and Ljungqvist and Sargent (2004).

1. The state-space representation of a dynamic model

A large class of dynamic models can be represented by a state-space form:

    X_{t+1} = ϕ(X_t, W_{t+1}; γ),    (1)
    Y_t = g(X_t, V_t; γ).    (2)

This representation handles a stochastic process by finding three objects: a vector that describes the position of the system (a state, X_t ∈ X ⊂ R^l) and two functions, one mapping the state today into the state tomorrow (the transition equation, (1)) and one mapping the state into observables, Y_t (the measurement equation, (2)).
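For the linear-Gaussian special case of the state-space form (1)-(2), i.e. X_{t+1} = A·X_t + W_{t+1} and Y_t = C·X_t + V_t with Gaussian noise, the recursive update described above can be sketched as follows (scalar state and observation for brevity; an illustrative sketch, not the article's code):

```python
def kalman_filter(ys, A, C, Q, R, x0, P0):
    """Scalar Kalman filter for the linear-Gaussian state-space model
        X_{t+1} = A * X_t + W_{t+1},  W ~ N(0, Q)   (transition)
        Y_t     = C * X_t + V_t,      V ~ N(0, R)   (measurement).
    Returns the filtered state estimates and the innovations."""
    x, P = x0, P0
    states, innovations = [], []
    for y in ys:
        v = y - C * x        # innovation: y minus its linear projection
        S = C * P * C + R    # innovation variance
        K = P * C / S        # Kalman gain
        x = x + K * v        # measurement update of the state estimate
        P = (1.0 - K * C) * P
        states.append(x)
        innovations.append(v)
        x = A * x            # time update: predict the next state
        P = A * P * A + Q
    return states, innovations
```

The two lists are exactly the objects the abstract emphasizes: the state estimates (for forecasting and smoothing) and the innovations (for parameter estimation).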
  • Lecture 19: Wavelet Compression of Time Series and Images
Lecture 19: Wavelet compression of time series and images
© Christopher S. Bretherton, Winter 2014
Ref: Matlab Wavelet Toolbox help.

19.1 Wavelet compression of a time series

The last section of wavelet leleccum notoolbox.m demonstrates the use of wavelet compression on a time series. The idea is to keep the wavelet coefficients of largest amplitude and zero out the small ones.

19.2 Wavelet analysis/compression of an image

Wavelet analysis is easily extended to two-dimensional images or datasets (data matrices), by first doing a wavelet transform of each column of the matrix, then transforming each row of the result (see wavelet image). The wavelet coefficient matrix has the highest-level (largest-scale) averages in the first rows/columns, then successively smaller detail scales further down the rows/columns. The example also shows fine results with 50-fold data compression.

19.3 Continuous Wavelet Transform (CWT)

Given a continuous signal u(t) and an analyzing wavelet ψ(x), the CWT has the form

    W(λ, t) = λ^{-1/2} ∫_{-∞}^{∞} ψ((s - t)/λ) u(s) ds.    (19.3.1)

Here λ, the scale, is a continuous variable. We insist that ψ have mean zero and that its square integrates to 1. The continuous Haar wavelet is defined:

    ψ(t) = 1 for 0 < t < 1/2;  -1 for 1/2 < t < 1;  0 otherwise.    (19.3.2)

W(λ, t) is proportional to the difference of running means of u over successive intervals of length λ/2.

In practice, for a discrete time series, the integral is evaluated as a Riemann sum using the Matlab wavelet toolbox function cwt.
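The compression recipe of section 19.1, keeping only the wavelet coefficients of largest amplitude and zeroing the rest, can be sketched with a hand-rolled orthonormal Haar transform standing in for the Matlab Wavelet Toolbox (an assumption; the lecture itself uses the toolbox):

```python
import numpy as np

def haar_decompose(x):
    """Full orthonormal Haar transform of a length-2^k signal."""
    x = np.asarray(x, dtype=float)
    coeffs = []
    while len(x) > 1:
        avg = (x[::2] + x[1::2]) / np.sqrt(2)  # running means
        det = (x[::2] - x[1::2]) / np.sqrt(2)  # details (differences)
        coeffs.append(det)
        x = avg
    coeffs.append(x)
    return coeffs[::-1]  # largest scale first, as in the lecture

def haar_reconstruct(coeffs):
    """Invert haar_decompose exactly."""
    x = coeffs[0]
    for det in coeffs[1:]:
        up = np.empty(2 * len(det))
        up[::2] = (x + det) / np.sqrt(2)
        up[1::2] = (x - det) / np.sqrt(2)
        x = up
    return x

def compress(x, keep):
    """Zero out all but the `keep` largest-amplitude coefficients."""
    coeffs = haar_decompose(x)
    flat = np.concatenate(coeffs)
    thresh = np.sort(np.abs(flat))[-keep]
    kept = [c * (np.abs(c) >= thresh) for c in coeffs]
    return haar_reconstruct(kept)
```

A piecewise-constant signal survives aggressive compression unchanged, since all its small-scale detail coefficients are already zero.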
  • Adaptive Wavelet Clustering for Highly Noisy Data
Adaptive Wavelet Clustering for Highly Noisy Data
Zengjian Chen (Department of Computer Science, Huazhong University of Science and Technology, Wuhan, China), Jiayi Liu (Department of Computer Science, University of Massachusetts Amherst, Massachusetts, USA), Yihe Deng (Department of Mathematics, University of California, Los Angeles, California, USA), Kun He* (Department of Computer Science, Huazhong University of Science and Technology, Wuhan, China), John E. Hopcroft (Department of Computer Science, Cornell University, Ithaca, NY, USA)

Abstract—In this paper we make progress on the unsupervised task of mining arbitrarily shaped clusters in highly noisy datasets, which is a task present in many real-world applications. Based on the fundamental work that first applies a wavelet transform to data clustering, we propose an adaptive clustering algorithm, denoted as AdaWave, which exhibits favorable characteristics for clustering. By a self-adaptive thresholding technique, AdaWave is parameter free and can handle data in various situations. It is deterministic, fast in linear time, order-insensitive, shape-insensitive, robust to highly noisy data, and requires no pre-knowledge on data models.

Based on the pioneering work of Sheikholeslami that applies wavelet transform, originally used for signal processing, to spatial data clustering [12], we propose a new wavelet-based algorithm called AdaWave that can adaptively and effectively uncover clusters in highly noisy data. To tackle general applications, we assume that the clusters in a dataset do not follow any specific distribution and can be arbitrarily shaped. To show the hardness of the clustering task, we first design…
  • A Fourier-Wavelet Monte Carlo Method for Fractal Random Fields
JOURNAL OF COMPUTATIONAL PHYSICS 132, 384–408 (1997), Article No. CP965647

A Fourier-Wavelet Monte Carlo Method for Fractal Random Fields
Frank W. Elliott Jr., David J. Horntrop, and Andrew J. Majda
Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, New York 10012
Received August 2, 1996; revised December 23, 1996

A new hierarchical method for the Monte Carlo simulation of random fields, called the Fourier-wavelet method, is developed and applied to isotropic Gaussian random fields with power-law spectral density functions. Such fractal random fields are characterized by the structure function

    ⟨[v(x) - v(y)]²⟩ = C_H |x - y|^{2H},    (1.1)

where 0 < H < 1 is the Hurst exponent and ⟨·⟩ denotes the expected value. The technique is based upon the orthogonal decomposition of the Fourier stochastic integral representation of the field using wavelets; the Meyer wavelet is used here because its rapid decay properties allow for a very compact representation of the field. The method is capable of generating a velocity field with the Kolmogoroff spectrum (H = 1/3 in (1.1)) over many (10 to 15) decades of scaling behavior. The Fourier-wavelet method is shown to be straightforward to implement, given the nature of the necessary precomputations and the run-time calculations, and yields comparable results with scaling behavior over as many decades as the physical space multiwavelet methods developed recently by two of the authors.
  • Alternative Tests for Time Series Dependence Based on Autocorrelation Coefficients
Alternative Tests for Time Series Dependence Based on Autocorrelation Coefficients
Richard M. Levich and Rosario C. Rizzo*
Current Draft: December 1998

Abstract: When autocorrelation is small, existing statistical techniques may not be powerful enough to reject the hypothesis that a series is free of autocorrelation. We propose two new and simple statistical tests (RHO and PHI) based on the unweighted sum of autocorrelation and partial autocorrelation coefficients. We analyze a set of simulated data to show the higher power of RHO and PHI in comparison to conventional tests for autocorrelation, especially in the presence of small but persistent autocorrelation. We show an application of our tests to data on currency futures to demonstrate their practical use. Finally, we indicate how our methodology could be used for a new class of time series models (the Generalized Autoregressive, or GAR models) that take into account the presence of small but persistent autocorrelation.

An earlier version of this paper was presented at the Symposium on Global Integration and Competition, sponsored by the Center for Japan-U.S. Business and Economic Studies, Stern School of Business, New York University, March 27-28, 1997. We thank Paul Samuelson and the other participants at that conference for useful comments.
* Stern School of Business, New York University and Research Department, Bank of Italy, respectively.

1. Introduction

Economic time series are often characterized by positive…
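The core idea, an unweighted sum of autocorrelation coefficients as a test statistic, can be sketched as below. The standardization uses the textbook large-sample approximation that each sample autocorrelation is roughly N(0, 1/T) under an IID null, so the sum of the first m coefficients has variance about m/T; this illustrates the idea and is not necessarily the authors' exact RHO construction.

```python
import numpy as np

def autocorr(x, k):
    """Sample autocorrelation coefficient at lag k."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    return np.dot(xc[:-k], xc[k:]) / np.dot(xc, xc)

def rho_statistic(x, m):
    """Unweighted sum of the first m autocorrelation coefficients,
    standardized by sqrt(m/T), the approximate null standard deviation
    of the sum under an IID series of length T."""
    T = len(x)
    s = sum(autocorr(x, k) for k in range(1, m + 1))
    return s / np.sqrt(m / T)
```

Summing the coefficients before standardizing is what gives the test its power against small but persistent autocorrelation: many small positive lags accumulate, while individual-lag tests see none of them as significant.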
  • Time Series: Co-Integration
Time Series: Co-integration

See also: Economic Forecasting; Time Series: General; Time Series: Nonstationary Distributions and Unit Roots; Time Series: Seasonal Adjustment. N. H. Chan

Bibliography
Banerjee A, Dolado J J, Galbraith J W, Hendry D F 1993 Co-Integration, Error Correction, and the Econometric Analysis of Non-stationary Data. Oxford University Press, Oxford, UK
Box G E P, Tiao G C 1977 A canonical analysis of multiple time series. Biometrika 64: 355–65
Chan N H, Tsay R S 1996 On the use of canonical correlation analysis in testing common trends. In: Lee J C, Johnson W O, Zellner A (eds.) Modelling and Prediction: Honoring S. Geisser. Springer-Verlag, New York, pp. 364–77
Chan N H, Wei C Z 1988 Limiting distributions of least squares estimates of unstable autoregressive processes. Annals of Statistics 16: 367–401
Engle R F, Granger C W J 1987 Cointegration and error correction: Representation, estimation, and testing. Econometrica 55: 251–76
Watson M W 1994 Vector autoregressions and co-integration. In: Engle R F, McFadden D L (eds.) Handbook of Econometrics Vol. IV. Elsevier, The Netherlands

Time Series: Cycles

Time series data in economics and other fields of social science often exhibit cyclical behavior. For example, aggregate retail sales are high in November and December and follow a seasonal cycle; voter registrations are high before each presidential election and follow an election cycle; and aggregate macroeconomic activity falls into recession every six to eight years and follows a business cycle. In spite of this cyclicality, these series are not perfectly predictable, and the cycles are not strictly periodic.
  • A Practical Guide to Wavelet Analysis
A Practical Guide to Wavelet Analysis
Christopher Torrence and Gilbert P. Compo
Program in Atmospheric and Oceanic Sciences, University of Colorado, Boulder, Colorado

ABSTRACT

A practical step-by-step guide to wavelet analysis is given, with examples taken from time series of the El Niño–Southern Oscillation (ENSO). The guide includes a comparison to the windowed Fourier transform, the choice of an appropriate wavelet basis function, edge effects due to finite-length time series, and the relationship between wavelet scale and Fourier frequency. New statistical significance tests for wavelet power spectra are developed by deriving theoretical wavelet spectra for white and red noise processes and using these to establish significance levels and confidence intervals. It is shown that smoothing in time or scale can be used to increase the confidence of the wavelet spectrum. Empirical formulas are given for the effect of smoothing on significance levels and confidence intervals. Extensions to wavelet analysis such as filtering, the power Hovmöller, cross-wavelet spectra, and coherence are described. The statistical significance tests are used to give a quantitative measure of changes in ENSO variance on interdecadal timescales. Using new datasets that extend back to 1871, the Niño3 sea surface temperature and the Southern Oscillation index show significantly higher power during 1880–1920 and 1960–90, and lower power during 1920–60, as well as a possible 15-yr modulation of variance. The power Hovmöller of sea level pressure shows significant variations in 2–8-yr wavelet power in both longitude and time.

1. Introduction

Wavelet analysis is becoming a common tool for analyzing localized variations of power within a time series. A complete description of geophysical applications can be found in Foufoula-Georgiou and Kumar (1995), while a theoretical treatment of wavelet analysis is given in Daubechies (1992).
  • An Overview of Wavelet Transform Concepts and Applications
An overview of wavelet transform concepts and applications
Christopher Liner, University of Houston
February 26, 2010

Abstract

The continuous wavelet transform utilizing a complex Morlet analyzing wavelet has a close connection to the Fourier transform and is a powerful analysis tool for decomposing broadband wavefield data. A wide range of seismic wavelet applications have been reported over the last three decades, and the free Seismic Unix processing system now contains a code (succwt) based on the work reported here.

Introduction

The continuous wavelet transform (CWT) is one method of investigating the time-frequency details of data whose spectral content varies with time (non-stationary time series). Motivation for the CWT can be found in Goupillaud et al. [12], along with a discussion of its relationship to the Fourier and Gabor transforms. As a brief overview, we note that French geophysicist J. Morlet worked with non-stationary time series in the late 1970's to find an alternative to the short-time Fourier transform (STFT). The STFT was known to have poor localization in both time and frequency, although it was a first step beyond the standard Fourier transform in the analysis of such data. Morlet's original wavelet transform idea was developed in collaboration with theoretical physicist A. Grossmann, whose contributions included an exact inversion formula. A series of fundamental papers flowed from this collaboration [16, 12, 13], and connections were soon recognized between Morlet's wavelet transform and earlier methods, including harmonic analysis, scale-space representations, and conjugated quadrature filters. For further details, the interested reader is referred to Daubechies' [7] account of the early history of the wavelet transform.
  • Generating Time Series with Diverse and Controllable Characteristics
GRATIS: GeneRAting TIme Series with diverse and controllable characteristics
Yanfei Kang, Rob J Hyndman, and Feng Li

Abstract

The explosion of time series data in recent years has brought a flourish of new time series analysis methods, for forecasting, clustering, classification and other tasks. The evaluation of these new methods requires either collecting or simulating a diverse set of time series benchmarking data to enable reliable comparisons against alternative approaches. We propose GeneRAting TIme Series with diverse and controllable characteristics, named GRATIS, with the use of mixture autoregressive (MAR) models. We simulate sets of time series using MAR models and investigate the diversity and coverage of the generated time series in a time series feature space. By tuning the parameters of the MAR models, GRATIS is also able to efficiently generate new time series with controllable features. In general, as a costless surrogate to the traditional data collection approach, GRATIS can be used as an evaluation tool for tasks such as time series forecasting and classification. We illustrate the usefulness of our time series generation process through a time series forecasting application.

Keywords: Time series features; Time series generation; Mixture autoregressive models; Time series forecasting; Simulation.

1 Introduction

With the widespread collection of time series data via scanners, monitors and other automated data collection devices, there has been an explosion of time series analysis methods developed in the past decade or two. Paradoxically, the large datasets are often also relatively homogeneous in the industry domain, which limits their use for evaluation of general time series analysis methods (Keogh and Kasetty, 2003; Muñoz et al., 2018; Kang et al., 2017).
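The MAR building block that GRATIS tunes can be sketched as follows: at each step one mixture component is drawn, and the next value is generated from that component's AR recursion. The restriction to AR(1) components and the parameter names here are simplifying assumptions for illustration, not the paper's full MAR specification.

```python
import numpy as np

def simulate_mar(n, weights, phis, sigmas, x0=0.0, seed=0):
    """Simulate n points of a simple mixture autoregressive (MAR) series:
    at each step a component k is drawn with probability weights[k], and
    x_t = phis[k] * x_{t-1} + sigmas[k] * eps_t with eps_t ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        k = rng.choice(len(weights), p=weights)
        x[t] = phis[k] * x[t - 1] + sigmas[k] * rng.standard_normal()
    return x
```

Varying the mixture weights and the per-component AR coefficients and noise scales is what lets a generator of this kind sweep through different regions of a time series feature space.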
  • Fourier, Wavelet and Monte Carlo Methods in Computational Finance
Fourier, Wavelet and Monte Carlo Methods in Computational Finance
Kees Oosterlee (CWI, Amsterdam, and Delft University of Technology, the Netherlands)
AANMPDE-9-16, 7/7/2016
Joint work with Fang Fang, Marjon Ruijter, Luis Ortiz, Shashi Jain, Alvaro Leitao, Fei Cong, Qian Feng.

Agenda: derivatives pricing and the Feynman-Kac theorem; Fourier methods (basics of the COS method; basics of the SWIFT method); options with early-exercise features (the COS method for Bermudan options); the Monte Carlo method (BSDEs and the BCOS method, very briefly).

Feynman-Kac Theorem. The linear partial differential equation

    ∂v(t, x)/∂t + Lv(t, x) + g(t, x) = 0,    v(T, x) = h(x),

with operator

    Lv(t, x) = μ(x) Dv(t, x) + (1/2) σ²(x) D²v(t, x),

has the probabilistic representation

    v(t, x) = E[ ∫_t^T g(s, X_s) ds + h(X_T) ],

where X_s is the solution to the forward SDE

    dX_s = μ(X_s) ds + σ(X_s) dω_s,    X_t = x.
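The Feynman-Kac representation suggests a direct Monte Carlo estimator for v(t, x): simulate the forward SDE and average the terminal payoff. The sketch below handles the g = 0 case only, with an Euler-Maruyama discretization; it illustrates the theorem, not the COS or SWIFT methods of the talk.

```python
import numpy as np

def feynman_kac_mc(x, t, T, mu, sigma, h, n_paths=200_000, n_steps=50, seed=1):
    """Monte Carlo estimate of v(t, x) = E[h(X_T)] (the g = 0 case of
    the Feynman-Kac representation), simulating the forward SDE
    dX_s = mu(X_s) ds + sigma(X_s) dW_s with Euler-Maruyama steps."""
    rng = np.random.default_rng(seed)
    dt = (T - t) / n_steps
    X = np.full(n_paths, float(x))
    for _ in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal(n_paths)
        X = X + mu(X) * dt + sigma(X) * dW
    return h(X).mean()
```

For μ = 0, σ = 1, h(x) = x², the exact value is x² + (T - t), which gives a quick sanity check on the estimator.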
  • Wavelet Clustering in Time Series Analysis
Wavelet clustering in time series analysis
Carlo Cattani and Armando Ciancio

Abstract

In this paper we shortly summarize the many advantages of the discrete wavelet transform in the analysis of time series. The data are transformed into clusters of wavelet coefficients and rates of change of the wavelet coefficients, both belonging to a suitable finite-dimensional domain. It is shown that the wavelet coefficients are strictly related to the scheme of finite differences, thus giving information on the first and second order properties of the data. In particular, this method is tested on financial data, such as stock pricings, by characterizing the trends and the abrupt changes. The wavelet coefficients projected into the phase space give rise to characteristic clusters which are connected with the volatility of the time series. By their localization they represent a sufficiently good and new estimate of the risk.

Mathematics Subject Classification: 35A35, 42C40, 41A15, 47A58; 65T60, 65D07.
Key words: Wavelet, Haar function, time series, stock pricing.

1 Introduction

In the analysis of time series such as financial data, one of the main tasks is to construct a model able to capture the main features of the data, such as fluctuations, trends, jumps, etc. If the model both fits the already available data well and is expressed by some mathematical relations, then it can be tentatively used to forecast the evolution of the time series according to the given model. However, in general any reasonable model depends on an extremely large number of parameters, both deterministic and/or probabilistic [10, 18, 23, 2, 1, 29, 24, 25, 3], implying a complex set of complicated mathematical relations.
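The stated link between wavelet coefficients and finite differences is easy to see for the Haar wavelet: undecimated first-level detail coefficients are scaled first differences, so an abrupt change in a series shows up as one coefficient of large amplitude. A minimal sketch (illustrative only, not the authors' clustering procedure):

```python
import numpy as np

def haar_details(x):
    """Undecimated first-level Haar detail coefficients: the scaled
    first differences (x_n - x_{n+1}) / sqrt(2)."""
    x = np.asarray(x, dtype=float)
    return (x[:-1] - x[1:]) / np.sqrt(2)

# A jump in the series produces one detail coefficient of large
# amplitude at the jump location.
series = [1.0, 1.1, 0.9, 1.0, 9.0, 9.1, 8.9, 9.0]
jump = int(np.argmax(np.abs(haar_details(series))))
```

Because each coefficient is a local difference, thresholding their amplitudes separates small fluctuations from the trends and abrupt changes the abstract describes.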
  • Statistical Hypothesis Testing in Wavelet Analysis: Theoretical Developments and Applications to India Rainfall Justin A
Statistical Hypothesis Testing in Wavelet Analysis: Theoretical Developments and Applications to India Rainfall
Justin A. Schulte (Science Systems and Applications, Inc., Lanham, 20782, United States)
Correspondence to: Justin A. Schulte ([email protected])
Nonlin. Processes Geophys. Discuss., https://doi.org/10.5194/npg-2018-55. Manuscript under review for journal Nonlin. Processes Geophys. Discussion started: 13 December 2018. © Author(s) 2018. CC BY 4.0 License.

Abstract

Statistical hypothesis tests in wavelet analysis are reviewed and developed. The output of a recently developed cumulative area-wise test is shown to be the ensemble mean of individual estimates of statistical significance calculated from a geometric test assessing statistical significance based on the area of contiguous regions (i.e. patches) of point-wise significance. This new interpretation is then used to construct a simplified version of the cumulative area-wise test to improve computational efficiency. Ideal examples are used to show that the geometric and cumulative area-wise tests are unable to differentiate features arising from singularity-like structures from those associated with periodicities. A cumulative arc-wise test is therefore developed to test for periodicities in a strict sense. A previously proposed topological significance test is formalized using persistent homology profiles (PHPs) measuring the number of patches and holes corresponding to the set of all point-wise significance values. Ideal examples show that the PHPs can be used to distinguish time series containing signal components from those that are purely noise. To demonstrate the practical uses of the existing and newly developed statistical methodologies, a first comprehensive wavelet analysis of India rainfall is also provided.