Wavelet Entropy-Based Evaluation of Intrinsic Predictability of Time Series

Chaos ARTICLE scitation.org/journal/cha Wavelet entropy-based evaluation of intrinsic predictability of time series Cite as: Chaos 30, 033117 (2020); doi: 10.1063/1.5145005 Submitted: 13 January 2020 Accepted: 17 February 2020 · · Published Online: 9 March 2020 View Online Export Citation CrossMark Ravi Kumar Guntu,1 Pavan Kumar Yeditha,2 Maheswaran Rathinasamy,2 MatjažPerc,3,4,5 Norbert Marwan,6 Jürgen Kurths,6,7 and Ankit Agarwal1,a) AFFILIATIONS 1Department of Hydrology, Indian Institute of Technology Roorkee, Roorkee 247667, India 2Department of Civil Engineering, MVGR College of Engineering, Vizianagaram 535005, India 3Faculty of Natural Sciences and Mathematics, University of Maribor, 2000 Maribor, Slovenia 4Center for Applied Mathematics and Theoretical Physics, University of Maribor, 2000 Maribor, Slovenia 5Department of Medical Research, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan 6Potsdam Institute for Climate Impact Research, Telegrafenberg, 14412 Potsdam, Germany 7Institute of Physics, Humboldt University, 12489 Berlin, Germany Note: This article is part of the Focus Issue, Rare Events in Complex Systems: Understanding and Prediciton. a)Author to whom correspondence should be addressed: [email protected] ABSTRACT Intrinsic predictability is imperative to quantify inherent information contained in a time series and assists in evaluating the performance of different forecasting methods to get the best possible prediction. Model forecasting performance is the measure of the probability of success. Nevertheless, model performance or the model does not provide understanding for improvement in prediction. Intuitively, intrinsic predictability delivers the highest level of predictability for a time series and informative in unfolding whether the system is unpredictable or the chosen model is a poor choice. We introduce a novel measure, the Wavelet Entropy Energy Measure (WEEM), based on wavelet transformation and information entropy for quantification of intrinsic predictability of time series. To investigate the efficiency and reliability of the proposed measure, model forecast performance was evaluated via a wavelet networks approach. The proposed measure uses the wavelet energy distribution of a time series at different scales and compares it with the wavelet energy distribution of white noise to quantify a time series as deterministic or random. We test the WEEM using a wide variety of time series ranging from deterministic, non-stationary, and ones contaminated with white noise with different noise-signal ratios. Furthermore, a relationship is developed between the WEEM and Nash–Sutcliffe Efficiency, one of the widely known measures of forecast performance. The reliability of WEEM is demonstrated by exploring the relationship to logistic map and real-world data. Published under license by AIP Publishing. https://doi.org/10.1063/1.5145005 This study explores the application of wavelet energy I. INTRODUCTION function and entropy for possible quantification of intrinsic Time series predictability is a measure of how well future values predictability of a time series in terms of the Wavelet Energy of a time series can be forecasted,1 where a time series is a sequence Entropy Measure. One of the advantages of the proposed mea- of observations Xt,t 1, 2, 3, ...,N. A time series generated by sure is that it considers the dynamics of the process spread a deterministic process{ = has high predictability,} and its future val- across different time scales, which other similarity measures ues can be forecasted very well from its past values. However, the of predictability have not considered explicitly. Furthermore, one by an uncorrelated stochastic process has low predictability, the proposed measure is linked to forecasting performances. and its past values provide only a statistical characterization of The proposed measure can be used for estimating the intrin- the future values.2 According to Pennekamp et al., predictability sic predictability of a time series, understanding the capabil- can be classified into realized (achieved predictability of a system ity of models in capturing the underlying system, and among from a given forecasting model) and intrinsic (maximum achievable others. predictability of a system4) predictability.3 The former indicates the Chaos 30,033117(2020);doi:10.1063/1.5145005 30,033117-1 Published under license by AIP Publishing. Chaos ARTICLE scitation.org/journal/cha forecast performance of models and is generally quantified as the (WEEM). The proposed measure is validated using a set of con- correlation coefficient among predicted and observed values or its structed signals ranging from white noise to deterministic processes. complement error measures such as Nash–Sutcliffe Efficiency. Dif- Furthermore, we connect WEEM to well-known forecast measure ferent models will provide different levels of realized predictability (Nash–Sutcliffe Criteria) using a state-of-the-art forecasting model and just from the realized predictability, it is impossible to say based on Wavelet Neural Networks. This link between intrinsic whether the system is complex or the model is a poor choice. By con- predictability and realized predictability helps in comparing the trast, the intrinsic predictability of a time series (or the underlying intrinsic predictability of different systems and provides a guideline system) is an absolute measure that represents the highest achievable on model selection. To test the strength of the relationship, we apply predictability.1,4 For instance, a time series having a complete predic- it to logistic map time series and 4500 real-world rainfall time series tive structure has high intrinsic predictability. Periodic and constant of the Indian subcontinent. We select monsoon rainfall since it is signals fall in this category. On the contrary, a fully complex signal one of the most critical climate element affecting the livelihood and (e.g., white noise) that is generated by an uncorrelated process has wellbeing of the society. The variability, strength, onset, and with- no intrinsic predictability and attains low forecasting performance drawal of monsoonal rainfall have an enormous effect on Indian for deterministic chaotic processes with low correlation (e.g., logistic agriculture, economy, life, and prosperity of the inhabitants of the map for certain parameters). Indian subcontinent. Consequently, understanding the mechanisms Generally, the intrinsic predictability can be measured using of the Indian monsoon and its successful forecasting is not only model-free measures of time series complexity. Some of these a question of great scientific interest but also a significant societal measures include Lyapunov exponents,5 recurrence measures,6–8 challenge. network measures,64 and information entropic measures.9,10 Com- The remainder of the paper is organized as follows: Section II paratively, entropic measures have received more popularity owing describes the development of the proposed measure, and Sec. III to its simplicity in calculating the intrinsic predictability, in introduces the selected model signals and real-time series. The particular within the fields of human mobility,11 stock price results are discussed in Sec. IV. Our conclusions are summarized returns,12 patient health records,13,14 and also in nonlinear climatic in Sec. V. systems.15–19 The range of entropy varies from highest for a white noise process and to lowest in case of the deterministic chaotic II. METHODS processes (e.g., logistic map for certain parameters).20,21 Indeed, entropy measures have contributed significantly to our Here, we describe the methodology for the proposed WEEM. existing knowledge on time series predictability. Still, there is a Here, we combine two already well-established approaches, infor- substantial scope of advancement, in particular, while considering mation entropy and continuous wavelet transform (CWT), to quan- the underlying system whose dynamics are spread over different tify the time series predictability. The following sub-sections briefly time scales. In such systems, it is entirely plausible that determin- introduce Information entropy and wavelets and subsequently pro- istic features present at a specific scale might be concealed while vide the mathematical framework for WEEM. examining the process at a different time scale. Doss-Gollin et al. stressed the necessity of considering the multiscale dynamics across A. Information entropy different time scales for robust adaptation.21 Maheswaran and Khosa emphasized the need for multiple-scale analysis for effective fore- The Information entropy originally proposed by Shannon37 is casting of hydrological processes.22 According to Barrat et al., described as specific climate-related processes are highly complex due to the N scale-emergent occurrences owing to the nonlinear dynamic pro- IE(X) P(xi)log2[P(xi)], (1) i 1 23 =− = cess and long-memory temporal connections. These studies and ! several others24–28 indicate the presence of features of multiple time where X x ,x,x,x ... X is the sample space with all possi- ={ 1 2 3 4, N} scale phenomena, and these often get camouflaged in the observed ble states; P(xi) is the probability of occurrence of the random event N time series. Li and Zhang showed that the traditional measures based xi;0 P(xi) 1; i 1 P(xi) 1; log2P(xi) is the information of 29 ≤ ≤ = = − on entropy fail to estimate the predictability at multiple time scales. uncertainty

Wavelet Entropy-Based Evaluation of Intrinsic Predictability of Time Series

Lecture 19: Wavelet Compression of Time Series and Images

Wavelet Entropy Energy Measure (WEEM): a Multiscale Measure to Grade a Geophysical System's Predictability

Adaptive Wavelet Clustering for Highly Noisy Data

A Fourier-Wavelet Monte Carlo Method for Fractal Random Fields

Biostatistics for Oral Healthcare

A Practical Guide to Wavelet Analysis

An Overview of Wavelet Transform Concepts and Applications

Big Data for Reliability Engineering: Threat and Opportunity

Volatility Estimation of Stock Prices Using Garch Method

Fourier, Wavelet and Monte Carlo Methods in Computational Finance

A Python Library for Neuroimaging Based Machine Learning

BIOSTATS Documentation