Unsupervised Feature Learning from Time Series

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16) Unsupervised Feature Learning from Time Series , Qin Zhang⇤, Jia Wu⇤, Hong Yang§, Yingjie Tian† ‡, Chengqi Zhang⇤ ⇤ Quantum Computation & Intelligent Systems Centre, University of Technology Sydney, Australia † Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing, China. ‡ Key Lab of Big Data Mining & Knowledge Management, Chinese Academy of Sciences, Beijing, China. § MathWorks, Beijing, China qin.zhang@student.,jia.wu@, chengqi.zhang@ uts.edu.au, [email protected], [email protected] { } Abstract In this paper we study the problem of learning Shapelet 1 discriminative features (segments), often referred to as shapelets [Ye and Keogh, 2009] of time series, from unlabeled time series data. Discover- ing shapelets for time series classification has been widely studied, where many search-based algo- Shapelet 2 Shapelet 1 rithms are proposed to efficiently scan and select segments from a pool of candidates. However, such types of search-based algorithms may incur high Shapelet 2 time cost when the segment candidate pool is large. Alternatively, a recent work [Grabocka et al., 2014] uses regression learning to directly learn, instead Figure 1: The two blue thin lines represent the time series of searching for, shapelets from time series. Moti- data. The upper one is rectangular signals with noise and the vated by the above observations, we propose a new lower one is sinusoidal signals with noise. The curves marked Unsupervised Shapelet Learning Model (USLM) in bold red are learnt features(shapelets). We can observe that to efficiently learn shapelets from unlabeled time the learnt shapelets may differ from all candidate segments series data. The corresponding learning func- and thus are robust to noise. tion integrates the strengths of pseudo-class label, spectral analysis, shapelets regularization term and regularized least-squares to auto-learn shapelets, defined distance metric and the segments which best pre- pseudo-class labels and classification boundaries dict the class labels are selected as shapelets. Based on the simultaneously. A coordinate descent algorithm seminal work, a line of speed-up algorithms [Mueen et al., is used to iteratively solve the learning function. 2011][Rakthanmanon and Keogh, 2013][Chang et al., 2012] Experiments show that USLM outperforms search- have been proposed to improve the performance. All these based algorithms on real-world time series data. methods can be categorized as the search-based algorithms which scan a pool of candidate segments. For example, with the Synthetic Control dataset [Chen et al., 2015] which con- 1 Introduction tains 600 time series examples of length 60, the number of candidates for all lengths is 1.098 106. Time series classification has wide applications in fi- ⇥ nance [Ruiz et al., 2012], medicine [Hirano and Tsumoto, On the other hand, a recent work [Grabocka et al., 2014] 2006] and trajectory analysis [Cai and Ng, 2004]. The main proposes a new time series shapelet learning approach. In- challenge of time series classification is to find discriminative stead of searching for shapelets from a candidate pool, they features that can best predict class labels. To solve the chal- use regression learning and aim to learn shapelets from time lenge, a line of works have been proposed to extract discrimi- series. This way, shapelets are detached from candidate seg- native features, which are often referred to as shapelets, from ments and the learnt shapelets may differ from all the can- time series. Shapelets are maximally discriminative features didate segments. More importantly, shapelet learning is fast of time series which enjoy the merit of high prediction accu- to compute, scalable to large datasets, and robust to noise as racy and are easy to explain [Ye and Keogh, 2009]. There- shown in Fig. 1. fore, discovering shapelets has become an important branch We present a new Unsupervised Shapelet Learning Model in time series analysis. (USLM for short) that can auto-learn shapelets from unla- The seminal work on shapelet discovery [Ye and Keogh, beled time series data. We first introduce pseudo-class label 2009] resorts to a full-scan of all possible time series seg- to transform unsupervised learning to supervised learn- ments where the segments are ranked according to a pre- ing. Then, we use the popular regularized least-squares and 2322 spectral analysis approaches to learn both shapelets and clas- through regression learning. This type of learning method sification boundaries. A new regularization term is also added does not consider a limited set of candidates but can obtain to avoid similar shapelets being selected. A coordinate de- arbitrary shapelets. scent algorithm is used to iteratively solve the pseudo-class Unsupervised feature selection. Many unsupervised fea- label, classification boundary and shapelets. ture selection algorithms have been proposed to select infor- Compared to the search-based algorithms on unlabeled mative features from unlabeled data. A commonly used cri- time series data [Zakaria et al., 2012], our method provides a terion in unsupervised feature learning is to select features new tool that learns shapelets from unlabeled time series data. best preserving data similarity or manifold structure con- The contributions of the work are summarized as follows: structed from the whole feature space[Zhao and Liu, 2007] [ ] We present a new Unsupervised Shapelet Learning Cai et al., 2010 , but they fail to incorporate discrimina- • Model (USLM) to learn shapelets from unlabeled time tive information implied within data, which cannot be di- series data. USLM combines pseudo-class label, spec- rectly applied in our shapelet learning problem. Earlier un- tral analysis, shapelets regularization and regularized supervised feature selection algorithms evaluate the impor- least-squares for learning. tance of each feature individually and select features one by one [He et al., 2005][Zhao and Liu, 2007], with a limita- We empirically validate the performance of USLM on tion that correlation among features is neglected [Cai et al., • both synthetic and real-world data. The results show 2010]citezhao2010efficient˜ [Zhang et al., 2015]. promising results compared to the state-of-the-art unsu- State-of-the-art unsupervised feature selection algorithms pervised shapelets selection models. perform feature selection by simultaneously exploiting dis- The remainder of the paper is organized as follows: Sec- criminative information and feature correlation. Unsuper- tion 2 surveys related work. Section 3 gives the preliminar- vised Discriminative Feature Selection (UDFS) [Yang et al., ies. Section 4 introduces the proposed unsupervised shapelet 2011] aims to select the most discriminative features for data learning model USLM. Section 5 introduces the learning al- representation, where manifold structure is also considered. gorithm with analysis. Section 6 conducts experiments. We Since the most discriminative information for feature selec- conclude the paper in Section 7. tion is usually encoded in labels, it is very important to predict a good cluster indicators as pseudo labels for unsupervised feature selection. 2 Related Work Shapelets for clustering. Shapelets also have been uti- Shapelets [Ye and Keogh, 2009] are time series short seg- lized to cluster time series [Zakaria et al., 2012]. Zakaria ments that can best predict class labels. The basic idea of et al. [Zakaria et al., 2012] have proposed a method to use shapelets discovery is to consider all segments of training data unsupervised-shapelets (u-Shapelets) for time series cluster- and assess them regarding a scoring function to estimate how ing. The algorithm searches for u-Shapelets which can sep- predictive they are with respect to the given class labels [Wis- arate and remove a subset of time series from the rest of the tuba et al., 2015]. The seminal work [Ye and Keogh, 2009] dataset, then it iteratively repeats the search among the re- builds a decision tree classifier by recursively searching for maining data until no data remains to be separated. It is a informative shapelets measured by information gain. Based greedy search algorithm which attempts to maximize the gap on information gain, several new measures such as F-Stat, between the two groups of time series divided by a u-shapelet. Kruskall-Wallis and Mood’s median are used in shapelets se- The k-shape algorithm is proposed in the work [Paparri- lection [Hills et al., 2014][Lines et al., 2012]. zos and Gravano, 2015] to cluster time series. k-shape is a Since time series data usually have a large number of can- novel algorithm for shape-based time series clustering that didate segments, the runtime of brute-force shapelets selec- is efficient and domain independent. k-shape is based on a tion is infeasible. Therefore, a series of speed-up techniques scalable iterative refinement procedure which creates homo- have been proposed. On the one hand, there are smart im- geneous and well-separated clusters. Specifically, k-Shape plementations using early abandon of distance computations requires a distance measure that is invariant to scaling and and entropy pruning of the information gain heuristic [Ye shifting. It uses a normalized version of the cross-correlation and Keogh, 2009]. On the other hand, many speed-ups measure as distance measure to consider the shapes of time rely on the reuse of computations and pruning of the search series. Based on the normalized cross-correlation, the method space [Mueen et al., 2011], as well as pruning candidates by computes cluster centroids in every iteration to update the as- searching possibly interesting candidates on the SAX repre- signment of time series to clusters. sentation [Rakthanmanon and Keogh, 2013] or using infre- Our work differs from the above research problems. We quent shapelets [He et al., 2012]. Shapelets have been applied introduce a new approach for unsupervised shapelet learning in a series of real-world applications. to auto-learn shapelets from unlabeled time series by com- Shapelet learning.

Unsupervised Feature Learning from Time Series

Automatic Feature Learning Using Recurrent Neural Networks

Self-Discriminative Learning for Unsupervised Document Embedding

Exploring the Potential of Sparse Coding for Machine Learning

Q-Learning in Continuous State and Action Spaces

Text-To-Speech Synthesis

Reinforcement Learning Data Science Africa 2018 Abuja, Nigeria (12 Nov - 16 Nov 2018)

Unsupervised Speech Representation Learning Using Wavenet Autoencoders Jan Chorowski, Ron J

Sparse Feature Learning for Deep Belief Networks

A Review of Unsupervised Artificial Neural Networks with Applications

A Hybrid Model Consisting of Supervised and Unsupervised Learning for Landslide Susceptibility Mapping

4 Perceptron Learning

Unsupervised Pre-Training of a Deep LSTM-Based Stacked Autoencoder for Multivariate Time Series Forecasting Problems Alaa Sagheer 1,2,3* & Mostafa Kotb2,3