Adversarial Dynamic Shapelet Networks
The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

Qianli Ma,1 Wanqing Zhuang,1 Sen Li,1 Desen Huang,1 Garrison W. Cottrell2
1School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
2Department of Computer Science and Engineering, University of California, San Diego, CA, USA
[email protected], [email protected]

Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Shapelets are discriminative subsequences for time series classification. Recently, learning time-series shapelets (LTS) was proposed to learn shapelets by gradient descent directly. Although learning-based shapelet methods achieve better results than previous methods, they still have two shortcomings. First, the learned shapelets are fixed after training and cannot adapt to time series with deformations at the testing phase. Second, the shapelets learned by back-propagation may not be similar to any real subsequences, which is contrary to the original intention of shapelets and reduces model interpretability. In this paper, we propose a novel shapelet learning model called Adversarial Dynamic Shapelet Networks (ADSNs). An adversarial training strategy is employed to prevent the generated shapelets from diverging from the actual subsequences of a time series. During inference, a shapelet generator produces sample-specific shapelets, and a dynamic shapelet transformation uses the generated shapelets to extract discriminative features. Thus, ADSN can dynamically generate shapelets that are similar to the real subsequences rather than having arbitrary shapes. The proposed model has high modeling flexibility while retaining the interpretability of shapelet-based methods. Experiments conducted on extensive time series datasets show that ADSN is state-of-the-art compared to existing shapelet-based methods. Visualization analysis also shows the effectiveness of dynamic shapelet generation and adversarial training.

Introduction

Time series classification is a problem where a set of time series from different categories is given to an algorithm that learns to map each series to its corresponding category. Such problems are ubiquitous in daily life, including categorizing financial records, weather data, electronic health records, and so on. In recent years, a novel approach called shapelets (Ye and Keogh 2009) has attracted a great deal of attention in this domain. Shapelets are discriminative subsequences of time series, and have been successfully applied to time series classification tasks. The category of a time series is determined by the presence or absence of one or more shapelets somewhere in the whole series. Since time series from different categories are often distinguished by their local patterns rather than by global structure, shapelets are an effective approach to this problem. Moreover, shapelet-based methods can provide interpretable decisions (Ye and Keogh 2009), since the important local patterns found in the original subsequences are used to identify the category of a time series.

The original shapelet-based algorithm constructs a decision tree classifier by recursively searching for the best shapelet data split (Ye and Keogh 2009). The candidate subsequences are assessed by information gain, and the best shapelet is found at each node of the tree by enumerating all candidates. Lines et al. (Lines et al. 2012) proposed an algorithm called shapelet transformation that finds the top k shapelets in a single pass and then uses the shapelets to transform the original time series, where each attribute of the new representation is the distance between the original series and one of the shapelets. The transformed data can be used in conjunction with any sort of classifier. Thus it can achieve better performance while simultaneously reducing the run time.

Recently, Grabocka et al. (Grabocka et al. 2014) proposed a novel shapelet discovery approach called learning time-series shapelets (LTS), which uses gradient descent to learn shapelets directly rather than searching over a large number of candidates. Since LTS learns the shapelets jointly with the classifier, it further improves classification accuracy. Motivated by LTS, other learning-based shapelet methods (Shah et al. 2016; Zhang et al. 2016) have been proposed. However, although these learning-based methods achieve better performance than previous methods, they still have two main shortcomings. First, since the shapelets learned with these methods are fixed after training, they cannot deal with novel deformations of local patterns, which results in a failure to recognize important patterns at test time. Second, the original shapelet-based methods provide interpretability because the shapelets are based on discriminative subsequences; however, existing learning-based shapelet methods impose no constraint that the shapelets resemble real subsequences, which reduces their interpretability. The shapelets learned by such methods can take arbitrary shapes and even diverge from the real subsequences, contrary to the original intent of shapelet methods.

In this paper, we propose a novel model called Adversarial Dynamic Shapelet Networks (ADSNs). Inspired by Jia et al. (Jia et al. 2016), who proposed Dynamic Filter Networks (DFNs) that generate convolutional filters conditioned on the input data, ADSNs use a shapelet generator to generate a set of shapelets conditioned on subsequences of the input time series. Although the parameters of the shapelet generator are fixed after training, the generator can still produce different shapelets for different input samples at the test phase. The time series is then transformed by the sample-specific shapelets, and the new representations are passed through a softmax layer to calculate the final probability distribution over labels. We model the shapelet generating process as a two-player minimax game following the Generative Adversarial Network (GAN) (Goodfellow et al. 2014) approach. Specifically, a discriminator is trained to distinguish the generated shapelets from the actual subsequences of the input; therefore, the generated shapelets will be similar to the subsequences of the input time series. Moreover, we add a shapelet diversity regularization term (Zhang et al. 2018) to the objective function to increase the diversity of the generated shapelets and avoid mode collapse (Salimans et al. 2016). Our contributions can be summarized as follows:

• We propose a shapelet generator to dynamically produce shapelets that are sample-specific, improving the modeling flexibility and classification performance.

• To prevent the generated shapelets from taking arbitrary shapes, an adversarial training strategy is employed to ensure the generated shapelets are similar to the actual subsequences of the time series. To the best of our knowledge, this is the first work that uses an adversarial training strategy to learn shapelets.

• Our experimental results on a large number of time series datasets show that the proposed model achieves state-of-the-art performance, and the effectiveness of our model is demonstrated through visualization analysis.

Related Work

The original shapelet classifier has two major limitations. Grabocka et al. (Grabocka, Wistuba, and Schmidt-Thieme 2015) proposed pruning shapelets that are similar, and using a supervised selection process to choose shapelets based on how well they improve classification accuracy. Hou et al. (Hou, Kwok, and Zurada 2016) proposed a novel shapelet discovery approach that learns the shapelet positions by using a generalized eigenvector method and a fused lasso regularizer to obtain a sparse and "blocky" solution. This approach treats shapelet discovery as a numerical optimization problem and is faster than previous shapelet-based methods. Although the above-mentioned methods improved computational efficiency, there is still room for improving accuracy.

Shapelet Transformation. Instead of embedding shapelet discovery within a decision tree classifier, the Shapelet Transformation (ST) algorithm finds the most discriminative subsequences as shapelets in a single pass through the data (Lines et al. 2012). The shapelets are used to transform the time series data into new representations, in which each attribute is the distance of a time series to one of the shapelets.

Grabocka et al. (Grabocka et al. 2014) proposed using gradient descent to learn the top k shapelets directly, rather than searching among candidate subsequences. A shapelet transformation is then applied to the time series. Shah et al. (Shah et al. 2016) also used gradient descent to learn shapelets, but replaced the Euclidean distance measure with the Dynamic Time Warping (DTW) distance (Berndt and Clifford 1994). Zhang et al. (Zhang et al. 2016) used learned shapelets for a time series clustering task, applying the shapelet transformation before clustering.

Whether shapelets are obtained by searching or by learning, these shapelet-based methods use fixed shapelets after training and thus cannot adapt to time series with deformations, reducing modeling flexibility. In contrast, we propose a sample-specific shapelet learning model that dynamically generates shapelets according to the input series, and we use an adversarial training strategy to constrain the shapelets to be similar to the actual subsequences.
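The two-player minimax game mentioned above follows the standard GAN objective of Goodfellow et al. (2014). The generic form is shown below; in ADSN the role of the real data x is played by actual subsequences of the input series, and the generator is conditioned on the input rather than on noise alone, so this is the template rather than the paper's exact loss:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here D is trained to assign high scores to real subsequences and low scores to generated shapelets, while G is trained to fool D, pushing the generated shapelets toward the distribution of real subsequences.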
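The shapelet transformation discussed above reduces each series to a vector of shapelet distances. A minimal sketch follows; the function names and toy data are illustrative rather than taken from any of the cited papers, and the distance used is the common minimum Euclidean distance over all equal-length sliding windows:

```python
import numpy as np

def shapelet_distance(series, shapelet):
    # Minimum Euclidean distance between the shapelet and every
    # equal-length sliding window of the series.
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min()

def shapelet_transform(series, shapelets):
    # New representation: one distance attribute per shapelet.
    return np.array([shapelet_distance(series, s) for s in shapelets])

# Toy example: the first shapelet occurs exactly inside the series,
# so its distance attribute is 0.0; the second does not.
series = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
shapelets = [np.array([1.0, 2.0, 1.0]), np.array([5.0, 5.0, 5.0])]
features = shapelet_transform(series, shapelets)
print(features)  # first attribute is 0.0 (exact match)
```

The resulting feature vector can then be fed to any conventional classifier, which is the point made about ST above.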
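The contrast drawn above, fixed generator parameters yet sample-specific shapelets, can be illustrated with a toy generator. The linear map and the mean-pooled window summary below are stand-ins for the paper's actual network, introduced here only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_shapelets(series, window_len, n_shapelets, W, b):
    # Toy dynamic generator: even with fixed parameters (W, b),
    # different input series yield different shapelets because the
    # generator is conditioned on the input's own sliding windows.
    windows = np.lib.stride_tricks.sliding_window_view(series, window_len)
    summary = windows.mean(axis=0)        # (window_len,) summary of subsequences
    out = W @ summary + b                 # (n_shapelets * window_len,)
    return out.reshape(n_shapelets, window_len)

window_len, n_shapelets = 5, 3
W = rng.standard_normal((n_shapelets * window_len, window_len))
b = np.zeros(n_shapelets * window_len)

series_a = rng.standard_normal(20)
series_b = rng.standard_normal(20)
shapelets_a = generate_shapelets(series_a, window_len, n_shapelets, W, b)
shapelets_b = generate_shapelets(series_b, window_len, n_shapelets, W, b)
print(shapelets_a.shape)  # (3, 5)
```

The two inputs produce two different shapelet sets from the same frozen parameters, which is what lets a dynamic model adapt to deformations at test time.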