
Master thesis

Unsupervised segmentation of biosignal-based time-series

Knut Joar Strømmen

Robotics and Intelligent Systems, 60 credits

Department of Informatics
Faculty of Mathematics and Natural Sciences

UNIVERSITY OF OSLO

Spring 2020


Abstract

The rise of the Internet of Things (IoT) and the development of more compact and less power-hungry sensors have led to an increasing amount of data from various modalities. To analyze large amounts of continuously recorded time series data at scale, it is often beneficial to separate the data into homogeneous segments before further analysis or classification. One common way to do time series segmentation is to find the transitions that separate the different segments. These transitions are referred to in the literature as changepoints, and the problem of discovering them is referred to as change point detection (CPD). Long time series data is hard and time-consuming to label; thus, having a CPD algorithm that works in an unsupervised fashion is highly beneficial. This thesis provides an overview of automatic change point detection, time series segmentation, and previous research concerning representation learning in the context of CPD. The main contribution of this thesis is the Latent Space Unsupervised Semantic Segmentation (LS-USS) model, an unsupervised segmentation algorithm that is hyperparameter-light, domain agnostic, and works well with multidimensional data. The model utilizes an autoencoder to extract constrained feature encodings from time series data before using the similarity between feature encodings that are neighbors in time to do changepoint detection and segmentation. The LS-USS model is compared with other state-of-the-art algorithms on multiple datasets. The results show that the model performs on par with or better than other state-of-the-art models. To demonstrate a practical application of LS-USS, a dataset containing activity data from non-depressed and depressed participants is used. Using the LS-USS algorithm as a preprocessing step before classification significantly outperforms the previous best classification results on this dataset.


Table of Contents

Abstract
Preface
1 Introduction
1.1 Applications of Change Point Detection
1.2 Mental Health Monitoring Systems
1.3 Summary of the Work
1.4 Chapters
1.4.1 Background
1.4.2 Methods
1.4.3 Datasets
1.4.4 Experiments
2 Background
2.1 Time series change point detection
2.1.1 Wanted Features
2.1.2 Supervised CPD methods
2.1.3 Unsupervised CPD methods
2.2 Matrix profiles
2.2.1 Mueen’s algorithm for similarity search
2.2.2 Scalable Time-series Anytime Matrix Profile
2.3 Fast Low-cost Semantic Segmentation
2.3.1 Region Extraction Algorithm
2.3.2 Fast Low-cost Online Semantic Segmentation
2.3.3 Locality – Temporal constraint
2.3.4 Multidimensional time series data
2.4 Representation Learning
2.4.1 Representation learning for change point detection
2.4.2 Time Series Segmentation through Automatic Feature Learning
2.5 Evaluation Metrics
2.6 Spectral density estimation
2.6.1 Periodograms
2.6.2 Windowed Periodograms
2.6.3 Welch’s method


3 Methods
3.1 Data Scalers
3.1.1 NoScaler
3.1.2 StandardScaler
3.1.3 MinMaxScaler
3.1.4 RobustScaler
3.2 Change point detection algorithms
3.2.1 Components
3.2.2 Latent Space Unsupervised Semantic Segmentation
3.2.3 FLUSS and FLOSS Implementation
3.2.4 TSSTAFL Implementation
3.2.5 FLUSS-Welch
3.3 Change Point Extraction Algorithms
3.3.1 Regime Extraction Algorithm Implementation
3.3.2 Local Regime Extractor Algorithm
3.3.3 Local Threshold Extraction Algorithm
4 Datasets
4.1 UCI Human Activity Recognition Using Smartphones Dataset
4.2 Expressive Motion With Dancers
4.3 Long-Term 3DC Dataset
4.4 Depresjon
4.5 UCR Time Series Semantic Segmentation Archive
5 Experiments
5.1 Comparisons Done on Long Multidimensional Data
5.1.1 Experiments
5.1.2 Discussion
5.2 Improving Depression Detection Using Natural Segments
5.2.1 Experiments
5.2.2 Discussion
5.3 Automatic subsequence size using FLUSS-Welch
5.3.1 Experiment
5.3.2 Discussion
6 Conclusion


6.1 Future work
7 References


List of Figures FIGURE 1: EXAMPLE OF A CHANGEPOINT OCCURRENCE AT TIMESTEP 25,000. FROM A DATASET IN THE UCR TIME SERIES SEMANTIC SEGMENTATION ARCHIVE [2] WHERE A SUBJECT WAS LYING ON A TILT TABLE WITH FOOT SUPPORT. AT TIMESTEP 25,000 THE TILT TABLE WAS RAPIDLY ROTATED TO THE STAND-UP POSITION. THE FIGURE SHOWS THE VOLUNTEERS ABP (ARTERIAL BLOOD PRESSURE)...... 16 FIGURE 2: UNSUPERVISED METHODS FOR CHANGE POINT DETECTION. THE PAPER [4] PROVIDES AN OVERVIEW OF ALL THE METHODS DEPICTED IN THE DIAGRAM...... 24 FIGURE 3: PLOT DEPICTING TYPICAL CLUSTER CENTERS GENERATED BY STS-CLUSTERING. FROM [17] ...... 25 FIGURE 4: PLOT SHOWING THE NOGUNGUN DATASET FROM THE UCR TIME SERIES ARCHIVE...... 26 FIGURE 5: PLOT SHOWING THE RESULTS OF AVERAGING OVER A SELECTION OF SUBSEQUENCES EXTRACTED FROM THE NOGUNGUN DATASET. THE DATA IS SCALED TO ZERO MEAN FOR THE WHOLE DATASET...... 26 FIGURE 6: EXAMPLE FROM [18] OF A MOTIF (MARKED IN BOLD ) THAT OCCURS FOUR TIMES. FROM [17] ...... 27 FIGURE 7: THE MATRIX IS A MATRIX CONSTRUCTED FROM ALL THE DISTANCE PROFILES, WHICH SHOWS HOW SIMILAR EACH SUBSEQUENCE IN THE TIMES SERIES IS TO EVERY OTHER SUBSEQUENCE IN THE SAME TIME SERIES. IF ONE TAKES THE MINIMUM DISTANCE OF EACH ROW, ONE WILL END UP WITH THE MATRIX PROFILE. THE ORIGINAL TIME SERIES IS SHOWN IN YELLOW AND THE MATRIX PROFILE IN GREEN. THE EXCLUSION ZONE IS TO AVOID TRIVIAL MATCHES...... 28 FIGURE 8: PLOT DEPICTING A TIME SERIES AND ITS CORRESPONDING MATRIX PROFILE. THE TIME SERIES PRESENTED IS AN EXCERPT FROM THE DATASET NOGUNGUN FROM THE UCR TIME SERIES ARCHIVE...... 29 FIGURE 9: OUTLINE OF THE SLIDINGDOTPRODUCT [20] ...... 30 FIGURE 10: OUTLINE OF MASS ...... 31 FIGURE 11: OUTLINE OF THE STAMP ALGORITHM ...... 32 FIGURE 12: OUTLINE OF THE STAMPI ALGORITHM ...... 33 FIGURE 13: VISUALISING THE ARC-CROSSINGS. FROM [12] ...... 33 FIGURE 14: THE TOP PLOTS SHOWS THE ABP OF A RECLINING MALE. AT TIME 2400 HE WAS ROTATED TO A STANDING POSITION. THE BOTTOM PLOT SHOWS THE CORRESPONDING AC. FROM [12]...... 34 FIGURE 15: THE TOP PLOT SHOWS THE EPIRICAL VS THEORETICAL IAC. THE BOOTOM PLOT SHOWS THE CORRECTED ARC CURVE (CAC) AND HERE WE CAN THAT THE ARC CURVE IS CLOSE TO 0 WHEN THERE IS A CHANGE POINT. FROM [12] ...... 35 FIGURE 16: OUTLINE OF THE REA ALGORITHM ...... 36 FIGURE 17: THE TOP PLOTS SHOW CHANNEL 0 OF SUBJECT 14 FROM THE DATASET “A PUBLIC DOMAIN DATASET FOR HUMAN ACTIVITY RECOGNITION USING SMARTPHONES” [22]. THE BOTTOM PLOTS SHOW THE CAC PRODUCED BY THE FLUSS ALGORITHM. THE TWO LEFT PLOTS SHOW THE PREDICTED CHANGEPOINT WHEN NOT USING AN EXCLUSION ZONE, WHILE THE RIGHT PLOTS SHOW THE PREDICTED CHANGE POINTS WHEN USING AN EXCLUSION ZONE. WHEN THERE IS NO EXCLUSION ZONE, ALL CHANGEPOINTS END UP IN THE SAME “VALLEY”...... 36 FIGURE 18: THE BI DIRECTIONAL IAC VS. THE ONE-DIRECTIONAL IAC (IAC1D). FROM [12] ...... 37 FIGURE 19: TOP PLOT SHOWS THE ACCELEROMETER DATA OF A WHO FROM PAMAP DATASET. PERSON ACENDING STARIS, DESCENDING STAIRS AND ANOTHER SEGMENT OF ASCENDING STAIRS AGAIN WITH TRASISTIONAL ACTIVITIES WHICH INCLUDES SOME NOISE DATA IN BETWEEN. MIDDLE PLOT SHOWS THE CAC WHEN NOT INCLUDING A TEMPORAL CONSTRAINT. WHILE THE BOTTOM PLOT SHOWS THE CAC PRODUCED WHEN USETTING THE TEMPORAL CONSTRAINT TO 1000, 6000 AND 8000. FROM [12] ...... 37 FIGURE 20: EXAMPLE OF MULTIDIMENSIONAL TIME SERIES SEGMENTATION OF THREE ACTIVITIES: “BASKETBALL FORWARD DRIBBLE”, “WALK WITH WILD LEGS” AND “NORMAL WALK”. COMBINING THE TWO CACS, BY TAKING THE MEAN, MAKES IT POSSIBLE TO SEGMENT ALL THREE ACTIVITIES. FROM [12] ...... 
38 FIGURE 21: ALEXNET [23]. EXAMPLE OF A DEEP CONVOLUTIONAL NEURAL NETWORK. THE EARLIER LAYERS WILL LEARN TO DETECT LOW-LEVEL FEATURES, WHILE THE LATER LAYERS WILL COMBINE THESE LOWER-LEVEL FEATURES INTO HIGH-LEVEL FEATURES. . 39 FIGURE 22: EXAMPLE OF A BASIC AUTOENCODER ARCHITECTURE. THE IMAGE OUTPUTTED BY THE NETWORK CLOSELY RESEMBLES THE INPUT IMAGE. THE LATENT LAYER IN THE MIDDLE OF THE NETWORK WOULD HERE SERVE AS A GOOD LOW DIMENSIONAL ENCODING OF THE INPUT IMAGE...... 40 FIGURE 23: BLOCK DIAGRAM SHOWING THE MODULES USED IN THE ARTICLE “REPRESENTATION LEARNING FOR UNSUPERVISED HETEROGENEOUS MULTIVARIATE TIME SERIES SEGMENTATION AND ITS APPLICATION” [31]. IT SHOWS REPRESENTATION


LEARNING BEING USED TO COMBINE CHANNEL DATA FROM DIFFERENT SOURCES BEFORE SEGMENTATION. IN THE ARTICLE, THEY USED THE WORD EMBEDDING MODEL WORD2VEC FOR THE REPRESENTATION LEARNING MODULE...... 41 FIGURE 24: MODEL DEPICTION FROM THE PAPER “TIME SERIES SEGMENTATION THROUGH AUTOMATIC FEATURE LEARNING” [12] 42 FIGURE 25: OUTLINE OF THE BREAKPOINT SEGMENTATION ALGORITHM ...... 43 FIGURE 26: EXAMPLE FROM THE ARTICLE "TIME SERIES SEGMENTATION THROUGH AUTOMATIC FEATURE LEARNING"[36] (DETAILED IN SECTION 2.4.2) WHERE THEY MADE A ROC CURVE TO HELP EVALUATE THE BEST MODEL DEPTH FOR AN AUTOENCODER BASED ON THE PERFORMANCE ON THE UCI DATASET [33]...... 45 FIGURE 27: A POWER SPECTRUM OF TWO SINUSOIDS. FROM THE PERIODOGRAM IT IS EASY TO SEE THAT THE SINUSOIDS HAVE FREQUENCIES AROUND 35HZ AND 45HZ...... 47 FIGURE 28: WINDOW FUNCTIONS IN TIME AND FREQUENCY DOMAIN. FROM LECTURE “SPECTRAL ESTIMATION I” IN INF4480, SLIDE 40 ...... 48 FIGURE 29: TWO CLOSE SINUSOIDS WITH FREQUENCIES AT 0.145 AND 0.150 WITH ADDED WHITE GAUSSIAN NOISE. IT SHOWS THE EFFECT OF USING DIFFERENT WINDOW FUNCTIONS. FROM LECTURE “SPECTRAL ESTIMATION I” IN INF4480 ,SLIDE 41...... 49 FIGURE 30: VISUAL DEPICTION OF HOW THE TIME SERIES IS DIVIDED INTO SUB-PARTS WHEN USING THE WELCH METHOD ...... 49 FIGURE 31: FLOW CHART DEPICTING THE DIFFERENT MODEL COMPONENTS USED FOR EACH OF THE CPD ALGORITHMS. THE COLORED ALGORITHMS ARE PART OF THIS THESIS CONTRIBUTION...... 52 FIGURE 32: FULLY CONNECTED MODEL FOR AN INPUT SIZE OF 30. INSPIRED BY THE PAPER “TIME SERIES SEGMENTATION THROUGH AUTOMATIC FEATURE LEARNING” [37]...... 53 FIGURE 33:: THE TRANSPOSED OF CONVOLVING A 3X3 KERNEL OVER A 5X5 INPUT PADDED WITH A 1X1 BORDER USING TWO 2X2 STRIDES. THIS EXAMPLES SHOWS 2D CONVOLUTION, BUT THE SAME PRINCIPLES APPLY TO 1D CONVOLUTION—EXAMPLE FROM THE JOURNAL ARTICLE [41]...... 54 FIGURE 34: THE TOP PLOTS SHOW ALL TEN CHANNELS OF TWO SUBSEQUENCES OF SIZE 52 TAKEN FROM THE LONG-TERM 3DC DATASET [42]. THE MIDDLE PLOTS SHOW THE LATENT REPRESENTATION OF THE SUBSEQUENCES. THE BOTTOM PLOTS SHOW THE RECONSTRUCTED SUBSEQUENCE. THE AUTOENCODER USED IS THE CONVOLUTIONAL MODEL...... 55

FIGURE 35: EXAMPLE OF THE FULL DISTANCE MATRIX USED TO CALCULATE MATRIX PROFILE PFF FROM A TIME SERIES OF LENGTH 4000. CALCULATING THE MATRIX PROFILE FOR THIS LENGTH IS NOT THAT TIME-CONSUMING ON A MODERN GPU, BUT HANDLING LARGER TIME SERIES WOULD GET DIFFICULT AS THE DISTANCE MATRIX WOULD REQUIRE EXPONENTIALLY MORE SPACE AND CALCULATION TIME...... 56 FIGURE 36: RESULTS OF ONLY CALCULATING THE INSIDE A TEMPORAL CONSTRAINT TC OF 800. TO EXCLUDE TRIVIAL MATCHES, THE LINE ALONG THE DIAGONAL IS ALSO NOT CALCULATED...... 57 FIGURE 37: SHOWS THE COLLAPSE OF THE DISTANCE MATRIX INTO THE MATRIX PROFILE. THIS IS DONE BY EXTRACTING THE MINIMUM VALUE OF EACH ROW. (P.S THE MATRIX PROFILE SHOWN HERE IS JUST FOR VISUALIZATION AND DOES NOT REPRESENT THE ACTUAL MATRIX PROFILE) ...... 58 FIGURE 38: OUTLINE OF THE ALGORITHM BATCHED COLLAPSE ...... 58 FIGURE 39: SHOWS THE ALGORITHM BATCHED COLLAPSE ON A TIME SERIES OF LENGTH 10400. THE CALCULATION IS DIVIDED INTO THREE BATCHES TO REDUCE MEMORY REQUIREMENTS FOR LONG-TIME-SERIES DATA. EACH BATCH PRODUCES ONE MATRIX PROFILE. THE ORANGE SQUARES SHOW THE AREA THAT MUST BE RECALCULATED WHEN DOING A “BATCH COLLAPSE”...... 59 FIGURE 40: TOP FIGURE SHOWS THE TEMPORARILY CONSTRAINED MATRIX PROFILE OF A TIME SERIES OF LENGTH 10400. THE SECOND FIGURE THE THREE BATCHED MATRIX PROFILES CALCULATED. THE BOTTOM FIGURE SHOWS THE FULL MATRIX PROFILE AFTER TAKING THE MINIMUM VALUE IN THE OVERLAPPING AREA...... 59 FIGURE 41: SHOWS THE ONLINE VERSION OF LSMP ON A TIME SERIES OF LENGTH 10400. NO RECALCULATION IS NEEDED WHEN UPDATING THE DISTANCE MATRIX ...... 60 FIGURE 42: TOP FIGURE SHOWS THE ORIGINAL "SMART CANE" TIME SERIES FROM THE UCR DATASET. THE MIDDLE FIGURE SHOWS THE PERIODOGRAM OF THE TIME SERIES. THE BOTTOM FIGURE SHOWS THE PERIODOGRAM CREATED USING WELCH'S METHOD .... 61 FIGURE 43: OVERVIEW OF THE LS-USS AND LS-USS ONLINE ALGORITHMS...... 62 FIGURE 44: OVERVIEW OF THE FLUSS AND FLOSS ALGORITHMS...... 62 FIGURE 45: OVERVIEW OF THE TSSTAFL ALGORITHM...... 63 FIGURE 46: OVERVIEW OF THE FLUSS-WELCH ALGORITHM...... 63 FIGURE 47: THE TOP PLOT SHOWS THE ORIGINAL CAC, WHILE THE BOTTOM PLOT SHOWS THE SCALED CAC...... 64 FIGURE 48: OUTLINE OF THE LREA ALGORITHM ...... 65 9

FIGURE 49: THE TOP PLOT SHOW CHANNEL0 AND THE TRUE CP LOCATIONS OF A SUBJECT IN THE UCI DATASET. THE SECOND PLOT SHOW THE ASSOCIATED CAC. THE THIRD PLOT SHOW THE SCALED CAC. THE FOURTH PLOT SHOW THE THRESHOLDED CAC WITH THE PREDICTED CHANGEPOINTS IN EACH “VALLEY”. THE LAST PLOT SHOWS THE PREDICTED CHANGEPOINTS IN CONJUCTION WITH CHANNEL0 AND THE TRUE LOCATIONS...... 66 FIGURE 50: OUTLINE OF THE LTEA ALGORITHM ...... 67 FIGURE 51: CHANNEL ZERO AND THREE IN A TIME SERIES FROM THE UCI DATASET ...... 69 FIGURE 52: CHANNEL ZERO AND THREE IN A TIME SERIES FROM THE DANCE DATASET ...... 70 FIGURE 53: CHANNEL ZERO AND THREE IN A TIME SERIES FROM THE EMG DATASET ...... 71 FIGURE 54: GESTURES USED IN THE LONG-TERM 3DC DATASET DATASET. FROM [46] ...... 72 FIGURE 55: EXAMPLES OF CONTROL AND CONDITION PARTICIPANTS FROM THE DEPRESJON DATASET...... 73 FIGURE 56: TREE SUB-DATASETS FROM THE UCR TIME SERIES SEMANTIC SEGMENTATION ARCHIVE[3] . THE FIRST PLOT SHOWS THE CANE DATASET. THE SECOND PLOT SHOWS THE PULSUSPARADOXUS DATASET. THE LAST PLOT SHOWS THE NOGUNGUN DATASET, AND THE IMAGES ABOVE SHOW HOW IT WAS MADE...... 75 FIGURE 57: BOXPLOT SHOWING THE PERFORMANCE OF THE OFFLINE SEGMENTATION ALGORITHMS ON THE UCI DATASET ...... 80 FIGURE 58: BOXPLOT SHOWING THE PERFORMANCE OF THE ONLINE SEGMENTATION ALGORITHMS ON THE UCI DATASET ...... 80 FIGURE 59: TOP: CHANNEL 0 OF SUBJECT 4 IN THE UCI DATASET. MIDDLE: SEGMENTATION USING LS-USS – LREA- BOTTOM: SEGMENTATION USING LS-USS – LTEA...... 81 FIGURE 60: BOXPLOT SHOWING THE PERFORMANCE OF THE OFFLINE SEGMENTATION ALGORITHMS ON THE DANCE ARTIFICIAL DATASET ...... 82 FIGURE 61: BOXPLOT SHOWING THE PERFORMANCE OF THE ONLINE SEGMENTATION ALGORITHMS ON THE DANCE ARTIFICIAL DATASET ...... 82 FIGURE 62: TOP: CHANNEL 0 OF D41 IN THE DANCE ARTIFICIAL DATASET. MIDDLE: SEGMENTATION USING LS-USS – LREA- BOTTOM: SEGMENTATION USING LS-USS – LTEA...... 83 FIGURE 63: BOXPLOT SHOWING THE PERFORMANCE OF THE OFFLINE SEGMENTATION ALGORITHMS ON THE DANCE DATASET ...... 84 FIGURE 64: BOXPLOT SHOWING THE PERFORMANCE OF THE ONLINE SEGMENTATION ALGORITHMS ON THE DANCE DATASET ...... 84 FIGURE 65: TOP: CHANNEL 0 OF D24 IN THE DANCE DATASET. MIDDLE: SEGMENTATION USING LS-USS – LREA- BOTTOM: SEGMENTATION USING LS-USS – LTEA...... 85 FIGURE 66: BOXPLOT SHOWING THE PERFORMANCE OF THE OFFLINE SEGMENTATION ALGORITHMS ON THE EMG ARTIFICIAL DATASET ...... 86 FIGURE 67: BOXPLOT SHOWING THE PERFORMANCE OF THE ONLINE SEGMENTATION ALGORITHMS ON THE EMG ARTIFICIAL DATASET ...... 87 FIGURE 68: TOP: CHANNEL 0 OF GESTURE19 IN THE EMG ARTIFICIAL DATASET. MIDDLE: SEGMENTATION USING LS-USS – LREA- BOTTOM: SEGMENTATION USING LS-USS – LTEA...... 87 FIGURE 69: BOXPLOT SHOWING THE PERFORMANCE OF THE OFFLINE SEGMENTATION ALGORITHMS ON THE EMG DATASET ...... 88 FIGURE 70: BOXPLOT SHOWING THE PERFORMANCE OF THE ONLINE SEGMENTATION ALGORITHMS ON THE EMG DATASET...... 89 FIGURE 71: TOP: CHANNEL 0 OF PARTICIPANT3_EVALUATION3 IN THE EMG DATASET. MIDDLE: SEGMENTATION USING LS-USS – LREA- BOTTOM: SEGMENTATION USING LS-USS – LTEA...... 89 FIGURE 72: TOP PLOT SHOW 13 DAYS OF ACTIVITY DATA FROM A PARTICIPANT. BOTTOM PLOT SHOWS THE CAC OUTPUTTED BY LS- USS. NOTICE THAT ON THIS PLOT, THE STRIPED GREEN LINES DO NOT REPRESENT THE GROUND TRUTH CHANGEPOINTS BUT A CHANGE OF DATE...... 94 FIGURE 73: BOXPLOTS OF THE F1 AND MCC SCORES FOR 30 RUNS...... 95 FIGURE 74: PERFORMANCE OF DIFFERENT FIXED SEGMENT LENGTHS (BLUE) VS PERFORMANCE OF USING LS-USS (ORANGE). THE MCC-SCORES ARE THE MEAN OVER 30 RUNS. 
THE LIGHT BLUE SHOWS THE IQR (INTER QUARTILE RANGE) OF THE FIXED SEGMENTS, WHILE THE LIGHT ORANGE SHOWS THE IQR OF THE LS-USS SEGMENTATION...... 96 FIGURE 75: SEGMENTATION DONE ON THE SMART CANE DATASET...... 99 FIGURE 76: SEGMENTATION DONE ON THE FETAL2013 DATASET. THE TOP PLOT SHOWS CHANNEL0. THE MIDDLE PLOT SHOWS THE SEGMENTATION RESULT WHEN USING STANDARD FLUSS. THE BOTTOM PLOT SHOWS THE SEGMENTATION RESULT WHEN USING FLUSS-WELCH. THE CACS ARE SIMILAR, BUT WHEN USING THE FLUSS-WELCH BOTH CHANGEPOINTS ENDS UP IN THE SAME “VALLEY” ...... 100


FIGURE 77: SEGMENTATION DONE ON THE WALKJOGRUN DATASET. THE TOP PLOT SHOWS CHANNEL0. THE MIDDLE PLOT SHOWS THE SEGMENTATION RESULT WHEN USING STANDARD FLUSS. THE BOTTOM PLOT SHOWS THE SEGMENTATION RESULT WHEN USING FLUSS-WELCH. THE CACS ARE SIMILAR, BUT WHEN USING THE FLUSS-WELCH THE BOTH CHANGEPOINTS ENDS UP IN THE SAME “VALLEY” ...... 100 FIGURE 78: A PROPOSED MODULAR APPROACH TO UNSUPERVISED TIME SERIES SEGMENTATION...... 102 FIGURE 79: SENSOR CO-OCCURRENCES.APP: APPLICATION USAGE, SOCIAL: SOCIAL MEDIA, EEG: ELECTROENCEPHALOGRAM, SCREEN: SMARTPHONE SCREEN STATE, CONTACTS: PHONE CONTACT INFORMATION, CALLS: PHONE CALL LOGS, LOCATION: GPS AND CELL TOWERS,SPO2: BLOOD OXYGEN SATURATION LEVEL...... 103


List of Tables

Table 1: Overview of the implemented algorithms. The underlined algorithms are new methods developed during this thesis.
Table 2: Overview of the convolutional model from input to output.
Table 3: Overview of the different datasets and corresponding abbreviations and features. The original training data in the datasets Expressive Motion With Dancers and Long-Term 3DC Dataset is used to construct artificial datasets.
Table 4: The first 10 minutes of data for one participant in the Depresjon dataset.
Table 5: Descriptions from a presentation on the article website [2].
Table 6: Overview of the online algorithms used and their hyperparameters. NW = subsequence size, TC = temporal constraint, step-size = distance in time between consecutive subsequences (decides how much the subsequences overlap).
Table 7: Overview of the online algorithms used and the hyperparameters. NW = subsequence size, TC = temporal constraint, step-size = distance in time between consecutive subsequences (decides how much the subsequences overlap).
Table 8: Model configurations found from doing hyperparameter search on the UCI dataset. The leftmost column shows the mean regime score from segmenting the test set. NW-range = [50-150], step-size-range = [25-250].
Table 9: Model configurations found from doing hyperparameter search on the UCI dataset. The leftmost column shows the mean prediction loss MAE from segmenting the test set. NW-range = [50-150], step-size-range = [25-250].
Table 10: Model configurations found from doing hyperparameter search on the Dance Artificial dataset. The leftmost column shows the mean regime score from segmenting the test set. NW-range = [50-400], step-size-range = [25-250].
Table 11: Model configurations found from doing hyperparameter search on the Dance Artificial dataset. The leftmost column shows the mean prediction loss MAE score from segmenting the test set. NW-range = [50-400], step-size-range = [25-250].
Table 12: Model configurations found from doing hyperparameter search on the Dance dataset. The leftmost column shows the mean regime score from segmenting the test set. NW-range = [50-400], step-size-range = [25-250].
Table 13: Model configurations found from doing hyperparameter search on the Dance dataset. The leftmost column shows the mean prediction loss MAE score from segmenting the test set. NW-range = [50-400], step-size-range = [25-250].
Table 14: Model configurations found from doing hyperparameter search on the EMG Artificial dataset. The leftmost column shows the mean regime score from segmenting the test set. NW-range = [50-500], step-size-range = [25-500].
Table 15: Model configurations found from doing hyperparameter search on the EMG Artificial dataset. The leftmost column shows the mean prediction loss MAE from segmenting the test set. NW-range = [50-500], step-size-range = [25-500].
Table 16: Model configurations found from doing hyperparameter search on the EMG dataset. The leftmost column shows the mean regime score from segmenting the test set. NW-range = [50-500], step-size-range = [25-500].
Table 17: Model configurations found from doing hyperparameter search on the EMG dataset. The leftmost column shows the mean prediction loss MAE from segmenting the test set. NW-range = [50-500], step-size-range = [25-500].
Table 18: Table showing the mean test score, the Friedman rank, and the adjusted p-value on each of the datasets for the offline algorithms. The highest-ranked algorithm (used as the control model) on each dataset is shown in bold. H0=1 means that the null hypothesis is accepted. H0=0 means that the null hypothesis is rejected.
Table 19: Table showing the mean test score, the Friedman rank, and the adjusted p-value on each of the datasets for the online algorithms. The highest-ranked algorithm (used as the control model) on each dataset is shown in bold. H0=1 means that the null hypothesis is accepted. H0=0 means that the null hypothesis is rejected.
Table 20: Classification results. The metrics shown are the mean scores over 30 runs.
Table 21: Scores on the UCR dataset using FLUSS with recommended window size vs. automatic window size. The best scoring algorithm on each sub-dataset is shown in bold.


Preface

It has been a long year due to the ongoing pandemic. Luckily, this thesis has kept me busy! The work undertaken has allowed me to combine concepts from various fields, which has really made me appreciate the diverse education acquired through five years at IFI. It has also helped me learn to manage high complexity, and I will never again underestimate the amount of time that goes into documenting and writing when doing scientific work. I want to give a huge thanks to Ulysse for being interested and supportive throughout the whole thesis. I have learned a lot from you, and I am forever grateful for all the help. Thanks to Jim for having an eye for detail and for reminding me to concentrate on where I feel I get "fish" and to "return home" before it gets too dark and difficult to get back to the dock. I would also like to thank my friends, both for motivating me to write and for distracting me from writing. Lastly, a well-deserved thanks to my family for showing compassion and helping me out during stressful periods.


1 Introduction

Humans use the cognitive functions of attention and memory to make constrained representations of the world that are beneficial for attaining relevant goals [1]. An important skill when driving a car is to differentiate between the road and the roadside. In contrast, identifying every nook and cranny that the road is comprised of would be useless. Instead, having a constrained encoding of the road based on high-level features such as texture, color, and contrast is much more helpful. A sufficient change in the feature encoding would indicate that the road lane has ended and that the roadside has started. These transitions between objects in visual data are often referred to as edges. If one has detected all the edges between objects in an image, extracting the objects is trivial. In time-series data mining, a similar concept to edges, called changepoints, is utilized. A changepoint represents a transition between different states in the process that generates the time series data. The problem of finding these transitions is referred to as change point detection (CPD).

Figure 1: Example of a changepoint occurring at timestep 25,000. From a dataset in the UCR Time Series Semantic Segmentation Archive [2] where a subject was lying on a tilt table with foot support. At timestep 25,000 the tilt table was rapidly rotated to the stand-up position. The figure shows the volunteer's ABP (arterial blood pressure).

1.1 Applications of Change Point Detection

CPD algorithms are becoming increasingly useful due to the rise of the Internet of Things (IoT) and the development of more compact and less power-hungry sensors, which have led to an increasing amount of long time series data from various modalities. A device category that is becoming more ubiquitous is wearable devices (wearables) attached to the user's body. Examples of such devices are smartphones, smartwatches, fitness trackers, and sensors implemented in textiles. The sensors in these wearables can include accelerometers, gyroscopes, heart rate sensors, and galvanic skin response monitors. The data extracted from these sensors can be employed to derive various helpful metrics such as calories burned, heart rate, blood pressure, physical activity, sleep patterns, and even mental states. The time series data gathered from these devices is often challenging and time-consuming to label. Say one wants to continuously monitor heart rate data and accelerometer data (recorded in three spatial directions) sampled at 50 Hz from an activity watch worn by several participants. In this case, each participant would generate 720,000 data points every hour (four channels at 50 samples per second for 3,600 seconds). Having people manually supervise or label these data streams is, for most real-world applications, unfeasible.

To make use of data at this scale, having unsupervised CPD algorithms that can detect when the data changes becomes highly beneficial. Often, the most useful information lies in the areas where there is a change in the time series data, while long monotonous stretches contain little information. In the example above, one could use CPD algorithms to automatically find anomalies in a person's heart rate based on the activity level and heart rate. In [3], a CPD algorithm for heart rate trend monitoring was developed that outperformed change detection done by clinicians in real time. Another use-case of CPD could be fault detection in factory machines, as these machines often perform repetitious tasks. By, for example, monitoring their power consumption, one can use a change point detection algorithm to automatically detect anomalies that could indicate a malfunctioning machine. Another application of unsupervised CPD is time series segmentation, which is the main focus of this thesis. If all the changepoints in the time series that separate the different segments are detected, all the segments that the signal is comprised of are also found. When dealing with a large amount of continuously recorded data, it is often beneficial to separate the data into homogeneous segments before further analysis or classification. It would be impossible to classify activities based on activity data without having information about when the different activities are performed. By first segmenting the signal, one could then train a classifier to detect the class of each segment based on the data in the segment. There is plenty of previous work on human activity analysis in the context of CPD [4]. Another similar use-case would be to segment out uninteresting data which contains little useful information for a specific application. One example of this could be filtering out data captured from an activity watch when it is not being worn. Some time series data is weakly labeled in that it includes imprecise or inexact labels. An example of this could be recording EMG data from a person instructed to hold five hand gestures for 15 seconds each in one continuous recording session. Here, a new segment in the data will occur approximately every 15 seconds, but one would not know the exact location of the actual gesture change since humans cannot robotically perform tasks at the exact time specified. To get a more precise segmentation, one could use CPD algorithms on a local area around every 15-second mark to pinpoint the exact time of a gesture change. This thesis is associated with INTROMAT (INtroducing personalized TReatment Of Mental health problems using Adaptive Technology) [5], a research project funded by the Research Council of Norway. The project's vision is to investigate how the use of ICT (Information and Communications Technology) can improve public mental health. One of the INTROMAT project's goals is to develop an unobtrusive Mental Health Monitoring System (MHMS) using sensors, surveys, and machine learning. The following section will provide information about mental health and how CPD algorithms can be utilized in the context of MHMS.

1.2 Mental Health Monitoring Systems

Mental health is a major problem worldwide. In 2010, mental health problems were the leading cause of years lived with disability (YLD) worldwide, and depressive and anxiety disorders were among the most frequent disorders [6], [7].
Living with a mental disorder can be challenging in many ways, as it reduces the ability to carry out everyday tasks, which often leads to social, economic, and emotional difficulties. It also affects the friends and family of the patient, since it can be hard to see a person one cares for dealing with a mental illness. Mental health problems also affect society through lost tax revenue due to work impairment and through treatment costs. The World Economic Forum estimated that the cumulative global effect of mental disorders in terms of lost economic output could amount to US $16 trillion over the next 20 years, equivalent to 25% of global GDP in 2010 [8].

Much research has been done on mental health monitoring systems (MHMS) [9], but much more is needed to get to a fully autonomous patient monitoring system. Wearables are, as mentioned, great for continuous data collection, and recent studies show that they hold great potential for unobtrusive gathering of biological signals from people suffering from mental disorders [9]. These devices have the advantage of being privacy-friendly, as it is hard to identify a specific person from the data gathered using them. Another benefit of wearables is that they can be worn all day and are not bound to specific locations or settings. Since these devices are becoming increasingly ubiquitous, they are also less expensive to deploy due to decreased manufacturing costs. In addition, many people already use them for other purposes such as physical health and fitness, making it less stigmatizing to use them for MHMS. According to Statista Consumer Surveys, 30 percent of U.S. consumers owned a fitness band, and 44 percent of U.S. consumers used their sport and fitness gadgets daily [10]. Smartphones, which are now very common, have many of the same sensors used in fitness trackers and can therefore also be a useful tool for monitoring the mental state of a person, in addition to being a potential tool for self-reporting and interventions. As MHMS utilizing wearables can generate a large amount of data, it is crucial to have algorithms that can extract useful information without supervision. One application of CPD in the context of mental health could be to find changepoints in the data that suggest a change in behavior or mental state. Another possible application is to use unsupervised CPD to find homogeneous segments in the data before extracting statistical features from each segment, which are then fed to a classifier trained to distinguish between different activities. As an example, there is evidence that depression leads to reduced daytime motor activity and increased nighttime activity compared to healthy individuals [11], so being able to distinguish between wakefulness and sleep could be beneficial for detecting depressed patients. In this thesis, a CPD algorithm is used to separate activity data into natural segments, which are then used to classify non-depressed and depressed participants. More details on this follow in the next section, which goes through the main contributions presented in the thesis.

1.3 Summary of the Work

This thesis aims to develop unsupervised semantic segmentation algorithms that are hyperparameter-light, domain agnostic, and work well with multidimensional data. In addition to this, it is preferable if the algorithms are fast and can work on online streaming data. Being hyperparameter-light and domain agnostic means the algorithm can be deployed on various time series data-mining problems without needing extensive tuning to perform well.


Real-world data is often multidimensional, meaning that data is sampled from various sources and sensors that often contain multiple channels. It is crucial that the algorithms can utilize this data effectively and ensure that the most helpful information is not drowned out by data sources containing less information. Time series data is often recorded over a long time, making algorithms that are fast and scale well to big data crucial. Algorithms with a high time complexity that work well on short time series would quickly become useless for problems such as continuous monitoring of people suffering from mental health issues, which entails data captured over long time intervals. Many real-world applications depend on fast changepoint detection to execute time-sensitive actions, which means that algorithms that cannot handle online streaming data will struggle, as they would require a complete rerun every time a new data point is added. One state-of-the-art algorithm that meets many of these requirements is Fast Low-cost Unipotent Semantic Segmentation (FLUSS) [12]. The fundamental idea behind FLUSS is that subsequences (small snippets of the time series extracted around each time step) within the same segment are more similar to each other than to subsequences belonging to other segments. Thus, parts of the time series containing many similar subsequences are likely to form a segment. The main contributions of this thesis are inspired by the FLUSS algorithm and try to improve upon the basic concepts presented in that paper. FLUSS handles multidimensional data by calculating the likelihood of a change point at each time step for each channel independently before taking the average likelihood over all the channels (a minimal sketch of this combination step is given below). When there are many channels, some might not contain information that is helpful for segmenting out activities, or they might be heavily correlated. When taking the average over all the channels, these "not so useful" channels can drown out the useful channels, as all channels are weighted equally in the average. The authors acknowledge this problem and recommend that, when dealing with high-dimensional data, one should search for or manually select the most useful subset of channels and remove the rest. As time series data is often sampled from multiple sources and sensors, it would be helpful to have an algorithm that can automatically extract the most useful information from high-dimensional time series data before doing segmentation. This is the primary motivation behind one of the main contributions of this thesis: the Latent Space Unsupervised Semantic Segmentation (LS-USS) model. LS-USS is based on the same basic concepts as FLUSS, but instead of using the similarity between subsequences, it finds the similarity between lower-dimensional encodings of the multidimensional subsequences. These encodings are found by feeding the subsequences through an autoencoder trained in an unsupervised fashion. The idea is that the dimensionality reduction learned by the autoencoder will remove redundant and correlated information between channels, which in turn will improve the segmentation. Through extensive testing conducted on both artificial and real-world datasets from various domains and sensors, it was found that LS-USS delivers segmentation scores on par with or better than other state-of-the-art algorithms.
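As an illustration of how FLUSS combines channels, the sketch below averages per-channel corrected arc curves (CACs) with equal weights. This is a minimal sketch only: the variable names are illustrative, and the per-channel curves are assumed to have been computed beforehand (here they are replaced by random placeholders).

    import numpy as np

    rng = np.random.default_rng(0)
    # Stand-in for an array of shape (n_channels, n_timesteps) holding one
    # corrected arc curve per channel, each computed independently by FLUSS.
    cacs = rng.random((6, 10_000))
    combined_cac = cacs.mean(axis=0)           # equal-weight average over channels
    predicted_cp = int(combined_cac.argmin())  # lowest point of the combined curve

Because every channel gets the same weight, an uninformative or heavily correlated channel can pull the combined curve toward its own shape, which is exactly the problem LS-USS tries to address by encoding all channels jointly.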


One challenge in applying unsupervised algorithms is the selection of hyperparameters. FLUSS only requires us to define the subsequence length, which, depending on the context, is not always trivial to determine. While it has been shown that FLUSS is not overly sensitive to its value, this remains an important consideration for practical applications. It was found that FLUSS seems to perform best when the subsequence length is set to approximately one period of the signal. One of the contributions of this thesis is to augment FLUSS with Welch's method to automatically determine an appropriate subsequence length, thus making FLUSS completely hyperparameter-free. Welch's method is a technique for estimating the frequency content of a signal. From the frequency content, one can find the peak frequency, which makes finding the main periodicity of the signal trivial. This method will in this thesis be referred to as FLUSS-Welch. The algorithm is tested on the UCR Time Series Semantic Segmentation Archive [2], which includes 21 datasets from many domains. The results show that FLUSS-Welch achieves results similar to specifying the subsequence length manually. The algorithms developed in this thesis should be relatively domain agnostic. Nevertheless, as this thesis is associated with INTROMAT, it would be beneficial to test the algorithms on more datasets from mental health research, but such datasets are often difficult to get access to due to the sensitive nature of personal data from people suffering from mental health issues. To showcase a practical application of the algorithms developed in the context of MHMS, an open dataset from the INTROMAT project called Depresjon [13], containing data from non-depressed and depressed participants, is used. Here the goal is to see if using segments found by unsupervised segmentation can improve depression detection. The results show that using the LS-USS segmentation algorithm before classification clearly outperforms the previous best classification results on the dataset.

1.4 Chapters

1.4.1 Background

The background chapter starts with more details on CPD and provides an overview of the previous research done on classical CPD (2.1). Section 2.2 goes through a concept called matrix profiles, an essential building block for both FLUSS and LS-USS. In section 2.3, FLUSS is covered in more detail. Section 2.4 is about representation learning and relevant research regarding representation learning in the context of CPD. Representation learning, or feature learning, concerns techniques for automatically extracting feature encodings/representations from raw data. One common example of a representation learning method is the autoencoder, which is highly relevant for the development of LS-USS. The evaluation metrics used are essential for a fair comparison between different CPD algorithms. Thus, section 2.5 gives an overview of various evaluation metrics with a discussion of their strengths and weaknesses when used in CPD problems. The last section of the background chapter, section 2.6, gives a brief overview of spectral density estimation. This section is relevant for FLUSS-Welch, as it uses the spectral density estimator Welch's method to find an appropriate subsequence length for FLUSS.
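A minimal sketch of the subsequence-length heuristic behind FLUSS-Welch, using SciPy's implementation of Welch's method. The signal, sampling rate, and nperseg value below are illustrative assumptions, not values taken from the thesis.

    import numpy as np
    from scipy.signal import welch

    def estimate_subsequence_length(x, fs):
        # Estimate the power spectral density with Welch's method.
        freqs, psd = welch(x, fs=fs, nperseg=min(len(x), 1024))
        # Pick the dominant (peak) frequency, skipping the DC bin at 0 Hz.
        peak_freq = freqs[np.argmax(psd[1:]) + 1]
        # Return one period of the dominant frequency, in samples.
        return int(round(fs / peak_freq))

    # Example: a noisy 1.25 Hz oscillation sampled at 50 Hz gives roughly 40 samples per period.
    t = np.arange(0, 60, 1 / 50)
    x = np.sin(2 * np.pi * 1.25 * t) + 0.1 * np.random.randn(len(t))
    print(estimate_subsequence_length(x, fs=50))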


1.4.2 Methods

The methods chapter details the new methods developed for this thesis and the implementations of some state-of-the-art CPD algorithms presented in the background chapter. The chapter is divided into three main sections: Section 3.1 describes the scalers used to scale the data before doing segmentation. Section 3.2 details the CPD algorithms implemented in this thesis. Section 3.3 covers the changepoint extraction algorithms implemented. These algorithms are used to decide what constitutes a changepoint based on the output from the CPD algorithms.

1.4.3 Datasets

The datasets chapter includes descriptions of the five datasets used to test the methods implemented in this thesis. The three datasets presented in sections 4.1 to 4.3 contain long multidimensional data, which will be used to compare LS-USS against other state-of-the-art CPD algorithms. The Depresjon dataset, which will be used to showcase a practical application of LS-USS in the context of MHMS, is detailed in section 4.4.

Section 4.5 presents the UCR Time Series Semantic Segmentation Archive [2], which includes 21 datasets from many different domains and is ideal for testing FLUSS-Welch, as all the datasets come with human-specified subsequence sizes.

1.4.4 Experiments

The experiments chapter shows the results of running the segmentation algorithms on the different datasets. Section 5.1 details the experiments conducted on the datasets containing long multidimensional data and includes an in-depth comparison between the performance of the different segmentation algorithms. Section 5.2 presents the results from using LS-USS on the Depresjon dataset as a preprocessing step before classifying depressed vs. non-depressed participants. In section 5.3, the performance of FLUSS-Welch is tested against using manually specified subsequence sizes on the UCR Time Series Semantic Segmentation Archive.


2 Background

This chapter will first give an overview of time series changepoint detection (CPD). Then, a concept called the matrix profile is presented, which provides information about the similarity of the subsequences in a time series. Next, a state-of-the-art segmentation algorithm called FLUSS, built upon the matrix profile, will be introduced. The section after that will focus on representation learning and present some examples from previous research of how it can be utilized in time series segmentation. Then, there will be an overview of various evaluation metrics, discussing their strengths and weaknesses when used in CPD problems. The last section centers on spectral density estimation, which is used to determine the frequency content of a time series.

2.1 Time series change point detection

A changepoint (CP) represents a transition between different states in a process that generates time series data [4]. Changepoint detection can be defined as the problem of hypothesis testing between the null hypothesis “no change occurs” and the alternative hypothesis “a change occurs” for each timestep in the time series. By finding these changepoints, one has indirectly segmented the time series, as the segments can be defined as the areas between the found changepoints.

2.1.1 Wanted Features

Besides raw performance in localizing changepoints, there are other aspects one needs to consider when comparing different CPD algorithms. The paper “A Survey of Methods for Time Series Change Point Detection” [4] introduces and describes some of the more central challenges for CPD algorithms. One important feature to consider is whether the algorithm can be used for online data streams. Offline algorithms need the whole time series to do changepoint detection and cannot be updated with new data points without rerunning the algorithm on the whole dataset, which makes them unusable for streaming data. Online algorithms can run concurrently with the process they monitor. There will be a slight delay because of processing time, but it is usually sufficient that a data point is processed before the next one arrives. Different algorithms need varying amounts of data before detecting that a changepoint has occurred. To differentiate between them, the authors also define a third term, ɛ-real-time algorithms, which need at least ɛ data points to detect a changepoint. A completely online algorithm would then be 1-real-time, since it only needs one data point to detect a change. There will often be a tradeoff between how real-time an algorithm is and its change point detection performance. The reason for this tradeoff is that offline algorithms can make decisions based on data from the whole time series, while online algorithms have to base the decision of whether there is a changepoint at the current timestep solely on previous observations. Another critical factor is the scalability of the algorithm. Nowadays, one has access to an increasing amount of data, often with high dimensionality, which makes it important that an algorithm is computationally efficient and scales well to large data sizes.

An algorithm whose space requirements and computation time grow exponentially with the length of the time series would be impractical when scaling to big data. Algorithms that can be interrupted and return the current best solution would be highly beneficial to meet the needs of applications with various time requirements. These algorithms are commonly referred to as anytime algorithms. In the survey paper [4] (from 2017), the authors did not find any interruptible CPD algorithms; thus, they considered anytime algorithms an avenue for future research. However, for matrix profiles (2.2), an algorithm called SCRIMP++ [14] has since been developed that can calculate approximate matrix profiles to allow for use at “interactive speeds”. SCRIMP++ was, in the original paper, not used for changepoint detection but for motif discovery, which is another subject that will be touched on later in the thesis (2.1.3.6). It is essential to consider what constraints and requirements the data impose on the choice of algorithm. Some methods rely on statistical assumptions such as stationarity or i.i.d. data. Some models cannot handle missing or multidimensional data. Others require extra information, such as the number of changepoints or other knowledge about the underlying process that produces the time series.

2.1.2 Supervised CPD methods

Supervised learning uses labeled training data that consists of input-output pairs. The models try to learn a mapping from input to output that generalizes to unseen data. There are two ways to apply this to CPD. One possible approach is to use binary classification, where the labels are “changepoint” and “no changepoint”, and train the model to recognize changes between states in the data. Alternatively, if the states are labeled into different classes, one can classify each data point in the time series based on class and extract the changepoint locations based on changes in the class probabilities. Both techniques usually use the subsequences found by sliding a window over the time series as inputs to the model, where each subsequence is assigned a label/class (a minimal sketch of this subsequence extraction is given below). Most of the classifiers used in other machine learning tasks can also be used here. Some common classifiers are naïve Bayes, Support Vector Machines (SVMs), Gaussian Mixture Models (GMMs), Decision Trees, Hidden Markov Models, and Artificial Neural Networks (ANNs).

2.1.3 Unsupervised CPD methods

Unsupervised learning tries to find patterns in the input data without knowing the true label of the data. Labeling data can often be a time-consuming task. As time series data is often long and hard to label visually based on just the time series itself, finding changepoints in an unsupervised manner is of great interest. These algorithms try to determine where a change point occurs based on the statistics of the data.
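Both the supervised approaches just described and most of the unsupervised methods that follow operate on the set of subsequences obtained by sliding a window over the time series. The sketch below shows that extraction step; the series, window length, and labels are illustrative placeholders rather than values from the thesis.

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    rng = np.random.default_rng(0)
    x = rng.standard_normal(10_000)                 # placeholder univariate time series
    window = 100                                    # illustrative subsequence length

    # One row per time step: subsequences[i] == x[i : i + window]
    subsequences = sliding_window_view(x, window)   # shape (9901, 100)

    # For a supervised setup, each subsequence would get a label, e.g. 1 if the
    # window straddles an annotated changepoint and 0 otherwise (hypothetical labels).
    labels = np.zeros(len(subsequences), dtype=int)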

The survey paper [4] contains an overview of the main categories of unsupervised CPD algorithms, in addition to implementing some of the algorithms in each category (depicted in Figure 2). For an in-depth comparison of the selected algorithms from each category, see the original survey paper. The following sections give a brief synopsis of each category.


Figure 2: Unsupervised methods for change point detection. The paper [4] provides an overview of all the methods depicted in the diagram.

2.1.3.1 Likelihood Ratio Methods

One common way to detect changepoints is using likelihood ratio methods. These methods compare the probability distributions of data before and after a potential changepoint. If the probability distributions are sufficiently different, it is concluded that the data point is a changepoint. Kullback-Leibler (KL) divergence is a common choice for measuring the dissimilarity between the two distributions.

2.1.3.2 Subspace Modeling

Another way to do unsupervised CPD is by using subspace modeling. Here one tries to find a mathematical model of the system states based on the finite amount of available data. The intuition is that a change in system state indicates a change point. Subspace modeling is closely related to system identification, which is important in control theory [15].

2.1.3.3 Probabilistic methods

Probabilistic methods estimate the probability distribution of there being a changepoint at each time step. Some methods use a Bayesian approach to model the run-length distribution at each time step based on the observed data since the previous changepoint. The run length represents the elapsed time since the last changepoint. When there is a spike in the probability of a 0-length run, a changepoint is likely to have occurred.
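As a concrete illustration of the run-length idea, the standard Bayesian online changepoint detection formulation (given here as a worked example, not taken from the survey) updates the posterior over the run length r_t recursively:

\[
P(r_t \mid x_{1:t}) \;\propto\; \sum_{r_{t-1}} P(r_t \mid r_{t-1}) \, P(x_t \mid r_{t-1}, x^{(r)}) \, P(r_{t-1} \mid x_{1:t-1})
\]

Here, P(r_t | r_{t-1}) encodes the prior probability of a change (the hazard), x^{(r)} denotes the observations since the last changepoint, and a spike in the probability of r_t = 0 signals a likely changepoint.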


2.1.3.4 Kernel-based Methods

Kernel-based methods are most commonly used in supervised techniques such as SVMs but can also be used in unsupervised methods. The observations are first mapped onto a higher-dimensional feature space using kernel functions, before the Fisher discriminant ratio is used as a measure of the similarity between consecutive subsequences to determine if a new changepoint exists between them. The performance of kernel-based methods is heavily dependent on the choice of kernel functions.

2.1.3.5 Graph-based methods

Another, more recent approach to unsupervised CPD is graph-based methods. Here, observations before and after a possible changepoint are divided into two parts. The number of edges in the graph that connect observations from the two parts decides whether a changepoint has occurred, based on some threshold. Few edges increase the likelihood of a changepoint. The graph is usually made using the time series observations as nodes, with edges connecting them based on a chosen distance measure. Section 2.3 will take a look at FLUSS [16], which is not described by its authors as a graph method but uses the same concepts for doing time series segmentation.

2.1.3.6 Clustering methods

Clustering methods divide the data into a known or unknown number of clusters. If two data points close in time belong to different clusters, there is likely a changepoint between them. The work done on clustering of time series data can broadly be separated into two categories: whole clustering and subsequence clustering. Whole clustering is similar to regular clustering of discrete objects, which means it can only be used to cluster individual time series based on how similar they are. This technique is thus not applicable to time series segmentation, as the goal here is to segment one continuous time series. Subsequence clustering tries to cluster the individual subsequences extracted by running a sliding window over the time series. Intuitively, this way of clustering seems useful for time series segmentation, as one could segment the time series based on which cluster center each subsequence belongs to. How to find these clusters, however, is a challenging problem. Despite subsequence clustering being used as a subroutine in many other algorithms, the paper “Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research” [17] found that subsequence clustering produces essentially random cluster centers. The cluster centers are just sinusoids with random phases that average to the mean of the time series, for any dataset used.

Figure 3: Plot depicting typical cluster centers generated by STS-clustering. From [17]


This was true for both k-means and hierarchical clustering, and for clustering based on a random sampling of a subset of subsequences. To get an intuition for why this is the case, the NoGunGun dataset (depicted in Figure 4) from the UCR time series archive will be used.

Figure 4: Plot showing the NoGunGun dataset from the UCR time series archive.

As the cluster centers will be the average over all the subsequences in each cluster, this experiment tries to showcase the effect of averaging over multiple time series subsequences extracted with a sliding window. Here, subsequences are extracted using a sliding window before taking the average over n subsequence samples. The results presented in Figure 5 clearly show that, given enough samples, the averages converge to a straight line equal to the mean of the whole dataset. The out-of-phase sinusoid cluster centers mentioned above conform to these results, as their weighted average sums to a straight line. This is due to an inherent problem caused by using a sliding window to extract the subsequences. A detailed explanation of this phenomenon is provided in [17].
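The averaging experiment can be reproduced with a few lines of code. This is a sketch under simple assumptions: the synthetic sine below stands in for the (zero-mean-scaled) NoGunGun series, and the window length of 150 is an illustrative choice.

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    def mean_of_random_subsequences(ts, window, n_samples, seed=0):
        """Average n_samples randomly chosen sliding-window subsequences of ts."""
        rng = np.random.default_rng(seed)
        subs = sliding_window_view(ts - ts.mean(), window)        # zero-mean, as in Figure 5
        idx = rng.choice(len(subs), size=n_samples, replace=False)
        return subs[idx].mean(axis=0)

    # Stand-in periodic series; the experiment in the text uses NoGunGun instead.
    ts = np.sin(np.linspace(0, 60 * np.pi, 9000))
    few = mean_of_random_subsequences(ts, 150, 10)      # still jagged
    many = mean_of_random_subsequences(ts, 150, 5000)   # close to a flat line at zero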

Figure 5: Plot showing the results of averaging over a selection of subsequences extracted from the NoGunGun dataset. The data is scaled to zero mean for the whole dataset.

There are no easy fixes that make subsequence clustering work, but the paper [17] introduces a novel method that makes it possible to do meaningful clustering of time series data. Instead of clustering every subsequence in the time series, this method clusters something called time series motifs. In the context of data mining, motifs are recurring sequences in the time series data (see the example in Figure 6). These motifs can be seen as fingerprints for time series data, as all non-random time series data produced from the same process should contain some recurring patterns.


Figure 6: Example from [18] of a motif (marked in bold) that occurs four times. From [17].

The clustering algorithm presented extracts the top n motifs from the time series and does standard k-means clustering on these motifs. Notice that to do clustering in this way, the number of motifs extracted, n, has to be set much larger than the number of clusters, k. The results were promising, as the cluster centers looked like the different segments in the time series data they tested on. It is not surprising that this works, since the method resembles whole clustering, in that it clusters short individual time series instead of using all the subsequences extracted by a sliding window. This means that the cluster centers are not averages over all the data but averages over motifs (similar patterns in the data). One thing that has not been explained yet is how to discover these motifs. There are different ways of mining for motifs in time series data. One example of motif discovery comes from the CPD algorithm presented in the paper “Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data” [19], which extracts and clusters motifs by finding the motifs that minimize the number of bits required to represent the time series, using the Minimum Description Length (MDL) principle. A more recent approach to motif discovery, called the matrix profile, is detailed in the next section.

2.2 Matrix profiles

One of the core concepts in this thesis is the matrix profile [20], a data structure for time series that facilitates change point detection and motif discovery. To calculate the matrix profile, one first has to find the all-subsequence set A of the time series T by utilizing a sliding window of length m to extract all the possible subsequences of T. After this, the distance matrix is found by calculating the z-normalized Euclidean distance between every subsequence in A and every other subsequence in A. Figure 7 shows what the resulting distance matrix looks like for the time series NoGunGun from the UCR time series archive. Each row in the distance matrix, referred to as a distance profile D, depicts the distance from the subsequence at the current row index to every other subsequence in the time series. When the distance profile is calculated for every index, one naturally ends up with the distance matrix.
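Written out, one common way to express the z-normalized Euclidean distance between two length-m subsequences x and y (each standardized by its own mean and standard deviation) is:

\[
d(x, y) = \sqrt{\sum_{i=1}^{m} \left( \frac{x_i - \mu_x}{\sigma_x} - \frac{y_i - \mu_y}{\sigma_y} \right)^2 }
\]

The z-normalization makes the comparison insensitive to the offset and amplitude of each subsequence, so only the shapes are compared.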



Figure 7: The distance matrix is constructed from all the distance profiles and shows how similar each subsequence in the time series is to every other subsequence in the same time series. Taking the minimum distance of each row yields the matrix profile. The original time series is shown in yellow and the matrix profile in green. The exclusion zone is used to avoid trivial matches.

After the distance matrix is calculated, one can easily extract the matrix profile PA, which is defined as the vector of Euclidean distances between every subsequence in A and its nearest non-trivial neighbor in A itself*. The matrix profile can thus be made by extracting the smallest value from each row in the distance matrix, as this is the distance to the nearest neighboring subsequence. There is, however, one problem with this: the nearest neighbor of a subsequence will always be the subsequence itself, because they are exactly the same. Subsequences extracted close in time will also be nearly identical. To avoid these trivial matches, an exclusion zone around each index is made by setting these areas to infinity. As seen in Figure 7, this exclusion zone lies along the diagonal of the distance matrix. When calculating the matrix profile PA, one can also extract the index of the nearest neighbor for each row in the distance matrix to make the matrix profile index IA, which is defined as the vector containing the indices of the nearest non-trivial neighbor in A for every subsequence in A. It can thus be seen as a time series containing pointers to the nearest neighbor of each subsequence. The matrix profile index IA will be important later, as it is the basis for the segmentation algorithm FLUSS presented in section 2.3.

The matrix profile is a useful tool for motif and discord discovery. Motifs, as mentioned in 2.1.3.6, are reoccurring subsequences in the time series data. Discords can be seen as the opposite of motifs, as they are the subsequences that least resemble the other data in the time series. It is easy to get an intuition about the motifs and discords in the data just by looking at the matrix profile. In the areas with relatively low values, the subsequences in the original time series must have (at least one) relatively similar subsequence elsewhere in the data. These regions are reoccurring patterns and can thus be classified as motifs. In the areas with relatively high values, the subsequence in the original time series must have a unique shape, since it does not match any other subsequence. These areas are discords and can be considered anomalies in the data. Figure 8 shows an example of this. The time series presented consists mostly of repeated patterns (motifs), except for what appears to be an anomaly (discord) roughly midway through. The matrix profile is low in the areas where there seem to be motifs and high in the areas where there seems to be some kind of anomaly.

* One can also calculate the matrix profile between two distinct time series, but this is not relevant for the purpose of this thesis.
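To make the definition concrete, the minimal sketch below extracts the matrix profile and matrix profile index from a precomputed distance matrix; the exclusion-zone half-width of m/2 and the function name are illustrative choices, not values prescribed in [20].

```python
import numpy as np

def matrix_profile_from_distance_matrix(dist_matrix, m):
    """Collapse a full distance matrix into the matrix profile P_A and index I_A."""
    D = dist_matrix.astype(float).copy()
    k = D.shape[0]                       # number of subsequences (n - m + 1)
    excl = max(1, m // 2)                # illustrative exclusion-zone half-width
    for i in range(k):
        lo, hi = max(0, i - excl), min(k, i + excl + 1)
        D[i, lo:hi] = np.inf             # mask trivial matches around the diagonal
    P_A = D.min(axis=1)                  # smallest non-trivial distance per row
    I_A = D.argmin(axis=1)               # index of the nearest non-trivial neighbor
    return P_A, I_A
```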

Figure 8: Plot depicting a time series and its corresponding matrix profile. The time series presented is an excerpt from the dataset NoGunGun from the UCR time series archive.

Due to these qualities, the matrix profile can be applied to a multitude of time series change point detection problems. It could, for example, be used directly for fault or anomaly detection in various domains by looking for discords in the data. Some examples are monitoring for abrupt changes in the financial markets, finding anomalies in factory machines, or finding irregularities in the activity patterns of people suffering from a mental health disorder. The motifs extracted can also be used for segmentation based on clustering, as seen in the previous chapter. In this thesis, the matrix profile is used as a crucial building block for the segmentation algorithms FLUSS and LS-USS. To deploy change point detection algorithms on real-world problems, it is important that the algorithms scale well to large amounts of data. Calculating the distance between every subsequence in a time series and every other subsequence in the same time series (as done in Figure 7) quickly results in high computation time. Storing the distance matrix is also problematic, as it requires O(n²) space: calculating the matrix profile this way for a time series with 1 million samples would need around 4 TB of memory. The following sections detail how to calculate the matrix profile more efficiently.

2.2.1 Mueen's algorithm for similarity search

Mueen's algorithm for similarity search (MASS) is a method for finding the distance from one query subsequence Q in T to every other subsequence in T. The distance measure in this algorithm is the z-normalized Euclidean distance. The convolution theorem states that convolution in the time domain is equal to multiplication in the frequency domain. Computing the sliding dot product between a query Q and the time series T via the frequency domain thus yields the dot product between Q and every subsequence in the set A. The time complexity of convolution is O(nm) in the time domain and O(n log n) in the frequency domain, making it possible to drastically reduce the number of calculations by detouring into the frequency domain. Doing this also makes the computation time independent of the subsequence length. The algorithm does precisely this by applying the Fast Fourier Transform to both the query Q and the time series, multiplying the resulting spectra, and finally applying the Inverse Fast Fourier Transform to the product. The result is a fast and accurate way of calculating the dot product between Q and all the subsequences in T.

Algorithm name: Sliding Dot Product
SlidingDotProduct(Q, T)
Inputs: Q – query, T – time series
Outputs: QT – the sliding dot product between Q and T
1. n = Length(T), m = Length(Q)
2. Ta = T appended with n zeros
3. Qr = Reverse(Q)
4. Qra = Qr appended with 2n-m zeros
5. Qraf = FFT(Qra), Taf = FFT(Ta)
6. QT = InverseFFT(ElementwiseMultiplication(Qraf, Taf))
7. return QT

Figure 9: Outline of the SlidingDotProduct [20]
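As a concrete illustration, a minimal NumPy version of the listing above could look as follows; this is a sketch for exposition, not the implementation used in [20].

```python
import numpy as np

def sliding_dot_product(Q, T):
    """Dot product between query Q and every length-m subsequence of T via the FFT."""
    n, m = len(T), len(Q)
    Ta = np.concatenate([T, np.zeros(n)])                 # T appended with n zeros
    Qra = np.concatenate([Q[::-1], np.zeros(2 * n - m)])  # reversed Q appended with 2n-m zeros
    QT = np.fft.irfft(np.fft.rfft(Qra) * np.fft.rfft(Ta), n=2 * n)
    return QT[m - 1:n]   # entries m-1 ... n-1 hold the n-m+1 valid sliding dot products
```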

$$D[i] = \sqrt{2m\left(1 - \frac{QT[i] - m\,\mu_Q\,M_T[i]}{m\,\sigma_Q\,\Sigma_T[i]}\right)}$$

Equation 1: The z-normalized distance.

The formula for calculating the z-normalized Euclidean distance for every index in the distance profile is presented in Equation 1. In addition to the dot product QT, a few other statistics need to be calculated: m is the subsequence length, μ_Q is the mean of Q, M_T[i] is the mean of A[i], σ_Q is the standard deviation of Q, and Σ_T[i] is the standard deviation of A[i]. These statistics can also be calculated efficiently, using a technique from [21]. The total time complexity of MASS is O(n log n). The MASS algorithm is outlined in pseudocode in Figure 10.


Algorithm name: Mueen's algorithm for similarity search (MASS)
MASS(Q, T)
Inputs: Q – query, T – time series
Outputs: D – the distance profile of query Q against T
1. QT = SlidingDotProduct(Q, T)
2. µQ, σQ, MT, ΣT = ComputeMeanStd(Q, T)
3. D = CalculateDistanceProfile(Q, T, QT, µQ, σQ, MT, ΣT) // Equation 1
4. return D

Figure 10: Outline of MASS
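Continuing the sketch above, a simple MASS implementation can combine the sliding dot product with rolling statistics computed from cumulative sums (in the spirit of the technique from [21]); the helper names are my own, and the small epsilon and absolute value are only there to keep the toy code numerically stable.

```python
def rolling_mean_std(T, m):
    """Mean and standard deviation of every length-m subsequence of T in O(n)."""
    csum = np.cumsum(np.insert(T, 0, 0.0))
    csum2 = np.cumsum(np.insert(T ** 2, 0, 0.0))
    mean = (csum[m:] - csum[:-m]) / m
    var = (csum2[m:] - csum2[:-m]) / m - mean ** 2
    return mean, np.sqrt(np.maximum(var, 1e-12))

def mass(Q, T):
    """Distance profile of query Q against every subsequence of T (Equation 1)."""
    n, m = len(T), len(Q)
    QT = sliding_dot_product(Q, T)
    mu_Q, sigma_Q = Q.mean(), Q.std()
    M_T, Sigma_T = rolling_mean_std(T, m)
    return np.sqrt(np.abs(2 * m * (1 - (QT - m * mu_Q * M_T) / (m * sigma_Q * Sigma_T))))
```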

2.2.2 Scalable Time-series Anytime Matrix Profile

MASS is useful for finding the similarity between one query subsequence Q and every other subsequence in the time series. To obtain the full matrix profile PA and the corresponding matrix profile index IA, one needs to find the nearest neighbor of every subsequence, which is what the Scalable Time-series Anytime Matrix Profile (STAMP) algorithm solves. STAMP queries every subsequence in the all-subsequence set A against all subsequences in A itself. For each query, a distance profile D is produced. The matrix profile is updated with the minimum distances found in D, and the matrix profile index is updated with the corresponding indices. When this is done for all queries, the algorithm returns the full matrix profile. The algorithm is outlined in Figure 11. The time complexity is O(n² log n), since the MASS algorithm (O(n log n)) is run n times, but based on empirical results the algorithm's growth rate is roughly O(n²), which the authors attribute to the FFT being exceptionally well optimized.


Algorithm name: Scalable Time-series Anytime Matrix Profile (STAMP)

STAMP(TA, m)
Inputs: TA – time series, m – subsequence length
Outputs: PA, IA – the matrix profile and the associated matrix profile index

1. PA = infs, IA = zeros
2. for idx = 0 : length(TA) - m
3.     D = MASS(A[idx], TA)
4.     PA, IA = ElementWiseMin(PA, IA, D, idx)
5. end
6. return PA, IA

Figure 11: Outline of the STAMP algorithm

Recalculating the full matrix profile every time a data point is added would take O(n² log n) time, which makes it difficult to use on online data streams. Because of this, the authors of [20] also introduce an algorithm for incrementally updating the matrix profile called STAMP Incremental (STAMPI). In the STAMPI algorithm, a new data point is appended to the original time series before the distance between the resulting new subsequence and the rest of the subsequences in the all-subsequence set A is calculated. The existing matrix profile is updated where the new distances are smaller, and the minimum distance in D, with the corresponding index, is appended to the end of the matrix profile. The time complexity of incrementing the matrix profile is just O(n log n), compared to O(n² log n) for recomputing it with STAMP. The STAMPI algorithm is outlined in Figure 12.
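To tie the pieces together, here is a rough batch STAMP loop built on the mass sketch above; the exclusion-zone half-width is again an illustrative choice, not a value prescribed in [20].

```python
def stamp(T, m):
    """Naive STAMP: query every subsequence against T and keep the element-wise minimum."""
    k = len(T) - m + 1
    P = np.full(k, np.inf)
    I = np.zeros(k, dtype=int)
    excl = max(1, m // 2)                                  # illustrative exclusion-zone half-width
    for idx in range(k):
        D = mass(T[idx:idx + m], T)                        # distance profile of subsequence idx
        D[max(0, idx - excl):idx + excl + 1] = np.inf      # mask trivial self-matches
        better = D < P
        P[better] = D[better]                              # element-wise minimum
        I[better] = idx                                    # nearest neighbor found so far
    return P, I
```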


Algorithm name: Scalable Time-series Anytime Matrix Profile Incremental (STAMPI)

STAMPI(TA, t, PA, IA)
Inputs: TA – original time series, t – new data point, PA, IA – the matrix profile and the associated matrix profile index

Outputs: PA,new,IA,new – The incrementally updated matrix profile and the associated matrix profile index

1. TA,new = [TA, t]
2. S = last subsequence in TA,new, idx = index of S in TA,new
3. D = MASS(S, TA)
4. PA, IA = ElementWiseMin(PA, IA, D, idx)
5. PA,last, IA,last = findMin(D)
6. PA,new = [PA, PA,last], IA,new = [IA, IA,last]
7. return PA,new, IA,new

Figure 12: Outline of the STAMPI algorithm

2.3 Fast Low-cost Unipotent Semantic Segmentation

Fast Low-cost Unipotent Semantic Segmentation (FLUSS) [12] is a segmentation algorithm that builds upon the matrix profile. It utilizes the matrix profile indices IA, given by the STAMP algorithm, to segment time series data. The basic concept is intuitive and straightforward: the matrix profile index can be seen as arcs, or arrows, that point to the most similar subsequence in the time series. In time series data gathered from a person wearing an activity sensor while walking and running, one would expect most of the walking subsequences to point to other walking subsequences, and most running subsequences to point to other running subsequences.


Figure 13: Visualising the arc crossings. From [12]

If one counts, for each index in the time series T, the number of arcs crossing "over" that index, the arc count will be low in the areas where the time series is changing and high in areas with a clear homogeneous pattern. If these arc crossings are counted for every index, the result is the Arc Curve (AC).

The Arc Curve (AC) for a time series T of length n is itself a time series, also of length n, containing non-negative integer values. The ith index in the AC specifies how many nearest-neighbor arcs from the matrix profile index spatially cross over location i [12].


Figure 14: The top plot shows the arterial blood pressure (ABP) of a reclining male. At time 2400 he was rotated to a standing position. The bottom plot shows the corresponding AC. From [12]

Figure 14 shows that the AC is close to zero at the transition between the segments. However, the arc count is also low at the beginning and end of the time series, simply because fewer arcs can cross the indices near the edges. To compensate for this, the authors divide the AC by an inverted parabola with a height equal to half the length of the time series. This parabola is called the idealized arc curve (IAC) and is what the AC would look like for a time series with no structure, where all arcs would just point to random locations. The empirical and theoretical IAC are depicted in the top plot of Figure 15. Dividing the AC by the IAC normalizes the curve between 0 and 1 and corrects the edge effects; the result is the Corrected Arc Curve (CAC). A min function is also included to ensure that the CAC stays between 0 and 1, even in the unlikely event that AC > IAC.

$$CAC = \min\left(\frac{AC}{IAC},\ 1\right)$$

Equation 2: The Corrected Arc Curve
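A compact sketch of how the CAC can be computed from a matrix profile index is shown below; the closed-form parabola used for the IAC (height n/2 at the center) is stated here as an assumption for the unconstrained, bidirectional case rather than taken verbatim from [12].

```python
import numpy as np

def corrected_arc_curve(mp_index):
    """Arc curve, idealized arc curve, and CAC from a matrix profile index."""
    n = len(mp_index)
    diff = np.zeros(n + 1)
    for j, nn in enumerate(mp_index):
        lo, hi = sorted((j, int(nn)))
        diff[lo + 1] += 1                      # the arc starts crossing just after its left endpoint
        diff[hi] -= 1                          # and stops crossing at its right endpoint
    ac = np.cumsum(diff)[:n]                   # number of arcs crossing each index
    i = np.arange(n)
    iac = 2.0 * i * (n - i) / n                # assumed parabola of height n/2
    return np.minimum(ac / np.maximum(iac, 1e-12), 1.0)
```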

Figure 15: The top plot shows the empirical vs. theoretical IAC. The bottom plot shows the corrected arc curve (CAC), and here we can see that the arc curve is close to 0 when there is a change point. From [12]

2.3.1 Region Extraction Algorithm

As known from the previous chapter, a low number of arc crossings indicates a changepoint, but a method for extracting the changepoint locations has not yet been described. There are many ways to do this; a simple and straightforward approach is described in [12]. The proposed extraction algorithm, called the Regime Extracting Algorithm (REA), searches for the k lowest "valley" points in the CAC. To extract these "valley" points, trivial minima have to be avoided: if x is the lowest point on the CAC, x-1 and x+1 are likely the second and third lowest points. To avoid all the changepoints ending up in one "valley" (see Figure 17), an exclusion zone is set around the already detected "valley" points. The algorithm is outlined in Figure 16.


Algorithm name: Regime Extracting Algorithm (REA)
REA(CAC, numRegimes, L)
Inputs: CAC – a Corrected Arc Curve, numRegimes – number of regime changes, L – subsequence length

Outputs: locRegimes – the locations of the regime changes
1. locRegimes = empty array of length numRegimes
2. for i = 1 : numRegimes
3.     locRegimes(i) = indexOf(min(CAC))
4.     Set exclusion zone of 5×L // To prevent matches to "self"
5. end
6. return locRegimes

Figure 16: Outline of the REA algorithm
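A minimal NumPy version of REA could look like the following; it mirrors the listing above, with the 5×L exclusion zone applied symmetrically around each detected valley (an interpretation, since the listing does not spell out the zone's placement).

```python
import numpy as np

def rea(cac, num_regimes, L):
    """Return the num_regimes lowest CAC valleys, masking 5*L points around each pick."""
    cac = cac.astype(float).copy()
    loc_regimes = []
    for _ in range(num_regimes):
        idx = int(np.argmin(cac))
        loc_regimes.append(idx)
        lo, hi = max(0, idx - 5 * L), min(len(cac), idx + 5 * L)
        cac[lo:hi] = np.inf                    # exclusion zone around the detected valley
    return sorted(loc_regimes)
```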

Figure 17: The top plots show channel 0 of subject 14 from the dataset "A Public Domain Dataset for Human Activity Recognition Using Smartphones" [22]. The bottom plots show the CAC produced by the FLUSS algorithm. The two left plots show the predicted changepoints when not using an exclusion zone, while the right plots show the predicted changepoints when using an exclusion zone. When there is no exclusion zone, all changepoints end up in the same "valley".

2.3.2 Fast Low-cost Online Semantic Segmentation

The FLUSS algorithm works well for batch data but struggles with streaming data. Adding a point to the time series and calculating the nearest neighbor of the new subsequence only takes O(n log n) time using the MASS algorithm, but maintaining the CAC also requires ejecting the leftmost point (the earliest data point) from the matrix profile when a new one is added. This operation takes O(n²) time, because the whole matrix profile needs to be updated since every subsequence could point to the ejected point. An online version of FLUSS called Fast Low-cost Online Semantic Segmentation (FLOSS) solves this issue by only using arcs pointing to the right. If all arcs point to the right, no arc can point to the leftmost point, and ejecting it takes O(1) time. Maintaining the one-directional CAC (CAC1D) then takes a total of O(n log n). The IAC will also be skewed to the right, since the rightmost part of the CAC1D has a higher chance of arc crossings, as shown in Figure 18. Using the one-directional CAC enables online streaming functionality but comes at the cost of a slight loss in performance.

Figure 18: The bidirectional IAC vs. the one-directional IAC (IAC1D). From [12]

2.3.3 Locality – Temporal constraint

In some time series datasets, the activities or events are repeated multiple times with other segments in between. In these cases, the CAC is not a good indicator of a regime change, because the arcs have practically the same likelihood of pointing to subsequences in a similar segment as to subsequences in the same segment. The middle plot in Figure 19 shows that the CAC cannot locate some of the visually obvious change points because of the repeated activity "ascending stairs". The proposed solution is to use a temporal constraint (TC) that limits how far in time arcs can point, ensuring that each index in the CAC is only based on a local area around that index. An added benefit is that the computing time is reduced, with the only disadvantage being that an extra parameter must be specified. The authors claim that the temporal constraint can be set to circa the maximum expected segment length. Setting this parameter therefore requires some domain knowledge about the distribution of changepoints in time. However, as shown in the bottom plot in Figure 19, the parameter is not very sensitive, as setting TC to 1000 yields nearly the same results as setting it to 8000 for that particular time series. For both FLUSS and FLOSS, the arc curve (AC) must be normalized using the ideal arc curve (IAC), which in this case is a uniform distribution with height equal to the TC size, since each index in the CAC can at most be crossed by the total number of arcs inside the TC.

Figure 19: The top plot shows accelerometer data of a person from the PAMAP dataset: ascending stairs, descending stairs, and another segment of ascending stairs again, with transitional activities (including some noisy data) in between. The middle plot shows the CAC when not including a temporal constraint, while the bottom plot shows the CACs produced when setting the temporal constraint to 1000, 6000 and 8000. From [12]

2.3.4 Multidimensional time series data

Many datasets contain data from different sensor types and sensor locations. When localizing change points, it is usually beneficial to use multiple sensors, as these often contain more information about the underlying process. A straightforward way to do this is to calculate the CAC for each channel separately and take the mean over the CACs to produce a single combined CAC. As the CACs are normalized between zero and one, this also works for data from sensors with different sampling rates. The authors of [12] presented an example of this using a dataset where a person performed the activities "basketball forward dribble", "walk with wild legs", and "normal walk". The person was wearing two sensors: one on the arm and one on the foot. Figure 20 shows that using data from only the hand or the foot is not enough to differentiate between the three activities, but combining the two CACs, by taking the mean, makes it possible to segment all three activities.

Figure 20: Example of multidimensional time series segmentation of three activities: "basketball forward dribble", "walk with wild legs" and "normal walk". Combining the two CACs, by taking the mean, makes it possible to segment all three activities. From [12]

There will, however, arise some problems when dealing with data containing many channels, as some of them might not contain information that is helpful for segmenting out activities, or they might be heavily correlated. When taking the average over all the channels, these "not so useful" channels can drown out the useful channels, as they are all equally weighted in the average. The authors acknowledge this problem and recommend that, when dealing with high-dimensional data, one should search for or manually find the most useful subset of channels and remove the rest. The following section, about representation learning, will include methods that could help solve this problem by removing redundant and correlated information between channels, by automatically extracting lower-dimensional encodings of the original multidimensional data.

2.4 Representation Learning

Representation learning, or feature learning, is a central concept in machine learning. Real-world data is often unstructured and high-dimensional. Thus, most tasks require us to design features that describe the data before doing tasks like classification. The problem with this is that one needs to hand-pick these features, and the features selected will usually be sub-optimal. In supervised learning, one successful way of doing representation learning is by using deep neural networks. These networks contain layers that act as different abstraction levels and help the model extract the most valuable features for the given classification task. Deep models like convolutional neural networks learn to detect low-level features, like edges, in the first layer(s) and higher-level features, like facial features, in the "deeper" layers. The output from the final layer is a feature representation of the original input. The main benefit of these models is the capability to learn the representation in conjunction with the final decision boundary, which means one does not have to design the features manually.

Figure 21: AlexNet [23], an example of a deep convolutional neural network. The earlier layers learn to detect low-level features, while the later layers combine these lower-level features into high-level features.

There are also unsupervised methods for finding feature representations from "raw data" using dimensionality reduction. Principal Component Analysis (PCA) is a common technique for reducing the dimensionality of feature spaces in an unsupervised manner. Another method for reducing the dimensionality of the data is autoencoders. Autoencoders are neural networks that consist of an encoder and a decoder network (see the example in Figure 22). During training, the networks are coupled together and try to recreate the input by minimizing the reconstruction error. The encoder network usually encodes the data into a smaller representation with some information loss, often referred to as a latent representation, while the decoder will try to

recreate the original input data from this encoded representation. The training of the whole network is done like a regular feed-forward network, through backpropagation. Minimizing the error between the input and output results in the latent representation being a low-dimensional representation of the original input data. There are numerous ways of implementing autoencoders depending on the data type and use case. The next section will examine how representation learning techniques like autoencoders have been applied to changepoint detection in previous research.

Figure 22: Example of a basic autoencoder architecture. The image output by the network closely resembles the input image. The latent layer in the middle of the network serves as a good low-dimensional encoding of the input image.

2.4.1 Representation learning for change point detection

One popular supervised method for segmenting data is the well-known convolutional segmentation network U-net [24]. Even though this network has mostly been utilized on image data, there are examples of U-net-inspired architectures that work well on time series data [19] [20]. Since U-net needs to be trained on labeled data and this thesis is focused on unsupervised segmentation methods, further details on U-net will not be provided here. There has been some research focused on applying representation learning to typical change point detection problems. Autoencoders are, for example, often used for anomaly detection [27]–[29]. The most common approach is to train the autoencoder on a dataset with no or relatively few anomalies. When doing inference, one can use the reconstruction error as a measure of how likely the current data is to contain an anomaly. The reconstruction error will be small when the input data is normal and high when there is an anomaly, because the model has learned to represent the normal data better than data with anomalies. While this approach successfully finds changepoints that are anomalies, it is not straightforward to use it for segmentation, because the different segments are a natural part of the data. An autoencoder trained on all the data would also have a low reconstruction error for all the different segments, and the reconstruction error would thus not be a good indicator of these changepoints.


It would be reasonable to think that measuring the distance between vectors of low-dimensional representations of the data would benefit semantic segmentation. This approach is often used in natural language processing to represent the semantic similarity of words. The idea is that words with a similar context should be represented by word vectors/embeddings close to each other. One of the most popular models for making these word embeddings is called Word2Vec [30]. Here, a neural net is used to map each word into a compact representation called a vector. The mapping tries to ensure that words with similar semantic meaning will be close to each other in the vector space. The article "Representation learning for unsupervised heterogeneous multivariate time series segmentation and its application" [31] tries to segment out driving patterns based on sensor data collected from driving vehicles. The authors use the word2vec model to make an encoded representation of time series data consisting of both categorical and numerical data in varying scales. They use this model because there is no universal distance measure that works for similarity search and segmentation between multivariate data. By applying this word embedding method, the data are transformed into continuous vectors, allowing them to be segmented using conventional distance metrics such as the Euclidean distance. This article shows that representation learning holds great potential for the segmentation of multidimensional time series data collected from multiple sources.

Figure 23: Block diagram showing the modules used in the article "Representation learning for unsupervised heterogeneous multivariate time series segmentation and its application" [31]. It shows representation learning being used to combine channel data from different sources before segmentation. In the article, the word embedding model word2vec is used for the representation learning module.

There are also other ways to use representation learning on multidimensional data. One example is to use PCA to project the multidimensional data onto the principal components, as done in the article "A PCA based Change Detection Framework for multidimensional data streams" [32]. Multichannel data can often have some correlated channels that do not yield much useful information for segmentation tasks. Say one has a dataset where the data is collected from two sensors at each arm and one sensor on the leg. For many tasks, the sensors on the arms would be more correlated with each other than with the leg sensor. If the segmentation model does not capture this correlation, the leg's sensor data can easily be "drowned out". Using representation learning techniques like PCA, these changes in channel correlations can be captured. Another benefit is that one can get a more general and compact representation of the time series data by only projecting the data onto the n first principal components. As stated before, using the latent representation found by autoencoders has many of the same benefits

as PCA. The paper "Time Series Segmentation through Automatic Feature Learning" [12] uses an autoencoder to do automatic feature representation/feature learning before segmenting the time series. As segmenting time series data based on extracted latent representations is central to this thesis, the next section gives a more detailed overview of the methods used in that paper.

2.4.2 Time Series Segmentation through Automatic Feature Learning

Figure 24: Model depiction from the paper “Time Series Segmentation through Automatic Feature Learning” [12]

The input to the autoencoder is the time series partitioned into subsequences of length Nw. These subsequences might have some overlap, as shown in Figure 24. The channels of each subsequence are concatenated into a single one-dimensional vector. Thus, the input size for time step t is the subsequence length Nw multiplied by the number of channels NC. The encoder Enc encodes the input st at time step t to a feature representation ft, while the decoder Dec maps the feature representation back to the original input space. The encoder's parameters consist of weights W and bias be, and the decoder's of weights W' and bias bd. The autoencoder used in the article has tied weights, which means that the weights in the decoder are the transposed weights of the encoder (W' = WT). The activation function used is the sigmoid.
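A minimal PyTorch sketch of such a tied-weights encoder/decoder pair (Equations 3 and 4 below) might look as follows; the layer sizes, initialization, and single-layer structure are illustrative simplifications, not the exact architecture used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedAutoencoder(nn.Module):
    """Fully connected autoencoder whose decoder reuses the transposed encoder weights."""

    def __init__(self, input_dim: int, feature_dim: int):
        super().__init__()
        self.W = nn.Parameter(torch.randn(feature_dim, input_dim) * 0.01)
        self.b_e = nn.Parameter(torch.zeros(feature_dim))
        self.b_d = nn.Parameter(torch.zeros(input_dim))

    def encode(self, s):
        return torch.sigmoid(F.linear(s, self.W, self.b_e))       # f = sigmoid(W s + b_e)

    def decode(self, f):
        return torch.sigmoid(F.linear(f, self.W.t(), self.b_d))   # s~ = sigmoid(W' f + b_d), W' = W^T

    def forward(self, s):
        return self.decode(self.encode(s))

# Training sketch: minimize the reconstruction error with Adam
model = TiedAutoencoder(input_dim=300, feature_dim=30)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = torch.rand(16, 300)              # stand-in for concatenated subsequences scaled to [0, 1]
loss = F.mse_loss(model(batch), batch)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```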

$$f = Enc(s) = \mathrm{sigmoid}(W s + b_e)$$

Equation 3: Equation for the encoded representation f

$$\tilde{s} = Dec(f) = \mathrm{sigmoid}(W' f + b_d)$$

Equation 4: Equation for the reconstructed input s̃

The objective of the autoencoder is, as usual, to minimize the reconstruction error between the input st and the reconstructed input s̃. The weights and biases are initialized randomly and updated using standard gradient descent. After training, the autoencoder is used to detect what the authors define as breakpoints, which are "human-specified changepoints". The algorithm used for finding these breakpoints is outlined below.


Algorithm name: Breakpoint Segmentation

1. for each subsequence st:
2.     Extract features ft according to Equation 3
3.     Compute the distance Distt between consecutive feature vectors according to Equation 5
4.     if Distt is a local-maximal distance:
5.         Classify t as a breakpoint

Figure 25: Outline of the Breakpoint Segmentation algorithm

$$Dist_t = \frac{\lVert f_t - f_{t-1}\rVert_2}{\sqrt{\lVert f_t\rVert_2 \times \lVert f_{t-1}\rVert_2}}$$

Equation 5: Formula for calculating the distance between the current and previous feature at time t

The numerator in the distance metric (Equation 5) is the Euclidean distance between features at consecutive time steps, and the denominator is a normalization term. The breakpoint segmentation algorithm computes the distance at each time step, resulting in a distance curve. A distance that is high relative to its local area indicates a change in the underlying process and is thus classified as a breakpoint. A sensitivity analysis is also provided to give practical guidelines for choosing hyperparameters. Three datasets from different real-world domains (the EEG eye state data set [33], the UCI human activity recognition data set [34], and the DCASE2016 sound data set [35]) were used for the sensitivity analysis. Three main factors could impact the performance: subsequence size, model depth, and feature size. As one would expect, the optimal subsequence size varied between the datasets: the best window size was 25 for the EEG data, 400 for the UCI data, and 20000 for the DCASE data. The experiments found that the best performance on all three datasets was achieved using a model with two hidden layers. The feature size, the ratio between the feature representation and the input (dim(ft)/dim(st)), had a negligible effect on the performance. The feature sizes tested were in the range from 0.05×st to 0.5×st.

2.5 Evaluation Metrics

Evaluation methods are essential for a fair comparison between CPD algorithms. Many papers written about changepoint detection do not contain concrete evaluation metrics but rely solely on visual inspection [4], making it hard to compare algorithms objectively. Visual inspection can also be quite time-consuming when ranking multiple algorithms on a wide array of data from different domains. A fair evaluation metric is thus crucial for evaluating CPD algorithms. The evaluation criteria must be based on the type of output the CPD/segmentation algorithm yields. If the CPD model is a binary classifier (deciding for each time step whether a changepoint is present), or change point detection is done with some temporal tolerance (deciding whether a changepoint occurs within a time interval), it is possible to use standard machine

learning metrics. What follows are examples of evaluation techniques that can be used with these types of models.

                Classified as non-CP      Classified as CP
True non-CP     True negative (TN)        False positive (FP)
True CP         False negative (FN)       True positive (TP)

Accuracy is the ratio of correctly classified data points to the total number of data points.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$

Equation 6

Sensitivity/Recall is the portion of the positive class (in this case, true CPs) that was classified correctly.

$$\mathrm{Sensitivity/Recall} = \frac{TP}{TP + FN}$$

Equation 7

Precision is the ratio of true positive (true CP) data points to the total number of data points classified as CP.

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

Equation 8

The F1 score is a balanced metric that combines precision and recall by calculating a weighted harmonic mean of the two.

$$F_\beta = (1 + \beta^2) \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}}$$

Equation 9: β is a positive real factor chosen such that recall is considered β times more important than precision

Another balanced metric often used in machine learning is the Matthews correlation coefficient (MCC).

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$

Equation 10

When the class distribution is highly skewed, some of these metrics give very little information about the actual performance of the classifier. If a negative sample is 100 times more likely to appear than a positive one, the model's accuracy would be nearly 100% just by classifying every sample as negative. This scenario is common in changepoint detection, because the number of changepoints is usually much smaller than the number of non-changepoints. Looking at the confusion matrix and using class-balanced metrics like the F1 score gives a better impression of the actual performance of the algorithm. Which metrics to choose also depends on the real-world use case of the algorithm, as different misclassifications can have different costs. If one were to make a CPD system that detects lethal anomalies in ECG data recorded from patients' hearts and alerts a human expert for proper examination, missing a true changepoint would have much more dire consequences than detecting a changepoint where there is none. In this case, one would probably prioritize a high sensitivity over a high precision. Plotting the true positive rate (TPR/sensitivity) against the false positive rate (FPR) provides an intuition about the trade-off between the two metrics. This type of plot is referred to as a Receiver Operating Characteristics (ROC) curve. The optimal algorithm would have a high TPR while still maintaining a low FPR, which means that the ROC curve for a good CPD algorithm should be pushed towards the upper left corner. A more concrete metric is the AUC (Area Under the Curve), which is, as the name implies, the area under the ROC curve. The closer the ROC curve is to the top-left corner, the closer the AUC is to 1. In "Time series segmentation through Automatic feature learning" [36] (detailed in Section 2.4.2), ROC curves were used to find the best model depth for the autoencoder. Figure 26 depicts the ROC curves for different model depths found by running the algorithm on the UCI dataset [33].

Figure 26: Example from the article "Time series segmentation through Automatic feature learning" [36] (detailed in Section 2.4.2), where a ROC curve was made to help evaluate the best model depth for an autoencoder based on the performance on the UCI dataset [33].

In time series changepoint detection and segmentation, hitting the changepoints' exact locations is often less important than the time difference between the predicted and actual changepoints. In these cases, metrics that utilize the distance to the true changepoint locations, instead of the class labels, are more telling of the performance. The most commonly used metrics in this context are the Mean Square Error (MSE) and the Mean Absolute Error (MAE). MAE calculates the distance in time by finding the absolute difference between each ground truth CP and the nearest predicted CP before averaging over the number of true changepoints.


$$\mathrm{MAE} = \frac{\sum_{i=1}^{N_{GT}} |CP_{pred} - CP_{actual}|}{N_{GT}}$$

Equation 11: Where N_GT is the number of ground truth changepoints

MSE is similar to MAE, except that it uses the squared difference instead of the absolute difference. Because the error measure increases quadratically, MSE will penalize outlier CPs more than MAE.

$$\mathrm{MSE} = \frac{\sum_{i=1}^{N_{GT}} (CP_{pred} - CP_{actual})^2}{N_{GT}}$$

Equation 12: Where N_GT is the number of ground truth changepoints

There are also other similar metrics, like the mean signed difference (MSD), root mean square error (RMSE), and normalized root mean square error (NRMSE). The prediction ratio is the number of predicted CPs divided by the number of ground truth CPs. The optimal prediction ratio is 1: if the ratio is larger than 1 the model overpredicts, and if it is smaller the model underpredicts.

$$\mathrm{Prediction\ ratio} = \frac{N_{pred}}{N_{GT}}$$

Equation 13: Where N_pred is the number of predicted changepoints and N_GT is the number of ground truth changepoints.

If the model underpredicts, the distance error metrics MSE and MAE will usually get worse, but if the model overpredicts, these metrics will artificially improve. For example, if the model predicts every data point to be a changepoint, both the MSE and MAE would be 0, which does not represent the actual performance. To get a single metric that captures both the importance of predicting the right number of changepoints and the distance from the ground truth changepoints, the article [37] introduces a new measure called prediction loss. This metric combines MSE and the prediction ratio into one measure. Both a prediction ratio bigger and smaller than one increases the total prediction loss. A CPD algorithm needs both a good prediction ratio and a good MSE to score well on this metric.

$$\mathrm{Prediction\ loss} = \left|1 - \frac{N_{pred}}{N_{GT}}\right| \times \mathrm{MSE}$$

Equation 14: Where N_pred is the number of predicted changepoints and N_GT is the number of ground truth changepoints.

A scoring metric that is less sensitive to outliers can be made by replacing the MSE with the MAE. This metric will in this thesis be referred to as prediction loss MAE.

$$\mathrm{Prediction\ loss\ MAE} = \left|1 - \frac{N_{pred}}{N_{GT}}\right| \times \mathrm{MAE}$$

Equation 15: Where N_pred is the number of predicted changepoints and N_GT is the number of ground truth changepoints.


In the article [12], a scoring metric similar to MAE is introduced, referred to as the Regime Score. Like MAE, this metric finds the absolute difference between each predicted CP and the nearest ground truth CP before summing them. This sum is then divided by the number of ground truth changepoints and the length of the time series, resulting in a normalized score between zero and one.

$$\mathrm{Regime\ score} = \frac{\sum_{i=1}^{N_{GT}} |CP_{pred} - CP_{actual}|}{N_{GT} \cdot n}$$

Equation 16: Where N_GT is the number of ground truth changepoints, and n is the length of the time series
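For reference, the distance-based metrics above could be computed along the following lines; this is an illustrative sketch, and the nearest-neighbor matching follows the verbal definitions given in this section.

```python
import numpy as np

def cpd_metrics(cp_pred, cp_true, n):
    """Distance-based CPD metrics (Equations 11-16), computed from predicted and true CP locations."""
    cp_pred, cp_true = np.asarray(cp_pred), np.asarray(cp_true)
    # distance from each ground-truth CP to its nearest prediction (MAE / MSE)
    d_true = np.array([np.min(np.abs(cp_pred - t)) for t in cp_true])
    mae, mse = d_true.mean(), (d_true ** 2).mean()
    pred_ratio = len(cp_pred) / len(cp_true)
    # regime score sums distances from each prediction to its nearest ground-truth CP
    d_pred = np.array([np.min(np.abs(cp_true - p)) for p in cp_pred])
    return {
        "MAE": mae,
        "MSE": mse,
        "prediction_ratio": pred_ratio,
        "prediction_loss": abs(1 - pred_ratio) * mse,
        "prediction_loss_MAE": abs(1 - pred_ratio) * mae,
        "regime_score": d_pred.sum() / (len(cp_true) * n),
    }
```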

2.6 Spectral density estimation

In spectral density estimation, one wants to automatically determine the frequency content of a signal by estimating its power spectral density (PSD). The power spectral density of a wide-sense stationary process is defined as the Fourier transform of the autocorrelation of the signal.

Figure 27: A power spectrum of two sinusoids. From the periodogram it is easy to see that the sinusoids have frequencies around 35 Hz and 45 Hz.

One application of spectral density estimation is period estimation. A distinct high peak in the power spectrum at frequency f is a good indication of a periodic signal with a period equal to 1/f. As sampled time series data does not have an infinite number of samples, and the data often contain some noise, the exact power spectrum of a given process cannot be found directly, which means it has to be estimated.

2.6.1 Periodograms

The most straightforward estimator is the periodogram. It stays close to the definition of the power spectrum, since it is defined as the Fourier transform of the autocorrelation of the sampled time series.


$$\hat{P}_{xx}(\omega) = \sum_{k=-N+1}^{N-1} R_{xx}(k)\, e^{-j\omega k}$$

Equation 17: Formula for calculating the periodogram of a time series x

$$R_{xx}(k) = \begin{cases} \dfrac{1}{N}\sum_{n=0}^{N-1-k} x(n+k)\,x(n), & k \ge 0 \\[4pt] R_{xx}(-k), & k < 0 \end{cases}$$

Equation 18: Formula for calculating the autocorrelation of the time series x

This estimator is both biased and suffers from high variance. The bias comes from "indirectly" applying a rectangular window when doing the Fourier transform, which means the expected value of the periodogram is the true power spectrum convolved with a squared sinc function.

2.6.2 Windowed Periodograms

One way to reduce the bias is to use windowed periodograms, $\hat{P}^{w}_{xx}(\omega)$. This method reduces the bias by applying a window function to the time series before the Fourier transform. Figure 28 shows different window functions in both the time and frequency domain, and Figure 29 shows the effect of using the windows on a signal composed of two sinusoids with added white Gaussian noise. Applying the window functions reduces the ripples around the main lobe, leading to a less biased estimate. This comes at the cost of making it harder to separate close frequencies in the power spectral density estimate.

Figure 28: Window functions in time and frequency domain. From lecture “Spectral Estimation I” in INF4480, slide 40


Figure 29: Two close sinusoids with frequencies at 0.145 and 0.150 with added white Gaussian noise, showing the effect of using different window functions. From lecture "Spectral Estimation I" in INF4480, slide 41.

2.6.3 Welch's method

A common way to reduce variance in statistics is to average over multiple realizations of a process. Multiple realizations of a time series can be made by splitting it up into smaller parts. One can then estimate the PSD for each part before averaging over the resulting PSDs to reduce the variance of the PSD estimator.

$$\hat{P}_{xx}(\omega) = \frac{1}{K}\sum_{i=0}^{K-1} \hat{P}^{\,i,w}_{xx}(\omega)$$

Equation 19: Welch's method for estimating the power spectrum. $\hat{P}^{\,i,w}_{xx}(\omega)$ is the i'th windowed periodogram.

Welch's method [38] tries to reduce both the bias and the variance of the power spectral density estimate by averaging over short windowed periodograms. This comes at the cost of a lower spectral resolution, because each PSD estimate uses fewer samples than would have been the case when using the whole time series. The main idea is to divide the time series into K segments of size L, offset each segment by D data points, calculate the windowed periodograms, and average over the resulting periodograms (see Equation 19). The partitioning of the time series is depicted in Figure 30.

Figure 30: Visual depiction of how the time series is divided into sub-parts when using the Welch Method
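In practice, Welch's method is available in standard signal processing libraries; the snippet below sketches a typical use with SciPy, where the sampling rate, segment length, and overlap are illustrative choices.

```python
import numpy as np
from scipy.signal import welch

fs = 100.0                                   # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 35 * t) + np.sin(2 * np.pi * 45 * t) + np.random.randn(t.size)

# Hann-windowed segments of length 256 with 50% overlap, averaged into one PSD estimate
f, Pxx = welch(x, fs=fs, window="hann", nperseg=256, noverlap=128)
peak_frequency = f[np.argmax(Pxx)]           # period estimate: 1 / peak_frequency
```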


3 Methods

This chapter is divided into three main sections based on the algorithm types: data scalers (3.1), which scale the data to values that are easier to process; CPD algorithms (3.2), which output a likelihood of a changepoint occurring at each time step; and change point extraction algorithms (3.3), which extract the locations of the change points using the output from the CPD algorithms. The predicted segments are the areas between the predicted changepoints. Notice that the CPD algorithms, as defined here, do not output the locations of changepoints but the likelihood of there being a changepoint at each time step. Table 1 shows an overview of the implemented algorithms. The underlined algorithms in Table 1 are new methods developed during this thesis, while the others are existing methods detailed in the background chapter. The following sections will go through the implementations of the various algorithms and describe the new methods and modifications to existing approaches.

Algorithm                  Full Name                                                                             Section
Data Scalers                                                                                                     3.1
NoScaler                   -                                                                                     3.1.1
StandardScaler             -                                                                                     3.1.2
MinMaxScaler               -                                                                                     3.1.3
RobustScaler               -                                                                                     3.1.4
CPD algorithms                                                                                                   3.2
LS-USS                     Latent Space Unsupervised Semantic Segmentation                                       3.2.2
LS-USS online              Latent Space Unsupervised Semantic Segmentation online                                3.2.2
FLUSS                      Fast Low-cost Unipotent Semantic Segmentation                                         3.2.3
FLOSS                      Fast Low-cost Online Semantic Segmentation                                            3.2.3
TSSTAFL                    Named after the paper "Time Series Segmentation through Automatic Feature Learning"  3.2.4
FLUSS-Welch                -                                                                                     3.2.5
CP extraction algorithms                                                                                         3.3
REA                        Region Extraction Algorithm                                                           3.3.1
LREA                       Local Region Extraction Algorithm                                                     3.3.2
LTEA                       Local Threshold Extraction Algorithm                                                  3.3.3

Table 1: Overview of the implemented algorithms. The underlined algorithms are new methods developed during this thesis.

3.1 Data Scalers

Four data scaling techniques are applied to the data when comparing the segmentation algorithms on the different datasets. These methods are implemented in scikit-learn [41] and are widely used for machine learning and data mining tasks. The scaling is done independently over all

channels, and the statistics used for scaling the data are extracted from the training data. What follows is an overview of the different scaling methods as they are implemented in scikit-learn.

3.1.1 NoScaler

NoScaler does not perform any scaling.

$$x_{scaled} = x$$

Equation 20

3.1.2 StandardScaler

StandardScaler scales each channel to zero mean and unit variance by subtracting the mean and dividing by the standard deviation.

$$x_{scaled} = \frac{x - \mu}{\sigma}$$

Equation 21

This method will alter the relative distances between samples.

3.1.3 MinMaxScaler

MinMaxScaler scales each channel to the range between 0 and 1 based on the minimum and maximum values of the data.

$$x_{scaled} = \frac{x - x_{min}}{x_{max} - x_{min}}$$

Equation 22

The relative distances between samples stay the same, but the method is very sensitive to outliers.

3.1.4 RobustScaler

RobustScaler uses statistics that are robust to outliers. It removes the median and scales the data by the IQR (interquartile range). The IQR is the range between the 1st and 3rd quartile.

$$x_{scaled} = \frac{x - x_{median}}{x_{Q3} - x_{Q1}}$$

Equation 23
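As a usage illustration, the scikit-learn scalers can be fitted on the training split and then applied to the rest of the data; the array shapes below are placeholders.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler

train = np.random.randn(1000, 3)   # [samples, channels], stand-in for training data
test = np.random.randn(200, 3)

for scaler in (StandardScaler(), MinMaxScaler(), RobustScaler()):
    scaler.fit(train)                      # statistics are estimated per channel on the training data
    test_scaled = scaler.transform(test)   # and then applied to the remaining data
```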


3.2 Change point detection algorithms

Figure 31: Flow chart depicting the different model components used for each of the CPD algorithms. The colored algorithms are part of this thesis' contribution.

The CPD algorithms in this thesis receive scaled time series data as input and output a measure of the likelihood of a changepoint occurring at each time step. Figure 31 depicts the different CPD algorithms implemented and their components (marked in blue). The arrows pointing from the CPD algorithms through the components represent the order in which the components are applied. As many of the components are shared between the different algorithms, the next subsection goes through the implementation of all the main components used, namely STAMP, the autoencoder, the LSMP, the CAC, the Distance Curve, and Welch. After this, the implementation of all the CPD algorithms (which utilize the components) is presented.

3.2.1 Components

3.2.1.1 STAMP implementation

STAMP (2.2.2) is the algorithm used for calculating the matrix profile. To implement STAMP, the Python library stumpy [39] is used. This library contains implementations of various ways to calculate exact or approximate matrix profiles depending on the hardware and time available. The specific stumpy methods used were based on the methods mstump and gpu-stump. The mstump method is for parallel computation of the multi-dimensional matrix profile and works well on multicore CPUs. As the name suggests, gpu-stump is a highly parallelized method for computing the matrix profile using GPUs. As neither of these methods supported using a temporal constraint (TC), this was added by modifying the methods to only calculate the distance profile D over a local window in the MASS algorithm (2.2.1). The online version only has access to previous data. The equations used are shown in Equation 24.


$$D[i] = \sqrt{2m\left(1 - \frac{QT[i] - m\,\mu_Q\,M_T[i]}{m\,\sigma_Q\,\Sigma_T[i]}\right)}, \quad i \in \begin{cases} [\max(0,\, Q_{idx} - TC),\ \min(N,\, Q_{idx} + TC)], & \text{if offline} \\ [\max(0,\, Q_{idx} - TC),\ Q_{idx}], & \text{if online} \end{cases}$$

Equation 24: Equation for calculating a temporally constrained distance profile

In addition to returning the matrix profile P and matrix profile index I, the methods return the left and right matrix profile indices IL and IR, which are the indices of the nearest neighbors (smallest Euclidean distance) to the left and right.

3.2.1.2 Autoencoder Implementations

Two versions of the autoencoder have been implemented: the fully connected model from the paper "Time Series Segmentation through Automatic Feature Learning" [37] (described in section 2.4.2) and a convolutional model. The latter is implemented to test whether modeling the input's temporal characteristics can help find a good latent representation.

3.2.1.2.1 Fully Connected Model

Figure 32: Fully connected model for an input size of 30. Inspired by the paper "Time Series Segmentation through Automatic Feature Learning" [37].

As in the original paper, the fully connected model is implemented with two hidden layers, transposed (tied) weights for the decoder, and the sigmoid activation function. The loss function is the mean square error, and the optimizer used to update the weights and biases is Adam [40]. The feature size, the ratio between the feature representation and the input (dim(ft)/dim(st)), is set to 0.1. Figure 32 shows the model architecture for an input size of 30.

3.2.1.2.2 Convolutional Model

The input to the fully connected model is made by concatenating the NC channels of the subsequence into a one-dimensional vector. However, in the convolutional model, the

multichannel subsequence is processed as is, which means the input shape is [NC, NW]. The first two layers of the encoder are one-dimensional convolution layers, both with stride 2, and the convolution is done over the time axis. Next, the feature map is flattened to one dimension and passed to two fully connected layers, which output the latent feature ft. The latent feature is then sent through the decoder to reconstruct the original input. In the decoder, the latent feature is first passed through transposed fully connected layers and reshaped back into 4×Nc channels before being passed through two one-dimensional transposed convolutional layers that scale it back to the same shape as the original multidimensional subsequence. Transposed convolution does not guarantee to recover an earlier feature map, but it can be considered a learnable up-sampling operation that makes it possible to recover an earlier feature map's shape. Figure 33 shows a visualization of two-dimensional transposed convolution.

Figure 33: The transpose of convolving a 3×3 kernel over a 5×5 input padded with a 1×1 border using 2×2 strides. This example shows 2D convolution, but the same principles apply to 1D convolution. Example from the journal article [41].

All the layers, except the last layer, use the ReLU activation function. The last layer must use a linear activation function to be able to reconstruct the original input. The model is trained in the same way as the fully connected model, using mean square error as the loss function and Adam as the optimizer. The feature size, which is now defined as the ratio dim(ft)/(NC×NW), is set to 1/6. For a more detailed overview of the model, see Table 2.

Table 2: Overview of the convolutional model from input to output.

Layer                    Type                               Input channels  Output channels  Kernel size  Stride  Padding  Activation
Hidden Layer 1 Encoder   Convolution                        Nc              2×Nc             3            2       1        ReLU
Hidden Layer 2 Encoder   Convolution                        2×Nc            4×Nc             3            2       1        ReLU
Reshape Encoder          Reshaping                          4×Nc            1                -            -       -        -
Hidden Layer 3 Encoder   Fully Connected                    1               1                -            -       -        ReLU
Hidden Layer 4 Encoder   Fully Connected                    1               1                -            -       -        ReLU
Hidden Layer 1 Decoder   Transposed Hidden Layer 4 Encoder  1               1                -            -       -        ReLU
Hidden Layer 2 Decoder   Transposed Hidden Layer 3 Encoder  1               1                -            -       -        ReLU
Reshape Decoder          Reshaping                          1               4×Nc             -            -       -        -
Hidden Layer 3 Decoder   Transposed Convolution             4×Nc            2×Nc             3            1       1        ReLU
Hidden Layer 4 Decoder   Transposed Convolution             2×Nc            Nc               3            2       1        Linear

Figure 34 shows the latent representation and reconstruction of two subsequences from the EMG dataset. As one can see, the reconstruction is a bit less detailed than the original subsequence but contains the same general patterns. Remember that the goal here is not to get the best possible reconstruction of the subsequence but rather to find a good encoded representation of the subsequence for segmentation.

Figure 34: The top plots show all ten channels of two subsequences of size 52 taken from the Long-Term 3DC dataset [42]. The middle plots show the latent representations of the subsequences. The bottom plots show the reconstructed subsequences. The autoencoder used is the convolutional model.

3.2.1.3 Latent Space Matrix Profile (LSMP)

The latent space matrix profile is, as the name implies, a matrix profile computed over latent representations. To produce the latent space matrix profile, the latent representation of the all-subsequence set A is needed. The all-subsequence set A of a time series T is an ordered set of all possible subsequences of T, obtained by sliding a window of length m across T. The latent all-subsequence set F is defined as the latent representations of the all-subsequence set A. Each latent representation f in F is found by feeding each subsequence s in A through a pretrained autoencoder (f = Enc(s)). Remember that the autoencoder represents multidimensional input as a one-dimensional feature vector in latent space. The latent space matrix profile is defined by the same logic used for calculating the regular matrix profile in Section 2.2, with the only difference being the use of latent representations instead of subsequences. The latent space matrix profile PF is thus defined as the vector of Euclidean distances between every latent feature in F and its nearest non-trivial neighbor in F itself. The corresponding latent space matrix profile index IF is defined as the vector containing the indices of the nearest non-trivial neighbor in F for every latent representation in F.


In section 2.2, it was mentioned that calculating the matrix profile directly using convolution in the time domain is impractical due to high time and space complexity. The solution to this was SlidingDotProduct (outlined in Figure 9), which uses the FFT to reduce the computation time substantially. Doing this is only possible if the all-subsequence set can be made using a sliding window on the time series. This is true for the all-subsequence set A but not for the latent set F, which means it is not possible to use the SlidingDotProduct method when calculating the latent space matrix profile PF. The following paragraphs will show that it is possible to work around this by using highly parallelized computation and exploiting the temporal constraint utilized when making the CAC. The formula for the Euclidean distance between the vectors x and y is given by:

$$d(x, y) = \sqrt{(x - y)\cdot(x - y)}$$

Equation 25

If the formula is expanded, the distance can be calculated from the dot product of each vector with itself and the dot product between the two vectors.

$$\sqrt{(x - y)\cdot(x - y)} = \sqrt{x\cdot x + y\cdot y - 2\,x\cdot y}$$

Equation 26

By exploiting this, and the fact that each distance calculation can be done independently, it is straightforward to vectorize the problem and run the calculations in a parallelized fashion. The GPU is perfect for this, as it is highly specialized for parallel processing of floating-point operations, making it great for doing simple math operations on vectorized data. The implementation in this thesis is based on the cdist function in the machine learning framework PyTorch [43]. This function can calculate the Euclidean distance between pairs of vectors and supports GPUs.
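To make this concrete, the sketch below computes a temporally constrained latent space matrix profile with torch.cdist; it processes one latent feature at a time for clarity, whereas the actual implementation batches rows to exploit the GPU, and the exclusion-zone width is an illustrative choice.

```python
import torch

def latent_space_matrix_profile(F, tc, excl):
    """P_F and I_F from latent features F with shape [n, d], under temporal constraint tc."""
    n = F.shape[0]
    P = torch.empty(n)
    I = torch.empty(n, dtype=torch.long)
    for i in range(n):
        lo, hi = max(0, i - tc), min(n, i + tc)
        D = torch.cdist(F[i:i + 1], F[lo:hi]).squeeze(0)           # distances inside the constraint
        D[max(0, i - excl - lo):i + excl + 1 - lo] = float("inf")  # exclusion zone around index i
        P[i], j = torch.min(D, dim=0)
        I[i] = lo + int(j)
    return P, I
```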

Figure 35: Example of the full distance matrix used to calculate the matrix profile PF from a time series of length 4000.

Calculating the matrix profile for a time series of this length is not that time-consuming on a modern GPU, but handling larger time series would get difficult, as the distance matrix requires quadratically more space and calculation time as the series grows. As seen in section 2.3.3, doing time series segmentation based on the matrix profile and the CAC only benefits from finding arc crossings inside the current and next segment. If the time series includes repeated similar segments, the CAC might even be rendered useless. Therefore, the authors introduced the temporal constraint, which also radically decreases the computation time. A TC can be applied to the matrix profile PF, which means that only the distances between the latent features located inside the temporal constraint need to be calculated. Applying the temporal constraint results in linear time and space complexity.

Figure 36: Results of only calculating the distances inside a temporal constraint TC of 800. To exclude trivial matches, the distances along the diagonal are also not calculated.

The Euclidean distances are not normalized like in the STAMP algorithm. This is because the CAC only needs the matrix profile indices IF and not the similarity scores provided by the matrix profile (the index of the shortest distance is the same whether the distance is normalized or not). To find the matrix profile PF and the matrix profile indices IF, the distance matrix is collapsed by saving the minimum distance and the corresponding index at each time step.


Figure 37: The collapse of the distance matrix into the matrix profile, done by extracting the minimum value of each row. (Note: the matrix profile shown here is only for visualization and does not represent the actual matrix profile.)

For long time-series data, a memory problem might arise because the algorithm stores 2*TC data points per time step to build the temporally constrained distance matrix. However, using the TC makes it possible to collapse the distance matrix before processing the whole time series. The algorithm for doing this is outlined in Figure 38.

Algorithm name: Batched Collapse

1. Calculate the distance matrix D up to a specified number of time steps tlim
2. Collapse the distance matrix to get the matrix profile P
3. Keep P, delete D
4. While tlim < length(T):
    a. tlim_prev = tlim
    b. tlim += tlim
    c. Calculate the new distance matrix Dnew from tlim_prev - (TC*2 - 1) to tlim
    d. Collapse the new distance matrix Dnew to get the matrix profile Pnew
    e. Where P and Pnew overlap in time: keep min(P, Pnew), delete max(P, Pnew)
    f. P = Pnew
    g. Keep P, delete Dnew

Figure 38: Outline of the Batched Collapse algorithm.

As seen in the Batched Collapse algorithm, collapsing the matrix profile before processing the entire time series comes at the cost of recalculating the last (TC*2 - 1) timesteps of the matrix profile. This recalculation is necessary because the first time steps in the new matrix profile can point back to the old one. Figure 39 and Figure 40 give a visual intuition about why this is necessary.
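The sketch below illustrates the Batched Collapse idea under the assumption that the latent features F fit in memory and only the distance matrix is built block by block; the function names are illustrative and this is not the thesis implementation:

import numpy as np

def collapse_block(F, start, stop, tc):
    """Collapse rows [start, stop) of the TC-banded distance matrix into a matrix
    profile segment (minimum distance and index of the nearest neighbour)."""
    n = F.shape[0]
    p = np.full(stop - start, np.inf)
    idx = np.zeros(stop - start, dtype=int)
    for i in range(start, stop):
        lo, hi = max(0, i - tc), min(n, i + tc + 1)
        d = np.linalg.norm(F[lo:hi] - F[i], axis=1)
        d[i - lo] = np.inf                       # exclude the trivial self-match
        j = int(np.argmin(d))
        p[i - start], idx[i - start] = d[j], lo + j
    return p, idx

def batched_collapse(F, tc, batch_size):
    """Sketch of Batched Collapse: collapse the banded distance matrix batch by
    batch, recomputing the last 2*tc - 1 timesteps and keeping min(P, Pnew)."""
    n = F.shape[0]
    P = np.full(n, np.inf)
    I = np.zeros(n, dtype=int)
    t = 0
    while t < n:
        start = max(0, t - (2 * tc - 1))         # re-enter the overlap region
        stop = min(n, t + batch_size)
        p_new, i_new = collapse_block(F, start, stop, tc)
        better = p_new < P[start:stop]           # merge: keep min(P, Pnew)
        P[start:stop][better] = p_new[better]
        I[start:stop][better] = i_new[better]
        t = stop
    return P, I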


Figure 39: The Batched Collapse algorithm applied to a time series of length 10400. The calculation is divided into three batches to reduce the memory requirements for long time-series data. Each batch produces one matrix profile. The orange squares show the area that must be recalculated when doing a "batch collapse".

Figure 40: The top figure shows the temporally constrained matrix profile of a time series of length 10400. The second figure shows the three batched matrix profiles that were calculated. The bottom figure shows the full matrix profile after taking the minimum value in the overlapping areas.

The "batch collapse" could also be used to make the algorithm ɛ real-time by accumulating a batch of data before adding it to the end of the matrix profile. Doing this entails a tradeoff between batch size and calculation time, because every time a batch is added, TC*2 - 1 timesteps must be recalculated. In section 2.3.1, FLOSS was made to work online by only using the right-pointing arcs. This trick can be borrowed to make the LSMP updateable in real time as well. If one only looks for the closest distance forward in time, no subsequence can have a nearest neighbor in the previous matrix profile. An example of this is shown in Figure 41. This makes it possible to increment the matrix profile without having to do extra calculations. As with FLOSS, using only forward-pointing arcs will usually yield a slight decrease in change point detection performance compared to using both forward- and backward-pointing arcs.
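A minimal sketch of this online update is given below, assuming the latent features arrive one at a time; the class and attribute names are illustrative:

import numpy as np

class OnlineLatentMatrixProfile:
    """Sketch of the online LSMP update with right-pointing arcs only.
    Each incoming latent feature can only become the nearest neighbour of
    features that arrived before it, so nothing has to be recomputed."""

    def __init__(self, tc):
        self.tc = tc
        self.F = []          # latent features seen so far
        self.P = []          # forward-pointing matrix profile
        self.I = []          # forward-pointing matrix profile index

    def update(self, f_new):
        f_new = np.asarray(f_new, dtype=float)
        t = len(self.F)
        lo = max(0, t - self.tc)
        for j in range(lo, t):                       # only inside the temporal constraint
            d = float(np.linalg.norm(self.F[j] - f_new))
            if d < self.P[j]:                        # a closer *future* neighbour
                self.P[j], self.I[j] = d, t
        self.F.append(f_new)
        self.P.append(np.inf)                        # no future neighbour yet
        self.I.append(-1)
        return self.P, self.I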


Figure 41: The online version of the LSMP on a time series of length 10400. No recalculation is needed when updating the distance matrix.

3.2.1.4 CAC Implementation

The CAC, detailed in section 2.3, is found by counting the number of arc crossings over each index using the matrix profile index. In this thesis, a method from the stumpy API called _cac is used to produce the CAC. Support for multidimensional time-series data is added to the original implementation by averaging over the CACs calculated from each channel.

3.2.1.5 Distance Curve

The distance curve used in TSSTAFL is made by measuring the Euclidean distance between consecutive features and is implemented according to Equation 5 in section 2.4.2. For the CAC, low values indicate changepoints; for the distance curve, however, high values indicate changepoints. Because of this, a modification to the distance curve has been made that allows the extraction algorithms to be used interchangeably between the different CPD algorithms in this thesis:

$\text{distance curve}_{\text{new}} = 1 - \text{distance curve}_{\text{original}}$ (Equation 27)

This modification flips the distance curve upside down, so the minimum values now indicate change points. Since the distance curve is already normalized between zero and one, the maximum value will now be one and the minimum value zero. For simplicity, both the distance curve and the corrected arc curve will hereafter be referred to as CAC.
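The sketch below shows one way the flipped distance curve could be computed from consecutive latent features; the min-max rescaling is an assumption standing in for the normalization of Equation 5:

import numpy as np

def flipped_distance_curve(F):
    """Distance curve between consecutive latent features, rescaled to [0, 1]
    and flipped so that low values indicate change points (like the CAC)."""
    F = np.asarray(F, dtype=float)
    dc = np.linalg.norm(F[1:] - F[:-1], axis=1)           # consecutive distances
    dc = (dc - dc.min()) / (dc.max() - dc.min() + 1e-12)  # rescale to [0, 1]
    return 1.0 - dc                                       # minima mark change points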


3.2.1.6 Welch Implementation

Figure 42: The top figure shows the original "Smart Cane" time series from the UCR dataset. The middle figure shows the periodogram of the time series. The bottom figure shows the periodogram created using Welch's method.

As known from section 2.3, the authors of the article "Domain agnostic online semantic segmentation for multi-dimensional time series" [12] presented a segmentation algorithm, FLUSS, that only requires one parameter: the subsequence length. They also state that this parameter can be set to roughly one period. In [12], the subsequence length was found by visual inspection. If the signal is relatively periodic over time, there should be a way to estimate this parameter automatically. Section 2.6 showed that one way to find the periodicity of a signal is to search for the highest peak in the signal's power spectrum. A high peak in the power spectrum at frequency f is a good indication of a periodic signal with a period equal to 1/f. There are multiple ways to estimate the signal's power spectral density, depending on how much variance or bias can be tolerated. Since the FLUSS and FLOSS algorithms are not very sensitive to variations in subsequence length, one does not have to pinpoint the signal's exact periodicity, which makes Welch's method a good fit, since it trades some spectral resolution for reduced variance. This can be seen clearly in Figure 42, where the standard periodogram has high spectral resolution and more variance, while the modified periodogram from Welch's method has lower spectral resolution but less variance.


For the implementation of Welch's method, the open-source Python library SciPy [44] is used with the following hyperparameters: the length of each segment (L) is set to 256, the offset between segments (D) is set to 128, and a Hann window is used.
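A small sketch of this estimation step is shown below, using scipy.signal.welch with the hyperparameters listed above; the sampling-rate handling and the rounding are assumptions:

import numpy as np
from scipy.signal import welch

def estimate_subsequence_length(x, fs=1.0):
    """Estimate a subsequence length as one period of the dominant frequency,
    using Welch's method with a Hann window, L = 256 and D = 128."""
    f, pxx = welch(x, fs=fs, window="hann", nperseg=256, noverlap=128)
    f, pxx = f[1:], pxx[1:]               # skip the DC component at f = 0
    f_peak = f[np.argmax(pxx)]            # highest peak in the power spectrum
    return int(round(fs / f_peak))        # period in samples ~ 1 / f_peak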

3.2.2 Latent Space Unsupervised Semantic Segmentation

Figure 43: Overview of the LS-USS and LS-USS online algorithms.

As seen in the background, representation learning has proven beneficial for segmenting high-dimensional data, partly because it allows correlated or redundant information to be filtered out of the time series. Combining representation learning with the basic building blocks used in FLUSS could resolve some of the problems with high-dimensional time series data (see section 2.3.4) and allow for better utilization of data from multiple sources. This is the thought process behind developing the segmentation algorithm Latent Space Unsupervised Semantic Segmentation (LS-USS). All the primary components used to build LS-USS have been detailed in 3.2.1. As mentioned earlier, it utilizes components from both FLUSS and TSSTAFL. LS-USS uses an autoencoder to encode the multidimensional all-subsequences set A into the one-dimensional latent all-subsequences set F. After this, the latent space matrix profile (LSMP) PF and the corresponding LSMP index IF are constructed. The indices IF are then used to make the CAC, the curve that contains the number of arc crossings at each time step. Few arc crossings indicate a high likelihood of a changepoint at that time step, while a high number of arc crossings indicates a low likelihood. LS-USS online is similar to LS-USS, except that it uses the version of the LSMP that works online by only considering right-pointing arcs. Doing this makes it possible to update IF without recomputing the distance matrix for the last (TC*2 - 1) timesteps. As mentioned, the regular LSMP can also be made ɛ real-time by accumulating a batch of data before adding it to the end of PF, making the regular LS-USS ɛ real-time. An overview of the components used to make the two CPD algorithms is shown in Figure 43, and a minimal sketch of how they fit together is given below.
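In the sketch below, encoder stands for the encoder of a pretrained autoencoder, batched_collapse refers to the sketch under Section 3.2.1.3, and raw_arc_curve is a simplified stand-in for the CAC (it omits the correction against the idealized arc count); none of these names are taken from the thesis code:

import numpy as np

def raw_arc_curve(I):
    """Count how many nearest-neighbour arcs cross each index. FLUSS/LS-USS
    additionally divide this count by an idealized arc count to get the CAC."""
    ac = np.zeros(len(I))
    for i, j in enumerate(I):
        if j < 0:
            continue                         # index without a neighbour (online case)
        ac[min(i, j):max(i, j)] += 1
    return ac

def ls_uss(T, m, tc, encoder, batch_size=5000):
    """High-level sketch of the LS-USS pipeline for a (channels x time) array T."""
    # all-subsequences set A: sliding windows of length m over T
    A = np.stack([T[:, i:i + m] for i in range(T.shape[1] - m + 1)])
    F = np.stack([encoder(a) for a in A])    # latent all-subsequences set
    P, I = batched_collapse(F, tc=tc, batch_size=batch_size)
    return raw_arc_curve(I)                  # valleys indicate change points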

3.2.3 FLUSS and FLOSS Implementation

Figure 44: Overview of the FLUSS and FLOSS algorithms.


Section 2.3 went through the main components used in the FLUSS and FLOSS (2.3.2) algorithms. First, the STAMP algorithm produces the matrix profile, before the CAC is found by counting the number of arc crossings over each index using the matrix profile index. If there are few arc crossings over an index, it is more likely to be a changepoint than if there are many arc crossings. The difference between FLOSS and FLUSS is that the former only uses the right-pointing arcs to enable online operation.

3.2.4 TSSTAFL Implementation

Figure 45: Overview of the TSSTAFL algorithm.

Another algorithm detailed in the background (section 2.4.2) is from the paper "Time Series Segmentation through Automatic Feature Learning" [37]. As the authors did not name the method described in that paper, it will be referred to as TSSTAFL going forward. This algorithm compares the distance between consecutive subsequences represented in latent space using an autoencoder. The output from TSSTAFL is a distance curve (slightly modified in section 3.2.1.5), where local minima on this curve indicate a high likelihood of a changepoint.

3.2.5 FLUSS-Welch

Figure 46: Overview of the FLUSS-Welch algorithm.

FLUSS-Welch uses Welch's method to estimate the periodicity of the signal. This estimate is used as the subsequence length in STAMP, which produces the matrix profile. After this, the CAC is found by counting the number of arc crossings over each index using the matrix profile index. If there are few arc crossings over an index, it is more likely to be a changepoint than if there are many arc crossings. The difference between this and regular FLUSS is that the subsequence size used in STAMP does not have to be manually specified when using FLUSS-Welch.

3.3 Change Point Extraction Algorithms

Changepoint extractors locate changepoints using the output from the CPD algorithms. The predicted segments are the areas between the predicted change points. The extractors in this thesis are used to find the changepoints based on the CAC.


The first section details the implementation of REA (2.3.1) used in [12]. The two sections after that go through the extraction algorithms LREA and LTEA. The former extracts changepoints based on local statistics, and the latter is a changepoint extractor that can be used with online data streams.

3.3.1 Regime Extraction Algorithm Implementation

The Regime Extraction Algorithm (REA) was presented in section 2.3.1. It searches for the k lowest "valley" points in the CAC and utilizes an exclusion zone around each valley point to avoid all changepoints ending up in the same place. In this thesis, it is implemented using the method _rea from the stumpy API [39].

3.3.2 Local Regime Extractor Algorithm

REA was presented as a simple algorithm for extracting segments based on the k lowest "valley" points of the CAC. There are, however, some problems with this algorithm for long time-series data. In the datasets used in this thesis, the changepoints are usually distributed relatively evenly in time, which means that when extracting changepoints, one cares more about local minima than about global minima. The REA algorithm will always pick the k globally lowest "valley" points on the CAC instead of locating where the CAC is at its lowest compared to the local area around it. To combat this, a method that scales the CAC depending on the local statistics is introduced. The method scales each point in the CAC to zero mean and unit variance based on the local mean and standard deviation calculated over a rolling window (as shown in Equation 28). Figure 47 shows the effect of scaling the CAC. After the scaling is done, the same procedure used in the REA algorithm extracts the k lowest "valley" points from the scaled CAC.

$CAC_{scaled} = \dfrac{CAC - \mu_{rolling}}{\sigma_{rolling}}$

Equation 28: Where $\mu_{rolling}$ and $\sigma_{rolling}$ are time series that contain the rolling mean and rolling standard deviation of the CAC.
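A minimal sketch of this scaling using a rolling pandas window is shown below; the centered window is an assumption:

import pandas as pd

def scale_cac(cac, window):
    """Scale the CAC with its rolling mean and standard deviation (Equation 28).
    The thesis sets the window to the average segment length of the training data."""
    s = pd.Series(cac)
    mu = s.rolling(window, center=True, min_periods=1).mean()
    sigma = s.rolling(window, center=True, min_periods=1).std()
    return ((s - mu) / sigma).to_numpy()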

Figure 47: The top plot shows the original CAC, while the bottom plot shows the scaled CAC.

The downside of this algorithm is that it assumes that the changepoints are somewhat evenly distributed, but it should yield good results if the size of the rolling window is large enough to capture the local statistics of the CAC. If one increases the window size, the extraction will be increasingly global and more like the original REA.

If one reduces the window size, the extraction will be more local. This thesis does not include any analysis of the optimal local window size, but the algorithm usually yields good results when it is set to the average segment length. The LREA algorithm is outlined in Figure 48.

Algorithm name: Local Regime Extracting Algorithm (LREA)

LREA(CAC, numRegimes)
Inputs:
  CAC – a Corrected Arc Curve or Distance Curve
  numRegimes – number of regimes

Outputs:
  locRegimes – the locations of the change points

1.  CAC_scaled = empty array with same length as CAC
2.  avg_seg_length = length(CAC) / numRegimes
3.  for i = 1 : length(CAC)
4.      local_window = CAC[i - (avg_seg_length/2) : i + (avg_seg_length/2)]
5.      local_mean = mean(local_window)      // mean over a local window
6.      local_std = std(local_window)        // standard deviation over a local window
7.      CAC_scaled[i] = (CAC[i] - local_mean) / local_std   // scale to zero mean and unit variance
8.  locRegimes = empty array of length numRegimes - 1
9.  for i = 1 : numRegimes - 1
10.     locRegimes(i) = indexOf(min(CAC_scaled))
11.     set exclusion zone                   // to prevent matches to "self"
12. end
13. return locRegimes

Figure 48: Outline of the LREA algorithm.

3.3.3 Local Threshold Extraction Algorithm

For comparisons between the offline CPD algorithms in this thesis, the algorithms REA and LREA are used. These algorithms require information about the number of segments, which is often not available. LREA and REA work great for comparing the CPD algorithms but have limited use in real-world segmentation tasks. To address this, the Local Threshold Extraction Algorithm (LTEA) is presented. As the name suggests, this algorithm works by scaling the CAC before doing threshold-based change point extraction. The same CAC scaling used in the LREA algorithm is also utilized in LTEA. When the algorithm is used on online streaming data, the rolling statistics are only based on previous data points. Since the number of changepoints/segments is not given, it is more challenging to set the local window size. Setting this parameter requires some domain knowledge, but so does setting the temporal constraint. In this thesis, the local window size is set to 2*TC. In LTEA, the scaled CAC values above a given threshold are disregarded by being set to the value one. As the scaled CAC is standardized to zero mean and unit variance, setting the threshold is relatively trivial. Usually, the threshold can be set to around minus one standard deviation. To get better performance, one can find a good threshold by visualizing the scaled CAC for a small amount of data.


If some labeled data are available, one can also do a hyperparameter search to find a suitable threshold.

Figure 49: The top plot shows channel 0 and the true changepoint locations for a subject in the UCI dataset. The second plot shows the associated CAC. The third plot shows the scaled CAC. The fourth plot shows the thresholded CAC with the predicted changepoints in each "valley". The last plot shows the predicted changepoints in conjunction with channel 0 and the true locations.

After thresholding the CAC, one gets some clear valleys separated by intervals of ones, as can be seen in the fourth plot in Figure 49. For each valley, the index of the minimum value in the valley is set to be a changepoint location. An exclusion zone like the one used in LREA and REA is also utilized to avoid trivial matches caused by valleys close in time. The algorithm is outlined in Figure 50.


Algorithm name: Local Threshold Extracting Algorithm (LTEA)

LTEA(CAC, local_window_size, threshold = -1)
Inputs:
  CAC – a Corrected Arc Curve or Distance Curve
  local_window_size – the size of the window used to normalize the CAC
  threshold – threshold below which to search for change points (defaults to -1)

Outputs:
  CPs – the locations of the change points

1.  CAC_scaled = empty array with same length as CAC
2.  for i = 1 : length(CAC)
3.      local_window = CAC[i - local_window_size : i + local_window_size]
4.      local_mean = mean(local_window)      // mean over a local window
5.      local_std = std(local_window)        // standard deviation over a local window
6.      CAC_scaled[i] = (CAC[i] - local_mean) / local_std   // scale to zero mean and unit variance
7.      if CAC_scaled[i] > threshold
8.          CAC_scaled[i] = 1                // disregard values above the threshold
9.  CPs = empty array of length n_valleys
10. for i = 1 : n_valleys
11.     CPs[i] = indexOf(min(valley_i))
12. end
13. return CPs

Figure 50: Outline of the LTEA algorithm
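The sketch below gives a simplified Python version of LTEA; scale_cac refers to the rolling-scaling sketch in Section 3.3.2, and the exclusion-zone handling is a simplification of the one described above:

import numpy as np

def ltea(cac, local_window, threshold=-1.0, excl_zone=None):
    """Simplified LTEA: scale the CAC locally, threshold it, and return the
    minimum of each remaining valley as a change point."""
    scaled = scale_cac(cac, local_window)
    masked = np.where(scaled > threshold, 1.0, scaled)    # discard values above the threshold
    cps, in_valley, start = [], False, 0
    for i, v in enumerate(masked):
        if v < 1.0 and not in_valley:                     # a new valley starts
            in_valley, start = True, i
        elif v >= 1.0 and in_valley:                      # the valley ends
            in_valley = False
            cps.append(start + int(np.argmin(masked[start:i])))
    if in_valley:                                         # a valley running to the end
        cps.append(start + int(np.argmin(masked[start:])))
    if excl_zone:                                         # drop valleys too close in time
        cps = [c for k, c in enumerate(cps) if k == 0 or c - cps[k - 1] >= excl_zone]
    return np.array(cps)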


4 Datasets

This chapter presents the datasets used to compare and test the algorithms implemented in the methods chapter. Table 3 gives an overview of the datasets and their features. The abbreviations will be used to refer to the datasets when doing experiments, and the table also shows which sections contain experiments involving each dataset. The original training data in the datasets Expressive motion with dancers and Long-Term 3DC Dataset is used to construct two artificial datasets. The following sections detail how each dataset was made and how it is used in this thesis.

Original dataset names | Abbreviation used in this thesis | Ground truth changepoint locations | Human specified subsequence sizes | Sections involving dataset
UCI Human Activity Recognition Using Smartphones Dataset | UCI | ✓ | ✕ | 5.1.1.1
Expressive motion with dancers | Dance artificial | ✓ | ✕ | 5.1.1.2
Expressive motion with dancers | Dance | ✓ | ✕ | 5.1.1.3
Long-Term 3DC Dataset | EMG artificial | ✓ | ✕ | 5.1.1.4
Long-Term 3DC Dataset | EMG | ✓ | ✕ | 5.1.1.5
Depresjon | Depresjon | ✕ | ✕ | 5.2
UCR Time Series Semantic Segmentation Archive | UCR | ✓ | ✓ | 5.2.2

Table 3: Overview of the different datasets with their corresponding abbreviations and features. The original training data in the datasets Expressive motion with dancers and Long-Term 3DC Dataset is used to construct artificial datasets.

4.1 UCI Human Activity Recognition Using Smartphones Dataset

The UCI Human Activity Recognition Using Smartphones Dataset [22] contains recordings of 30 volunteers between the ages of 19 and 48 who performed six activities while wearing a smartphone on the waist. The activities are walking, walking upstairs, walking downstairs, sitting, standing, and lying down. The data was collected from the 3-axial accelerometer and 3-axial gyroscope in a Samsung Galaxy S2 and sampled at 50 Hz. The raw acceleration signals have three main components: body movement, gravity, and noise. The sensor data was filtered using a median filter and a 3rd order low-pass Butterworth filter with a corner frequency of 20 Hz for noise removal. A low-pass Butterworth filter with a corner frequency of 0.3 Hz is used to separate the body movement and gravity signals. The reason for choosing a 0.3 Hz cut-off frequency is that gravitational forces are assumed to only have low-frequency components. This thesis uses the data from the gyroscope, the accelerometer, and the filtered body movement signal from the accelerometer.

As all these signals are sampled in all three spatial dimensions, the dataset contains nine channels in total. Nine subjects constitute the training set, five subjects are used as a validation set, while the remaining 16 subjects are picked for the test set.
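For reference, the filtering steps described above can be reproduced roughly as follows; the order of the gravity filter and the zero-phase application with filtfilt are assumptions and not details taken from the dataset documentation:

import numpy as np
from scipy.signal import butter, filtfilt, medfilt

def split_body_and_gravity(acc, fs=50.0):
    """Median filter and 20 Hz low-pass for noise removal, then a 0.3 Hz
    low-pass to separate gravity from body movement (sketch only)."""
    acc = np.asarray(acc, dtype=float)
    acc = medfilt(acc, kernel_size=3)             # median filter for spikes
    b, a = butter(3, 20.0, btype="low", fs=fs)    # 3rd order, 20 Hz corner frequency
    total = filtfilt(b, a, acc)
    b, a = butter(3, 0.3, btype="low", fs=fs)     # 0.3 Hz corner for the gravity component
    gravity = filtfilt(b, a, total)
    body = total - gravity                        # body movement component
    return body, gravity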

Channels | Time signals
Channel0 | Body accelerometer x
Channel1 | Body accelerometer y
Channel2 | Body accelerometer z
Channel3 | Body gyroscope x
Channel4 | Body gyroscope y
Channel5 | Body gyroscope z
Channel6 | Total accelerometer x
Channel7 | Total accelerometer y
Channel8 | Total accelerometer z

Figure 51: Channels zero and three in a time series from the UCI dataset.

4.2 Expressive Motion With Dancers

The dataset Expressive motion with dancers [45] was made in collaboration with professional dancers performing different dance moves that correspond to three emotional states. The dancers wore two Myo armbands, one on the calf and one on the forearm. The Myo is a low-cost, consumer-grade, eight-channel dry-electrode EMG armband that also integrates a nine-degree-of-freedom IMU (inertial measurement unit). Data from the two IMU sensors are represented by yaw, pitch, and roll coordinates and are sampled at 50 Hz. For the experiments conducted in this thesis, only the IMU channels are utilized.


      | Pitch    | Yaw      | Roll
Arm   | channel0 | channel1 | channel2
Leg   | channel3 | channel4 | channel5

The dancers were instructed to develop three choreographic moods differentiated by the emotion they subjectively represented for the performer. To generate training data, the dancers repeated each sequence of moods for around 20 seconds. In addition to this, the dancers also created dance performances using the same lexicon of sequences used when building the training set. The dataset contains data from 27 participants. Here, the training data is used as the training set, the dance performances from 9 participants are used as the validation set, and the remaining 18 are used for testing. An additional artificial dataset is made from the original training data. The new training set includes raw data from 12 participants, while the validation and test sets have been made by randomly concatenating ten to seventeen choreographic moods from the remaining six and nine participants, respectively. The validation set consists of 20 artificial time series, while the test set consists of 50.

Figure 52: Channels zero and three in a time series from the Dance dataset.

4.3 Long-Term 3DC Dataset

The Long-Term 3DC Dataset was initially used in the article "Virtual reality to study the gap between offline and real-time EMG-based gesture recognition" [46] and is made publicly available at [42]. This dataset contains data from 20 able-bodied participants between 18 and 34 years old performing eleven hand gestures (see Figure 54).

Each of the participants took part in three to four recording sessions over a period of 14 days. The EMG armband used to make this dataset is the 3DC Armband [47]. The 3DC is a ten-channel, dry-electrode, 3D-printed EMG band with a sampling rate of 1000 Hz. The armband also includes a 9-axis Magnetic, Angular Rate, and Gravity (MARG) sensor, but only the 10 EMG channels are used in this thesis. The dataset is divided into training sessions and evaluation sessions. In the training sessions, the participants were asked to hold each of the 11 gestures for 5 seconds, repeated four times by each participant on every recording day. In the evaluation sessions, the participants were asked to do a total of 42 gestures in one continuous session, with the requested gesture selected at random every 5 seconds. Each participant recorded a minimum of 6 evaluation sessions. The original paper [46] contains a more detailed description of how this dataset was made.

Figure 53: Channels zero and three in a time series from the EMG dataset.

For this thesis, the evaluation sessions are used as a separate dataset. Here, the first evaluation of six participants is used as a training set, the first evaluation of five participants is used as a validation set, and the remaining evaluation runs for every participant are used as the test set. From the training sessions, an additional artificial dataset is made. The artificial training set consists of training sessions recorded from 10 participants. As with the artificial dance dataset, the validation and test sets are made by randomly concatenating ten to seventeen gestures. The validation set consists of 15 time series made from the gestures of 4 participants, while the test set includes 30 time series made from the gestures of 8 participants.


Figure 54: Gestures used in the Long-Term 3DC Dataset. From [46].

4.4 Depresjon

The datasets presented thus far are helpful for comparing and testing the different segmentation algorithms objectively because they contain ground truth labels. The Depresjon [48] dataset, however, does not contain changepoints, so it will not be used to compare the segmentation performance between the different algorithms. Instead, it is used to test a practical application of the unsupervised segmentation algorithm LS-USS. This dataset contains motor activity recordings from 23 unipolar and bipolar depressed patients in addition to 32 healthy controls. The Montgomery-Åsberg Depression Rating Scale (MADRS) is used for rating the severity of the ongoing depression, and all the depressed patients had a MADRS score above 10. The motor activity was monitored with an actigraph watch (Actiwatch, Cambridge Neurotechnology Ltd, England, model AW4) worn on the right wrist. The sampling rate used is 32 Hz, and a movement has to be above 0.05 g to be recorded. The data points in the dataset are the total activity counts recorded over one-minute intervals. Table 4 shows the first 10 minutes of data for one participant in the Depresjon dataset.

Table 4: The first 10 minutes of data for one participant in the Depresjon dataset

    | timestamp           | date       | activity count
0   | 2006-01-30 12:30:00 | 2006-01-30 | 7
1   | 2006-01-30 12:31:00 | 2006-01-30 | 79
2   | 2006-01-30 12:32:00 | 2006-01-30 | 4
3   | 2006-01-30 12:33:00 | 2006-01-30 | 3
4   | 2006-01-30 12:34:00 | 2006-01-30 | 4
5   | 2006-01-30 12:35:00 | 2006-01-30 | 4
6   | 2006-01-30 12:36:00 | 2006-01-30 | 94
7   | 2006-01-30 12:37:00 | 2006-01-30 | 4
8   | 2006-01-30 12:38:00 | 2006-01-30 | 10
9   | 2006-01-30 12:39:00 | 2006-01-30 | 3
10  | 2006-01-30 12:40:00 | 2006-01-30 | 9


Figure 55: Examples of control and condition participants from the Depresjon dataset.

4.5 UCR Time Series Semantic Segmentation Archive

The UCR Time Series Semantic Segmentation Archive [2] was used to showcase the FLUSS and FLOSS segmentation algorithms [12]. This archive contains many short datasets from different domains. Many of these datasets are not that difficult to segment on their own, but they are useful for testing how domain agnostic a segmentation algorithm is. To get a sense of how diverse the data is, see Table 5 and Figure 56. The archive includes human specified subsequence sizes, which makes it possible to test the performance of FLUSS-Welch against regular FLUSS utilizing the human specified subsequence sizes.


Table 5: Descriptions from a presentation on the article website [2]

Dataset name | # Channels | # CPs | Short description
Cane | 1 | 1 | Data from two different users of a smart cane concatenated together
DutchFactory | 1 | 1 | One year of electrical power demand in a Dutch facility
EEGRat | 2 | 1 | 5 seconds of a two-channel EEG recording at the left and right frontal cortex of adult male WAG/Rij rats. Signals were referenced to an electrode placed at the cerebellum
Fetal2013 | 1 | 2 | Concatenation of three AECG leads from separate fetal recordings (a64, a68, a57)
GrandMalSeizures | 1 | 1 | Tonic-clonic seizures of a subject recorded with a scalp right central (C4) electrode
GrandMalSeizures2 | 1 | 1 | Tonic-clonic seizures of a subject recorded with a scalp right central (C4) electrode
GreatBarbet | 2 | 2 | Three bird calls, all from the same species (Great Barbet); approximately one-minute snippets of their songs concatenated
InsectEPG 1, 2, 3, 4 | 1 | 1 | Four datasets containing EPG recordings of feeding states in the insect Asian citrus psyllid
NogunGun | 1 | 1 | The y-axis position of an actor's hand as she points her finger about 20 times, followed by pointing a gun about 30 times
PigInternalBleeding | 3 | 1 | Data collected from a cohort of 52 pigs subjected to induced slow bleeding. Each animal was sedated, instrumented, and bled with a pump at a rate of 20 mL/min. From each pig, two 30-second segments were randomly sampled: one from the period before and one approximately 2 minutes into the bleeding
Powerdemand | 1 | 1 | 320 days of electrical power demand in an Italian city, beginning in mid-August. At time point 4500, the remaining data was flipped left to right
PulsusParadoxus | 3 | 1 | Records the onset of Pulsus Paradoxus in a patient; contains the volunteer's SpO2 (one channel) and ECG (two channels)
RoboticDogActivityX | 1 | 1 | x-axis acceleration of the Sony RoboDog, first walking on cement and then "playing"
RoboticDogActivityY | 1 | 1 | y-axis acceleration of the Sony RoboDog, first walking on cement and then "playing"
RoboticDogActivityY2 | 1 | 1 | y-axis acceleration of the Sony RoboDog, first walking on cement before walking on carpet
SimpleSyntetic | 1 | 2 | Dataset created by these two lines of Matlab code: a = sin(0:0.05:400) + randn(size(sin(0:0.05:400)))/30; a(3000:5000) = abs(a(3000:5000));
SuddenCardiacDeath | 2 | 1 | Data at 50 Hz. The patient is a female, 82, undergoing heart failure. For the first 65 seconds the patient has a very irregular beat, then there is a sustained ventricular tachyarrhythmia. Contains data from two different ECG channels
SuddenCardiacDeath1 | 1 | 2 | The patient is a male, 75, undergoing cardiac surgery. For the first 125 seconds the patient has bundle branch block beats, then there is a burst of premature ventricular contractions lasting for about 28 seconds before the patient settles back to normal heartbeats
TiltABP | 1 | 1 | A subject was lying on a tilt table with foot support. At timestep 25,000 the tilt table was rapidly rotated to the stand-up position. This dataset records the volunteer's ABP
TiltECG | 1 | 1 | A subject was lying on a tilt table with foot support. At timestep 25,000 the tilt table was rapidly rotated to the stand-up position. This dataset records the volunteer's ECG
WalkJogRun | 2 | 2 | A person walking, jogging, and running with a rotation sensor on the left lower arm and on the left calf



Figure 56: Three sub-datasets from the UCR Time Series Semantic Segmentation Archive [3]. The first plot shows the Cane dataset. The second plot shows the PulsusParadoxus dataset. The last plot shows the NogunGun dataset, and the images above show how it was made.


5 Experiments

This chapter contains three sections with different types of experiments. The first section contains an in-depth comparison between the different segmentation algorithms used in the thesis (except FLUSS-Welch). The comparisons are performed on five datasets containing long multi-dimensional time-series data. The second section demonstrates a practical application of unsupervised segmentation by using LS-USS to find natural segments in the Depresjon dataset. Statistical features are extracted from the found segments and used for the classification of depressed vs. non-depressed participants, and the results are compared to the classification performance achieved when using fixed segment sizes. The third section is dedicated to FLUSS-Welch, the parameter-free segmentation algorithm obtained by estimating the subsequence size using Welch's method. These experiments are performed on the UCR datasets, where FLUSS-Welch's performance is compared to using the recommended subsequence lengths included with each of the datasets. Each of the sections is split into two subsections: experiments and discussion. The experiment sections present the results, while the discussion sections contain an analysis of the results.

5.1 Comparisons Done on Long Multidimensional Data

The comparisons in this section are made on five datasets: UCI, Dance Artificial, Dance, EMG artificial, and EMG. The datasets are split into test, validation, and training sets as specified in the Datasets chapter. The mean distance between the changepoints in the training set is calculated and used as the local window size for scaling the CAC when using the LREA and LTEA extractors. The training set is also utilized for training the autoencoders. As the autoencoders require both training and validation sets, 80% of the training set is used as training examples, while the remaining 20% is used to validate the autoencoder. The validation portion of the dataset is used to find the optimal hyperparameters of the segmentation algorithms. After the best performing model configurations on the validation set are found, the test set is used to make the final comparisons between the different segmentation algorithms.


Offline models and hyperparameter candidates:

LS-USS - REA | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: - | Autoencoders: Fully Connected, Convolutional
LS-USS - LREA | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: - | Autoencoders: Fully Connected, Convolutional
FLUSS - REA | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: - | Autoencoder: -
FLUSS - LREA | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: - | Autoencoder: -
TSSTAFL - LREA | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: 25, 50, 100, 150, 200, 250, 350, 500 | Autoencoder: Fully Connected

Table 6: Overview of the offline algorithms and the hyperparameter candidates included in the grid search. NW = subsequence size, TC = temporal constraint, step-size = distance in time between consecutive subsequences (decides how much the subsequences overlap). The exact NW and step-size ranges searched for each dataset are given in the captions of Tables 8-17.

Online models and hyperparameter candidates:

LS-USS - LTEA (Ɛ real-time) | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: - | Autoencoders: Fully Connected, Convolutional | Threshold: -0.5, -1.0, -1.5, -2.0, -2.5, -3.0
LS-USS online - LTEA | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: - | Autoencoders: Fully Connected, Convolutional | Threshold: -0.5, -1.0, -1.5, -2.0, -2.5, -3.0
FLUSS - LTEA (Ɛ real-time) | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: - | Autoencoder: - | Threshold: -0.5, -1.0, -1.5, -2.0, -2.5, -3.0
FLOSS - LTEA | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: - | Autoencoder: - | Threshold: -0.5, -1.0, -1.5, -2.0, -2.5, -3.0
TSSTAFL - LTEA | Data scalers: NoScaler, StandardScaler, RobustScaler, MinMaxScaler | NW: 50, 100, 150, 200, 300, 400, 500 | TC: 1000, 1500, 2000 (EMG) / 800, 1200, 1600 (other datasets) | Step-size: 25, 50, 100, 150, 200, 250, 350, 500 | Autoencoder: Fully Connected | Threshold: -0.5, -1.0, -1.5, -2.0, -2.5, -3.0

Table 7: Overview of the online algorithms and the hyperparameter candidates included in the grid search. NW = subsequence size, TC = temporal constraint, step-size = distance in time between consecutive subsequences (decides how much the subsequences overlap), threshold = the extraction threshold used by LTEA. The exact NW and step-size ranges searched for each dataset are given in the captions of Tables 8-17.


The segmentation algorithms are divided into two categories: online and offline. Table 6 and Table 7 present the different segmentation algorithms and their associated hyperparameters. The hyperparameter optimization is done using grid search. The step size is used in TSSTAFL and decides the amount of overlap between subsequences. If the step size is set to one, all subsequences in the all-subsequences set A are used to construct the distance curve, and if the step size is set to 25, every 25th subsequence is used. The threshold parameter is the threshold used in the extraction algorithm LTEA. TSSTAFL only uses the local extractors LREA and LTEA, because the original paper states that only local maxima should be classified as breakpoints. Even if they are not fully online, LS-USS and FLUSS are also included in the online category, as they can be made to work Ɛ real-time when using a temporal constraint. Also, notice that the data scalers are part of the hyperparameter search. Due to time constraints, only certain ranges of the hyperparameters are considered for the different datasets, based on prior knowledge about the scaling of the data. The offline algorithms are evaluated using the Regime Score (Equation 16), while the online algorithms are evaluated using the Prediction Loss MAE (Equation 15). The Regime Score works well for evaluating segmentation algorithms that have access to the number of changepoints to predict, such as the offline algorithms used in this thesis. When the number of changepoints is not known in advance, the segmentation algorithms run the risk of under- or over-predicting. The Prediction Loss (MAE) takes this into account by weighting the mean absolute error with the prediction ratio. Thus, the Prediction Loss is a fairer evaluation metric for the online algorithms, as the number of changepoints to predict is not given beforehand. Note that the two metrics are not comparable and cannot be used to make comparisons between the online and offline algorithms.

5.1.1 Experiments

The following subsections give an overview of the experiments done on each dataset. Each section starts with two tables presenting the best model configurations found on the validation set for the offline and online models. The last column of each table shows the mean Regime Score on the test set for the offline algorithms, while the mean Prediction Loss MAE is shown for the online ones. Then, two boxplots, one for the offline and one for the online models, show the spread in segmentation scores over the dataset, which is useful for looking into the consistency of each algorithm. Each section also includes a plot of one segmentation run using two of the algorithms developed in this thesis: the offline LS-USS LREA and the Ɛ real-time algorithm LS-USS LTEA. These plots are included to get a sense of the models' real-world segmentation performance and to give some context about what the segmentation scores represent.


5.1.1.1 UCI

Offline model | Datascaler | NW | TC | Step-size | Autoencoder | Regime Score (mean)
TSSTAFL - LREA | RobustScaler | 50 | - | 250 | Fully Connected | 0.01589
FLUSS - REA | RobustScaler | 50 | 800 | 1 | None | 0.02824
FLUSS - LREA | RobustScaler | 50 | 1200 | 1 | None | 0.00931
LS-USS - REA | MinMaxScaler | 50 | 800 | 1 | Fully Connected | 0.00687
LS-USS - LREA | MinMaxScaler | 50 | 800 | 1 | Fully Connected | 0.01041

Table 8: Model configurations found from the hyperparameter search on the UCI dataset. The last column shows the mean Regime Score from segmenting the test set. NW-range = [50-150], Step-size-range = [25-250].

Online model | Datascaler | NW | TC | Step-size | Autoencoder | Threshold | Prediction Loss MAE (mean)
TSSTAFL - LTEA | MinMaxScaler | 100 | - | 100 | Fully Connected | -2 | 51.99
FLOSS - LTEA | RobustScaler | 150 | 1600 | 1 | None | -1 | 87.36
FLUSS - LTEA | RobustScaler | 50 | 1600 | 1 | None | -1 | 47.27
LS-USS online - LTEA | RobustScaler | 50 | 1200 | 1 | Fully Connected | -0.5 | 114.82
LS-USS - LTEA | MinMaxScaler | 150 | 800 | 1 | Fully Connected | -1.5 | 45.54

Table 9: Model configurations found from the hyperparameter search on the UCI dataset. The last column shows the mean Prediction Loss MAE from segmenting the test set. NW-range = [50-150], Step-size-range = [25-250].


Figure 57: Boxplot showing the performance of the offline segmentation algorithms on the UCI dataset

Figure 58: Boxplot showing the performance of the online segmentation algorithms on the UCI dataset.

LS-USS REA is the best performing algorithm on the test set for the offline models, scoring higher than its local counterpart LS-USS LREA. Scaling the CAC with a local window equal to the mean segment length of the training set might not be optimal here, as the segment lengths in this dataset vary quite a lot, making it difficult to detect the smaller segments when using the local extractor LREA. Most of the algorithms perform similarly on the online dataset, with mean scores around 50, except for LS-USS online LTEA. RobustScaler and MinMaxScaler appear to be the best scaling methods for the UCI dataset. Also, notice that all the LS-USS algorithms utilize the Fully Connected model on this dataset. The example plots below show that LS-USS LREA is nearly spot on the true changepoints. One can also see that although it does not have any information about the number of changepoints, LS-USS LTEA performs a nearly optimal segmentation as well.

Figure 59: Top: Channel 0 of subject 4 in the UCI dataset. Middle: Segmentation using LS-USS - LREA. Bottom: Segmentation using LS-USS - LTEA.

5.1.1.2 Dance Artificial

Offline model | Datascaler | NW | TC | Step-size | Autoencoder | Regime Score (mean)
TSSTAFL - LREA | MinMaxScaler | 100 | - | 1 | PaperModel2 | 0.03807
FLUSS - REA | RobustScaler | 50 | 800 | 1 | None | 0.02608
FLUSS - LREA | StandardScaler | 50 | 800 | 1 | None | 0.01735
LS-USS - REA | MinMaxScaler | 100 | 800 | 1 | ConvModel | 0.0257
LS-USS - LREA | NoScaler | 50 | 800 | 1 | ConvModel | 0.01803

Table 10: Model configurations found from the hyperparameter search on the Dance Artificial dataset. The last column shows the mean Regime Score from segmenting the test set. NW-range = [50-400], Step-size-range = [25-250].


Online model | Datascaler | NW | TC | Step-size | Autoencoder | Threshold | Prediction Loss MAE (mean)
TSSTAFL - LTEA | MinMaxScaler | 50 | - | 1 | PaperModel2 | -0.5 | 109.04
FLOSS - LTEA | NoScaler | 50 | 800 | 1 | None | -0.5 | 55.06
FLUSS - LTEA | MinMaxScaler | 100 | 800 | 1 | None | -0.5 | 48.01
LS-USS online - LTEA | MinMaxScaler | 50 | 800 | 1 | ConvModel | -0.5 | 64.08
LS-USS - LTEA | MinMaxScaler | 50 | 800 | 1 | ConvModel | -0.5 | 27.15

Table 11: Model configurations found from the hyperparameter search on the Dance Artificial dataset. The last column shows the mean Prediction Loss MAE from segmenting the test set. NW-range = [50-400], Step-size-range = [25-250].

Figure 60: Boxplot showing the performance of the offline segmentation algorithms on the Dance Artificial dataset

Figure 61: Boxplot showing the performance of the online segmentation algorithms on the Dance Artificial dataset

The best performing offline algorithm on the test set is FLUSS LREA, closely followed by LS-USS LREA. On this dataset, there seems to be a benefit in scaling the CAC with the local statistics, as both LS-USS and FLUSS score higher when using the local region extractor LREA. For the online models, the Ɛ real-time algorithm LS-USS LTEA has the best score. The rest of the online algorithms have similar scores, except TSSTAFL, which is the worst-performing model on the test set. Also, notice that all the LS-USS-based algorithms ended up using the convolutional model on this dataset. The figure below shows that LS-USS yields similar segmentation results for both LREA (offline) and LTEA (online).

Figure 62: Top: Channel 0 of d41 in the Dance Artificial dataset. Middle: Segmentation using LS-USS - LREA. Bottom: Segmentation using LS-USS - LTEA.

5.1.1.3 Dance

Offline model | Datascaler | NW | TC | Step-size | Autoencoder | Regime Score (mean)
TSSTAFL - LREA | MinMaxScaler | 400 | - | 100 | Fully Connected | 0.03665
FLUSS - REA | RobustScaler | 400 | 800 | 1 | None | 0.03802
FLUSS - LREA | RobustScaler | 50 | 800 | 1 | None | 0.02969
LS-USS - REA | RobustScaler | 400 | 1200 | 1 | Fully Connected | 0.05459
LS-USS - LREA | MinMaxScaler | 400 | 1600 | 1 | Fully Connected | 0.029

Table 12: Model configurations found from the hyperparameter search on the Dance dataset. The last column shows the mean Regime Score from segmenting the test set. NW-range = [50-400], Step-size-range = [25-250].


Online model | Datascaler | NW | TC | Step-size | Autoencoder | Threshold | Prediction Loss MAE (mean)
TSSTAFL - LTEA | MinMaxScaler | 125 | - | 200 | Fully Connected | -1.5 | 289.2028
FLOSS - LTEA | RobustScaler | 125 | 1200 | 1 | None | -2 | 334.65
FLUSS - LTEA | RobustScaler | 200 | 1200 | 1 | None | -1.5 | 281.0156
LS-USS online - LTEA | MinMaxScaler | 200 | 1600 | 1 | Fully Connected | -1.5 | 204.9742
LS-USS - LTEA | RobustScaler | 200 | 1200 | 1 | Fully Connected | -2 | 415.32

Table 13: Model configurations found from the hyperparameter search on the Dance dataset. The last column shows the mean Prediction Loss MAE from segmenting the test set. NW-range = [50-400], Step-size-range = [25-250].

Figure 63: Boxplot showing the performance of the offline segmentation algorithms on the Dance dataset

Figure 64: Boxplot showing the performance of the online segmentation algorithms on the Dance dataset

The algorithm with the best mean Regime Score on the Dance test set is LS-USS LREA, closely followed by FLUSS LREA. One can also see that the models utilizing the REA extractor have the worst performance. The fully connected model is the autoencoder used for both the online and offline models on this dataset. On the online dataset, every algorithm performs poorly, and the results seem somewhat random. This could be caused by too much variation between the individual dance performances, which might lead to the best performing configurations from the hyperparameter search not generalizing well to the dance performances in the test set. The results also seem inconsistent and less than optimal when inspecting the online models' segmentation visually. Figure 65 shows that the offline model LS-USS LREA yields a reasonably good segmentation, while the Ɛ real-time LS-USS LTEA algorithm underpredicts.

Figure 65: Top: Channel 0 of d24 in the Dance dataset. Middle: Segmentation using LS-USS - LREA. Bottom: Segmentation using LS-USS - LTEA.

5.1.1.4 EMG Artificial

Offline model | Datascaler | NW | TC | Step-size | Autoencoder | Regime Score (mean)
TSSTAFL - LREA | StandardScaler | 500 | - | 150 | Fully Connected | 0.02536
FLUSS - REA | StandardScaler | 50 | 2000 | 1 | None | 0.00797
FLUSS - LREA | StandardScaler | 50 | 1500 | 1 | None | 0.00718
LS-USS - REA | StandardScaler | 50 | 2000 | 1 | ConvModel | 0.00251
LS-USS - LREA | StandardScaler | 50 | 1500 | 1 | ConvModel | 0.00214

Table 14: Model configurations found from the hyperparameter search on the EMG Artificial dataset. The last column shows the mean Regime Score from segmenting the test set. NW-range = [50-500], Step-size-range = [25-500].


Online model | Datascaler | NW | TC | Step-size | Autoencoder | Threshold | Prediction Loss MAE (mean)
TSSTAFL - LTEA | StandardScaler | 300 | - | 150 | Fully Connected | -0.5 | 97.54
FLOSS - LTEA | StandardScaler | 100 | 2000 | 1 | None | -0.5 | 86.44
FLUSS - LTEA | StandardScaler | 50 | 1500 | 1 | None | -1 | 26.73
LS-USS online - LTEA | StandardScaler | 50 | 1500 | 1 | ConvModel | -1 | 16.1
LS-USS - LTEA | StandardScaler | 50 | 2000 | 1 | ConvModel | -0.5 | 6.64

Table 15: Model configurations found from the hyperparameter search on the EMG Artificial dataset. The last column shows the mean Prediction Loss MAE from segmenting the test set. NW-range = [50-500], Step-size-range = [25-500].

Figure 66: Boxplot showing the performance of the offline segmentation algorithms on the EMG artificial dataset


Figure 67: Boxplot showing the performance of the online segmentation algorithms on the EMG Artificial dataset.

The two best performing offline models on the test set are LS-USS LREA and LS-USS REA. Both FLUSS and LS-USS get better scores when using the local extractor LREA. The best performing online model is the Ɛ real-time algorithm LS-USS LTEA, followed by LS-USS online LTEA. The Ɛ real-time algorithms perform better than the fully online versions of both the FLUSS and LS-USS algorithms. Scaling the data to zero mean and unit variance seems beneficial for segmentation on this dataset. The convolutional model seems to be the best autoencoder for LS-USS on this dataset.

Figure 68: Top: Channel 0 of gesture19 in the EMG Artificial dataset. Middle: Segmentation using LS-USS - LREA. Bottom: Segmentation using LS-USS - LTEA.


5.1.1.5 EMG

Offline model | Datascaler | NW | TC | Step-size | Autoencoder | Regime Score (mean)
TSSTAFL - LREA | RobustScaler | 300 | - | 200 | Fully Connected | 0.01206
FLUSS - REA | StandardScaler | 50 | 1500 | 1 | None | 0.00768
FLUSS - LREA | StandardScaler | 50 | 1500 | 1 | None | 0.00593
LS-USS - REA | NoScaler | 100 | 1000 | 1 | ConvModel | 0.00746
LS-USS - LREA | RobustScaler | 50 | 2000 | 1 | ConvModel | 0.00452

Table 16: Model configurations found from the hyperparameter search on the EMG dataset. The last column shows the mean Regime Score from segmenting the test set. NW-range = [50-500], Step-size-range = [25-500].

Online model | Datascaler | NW | TC | Step-size | Autoencoder | Threshold | Prediction Loss MAE (mean)
TSSTAFL - LTEA | StandardScaler | 500 | - | 500 | Fully Connected | -1.5 | 388.8155
FLOSS - LTEA | StandardScaler | 200 | 1000 | 1 | None | -2 | 163.7127
FLUSS - LTEA | StandardScaler | 50 | 1500 | 1 | None | -2 | 251.408
LS-USS online - LTEA | RobustScaler | 200 | 1000 | 1 | Fully Connected | -2 | 164.6831
LS-USS - LTEA | StandardScaler | 300 | 1000 | 1 | Fully Connected | -2 | 121.9777

Table 17: Model configurations found from the hyperparameter search on the EMG dataset. The last column shows the mean Prediction Loss MAE from segmenting the test set. NW-range = [50-500], Step-size-range = [25-500].

Figure 69: Boxplot showing the performance of the offline segmentation algorithms on the EMG dataset


Figure 70: Boxplot showing the performance of the online segmentation algorithms on the EMG dataset.

LS-USS LREA is the best performing offline model on the test set. Both LS-USS and FLUSS get their best test scores when using the LREA extraction algorithm. On the online dataset, the Ɛ real-time algorithm LS-USS LTEA performs the best. Nearly all the models use StandardScaler or the similar RobustScaler. The offline LS-USS models use the convolutional model, while the fully connected model is utilized for the online LS-USS models.

Figure 71: Top: Channel 0 of participant3_evaluation3 in the EMG dataset. Middle: Segmentation using LS-USS - LREA. Bottom: Segmentation using LS-USS - LTEA.


5.1.2 Discussion

This section first goes through some general observations about the hyperparameter search before making a more in-depth comparison between the different segmentation models.

5.1.2.1 General Observations

Section 2.3.1 showed that the FLUSS and FLOSS algorithms are not very sensitive to the TC size. The same is observed here, as the model configurations often utilize the smaller TC choices, which might indicate that there is no benefit in including subsequences far away from the current time step when using the LS-USS or FLUSS/FLOSS models. Another observation is that the scaling used varies between the datasets, which is expected, since the best scaler is highly dependent on the original scaling and distribution of the time series data. An interesting result is that the autoencoder selected by the hyperparameter search for the LS-USS models (both offline and online) varies between the datasets; some use the less complex, fully connected model, while others use the convolutional model. As pointed out in previous research [36], using a smaller model with only two hidden layers might be beneficial for feature representation, as it could lead to more general ways of representing the data. The drawback of simpler autoencoder models is that one risks losing some of the information needed to differentiate the segments. Finding the optimal autoencoder is a complex problem, as it involves a combination of factors such as layer types, feature length, layer size, and the number of layers. TSSTAFL usually has the lowest or among the lowest scores in both the online and offline categories. As mentioned in section 2.4.2, TSSTAFL was made to detect breakpoints, which the authors describe as human-specified changepoints. Even though most of the datasets included here use changepoints that can be segmented reasonably well by a human expert, some of the segmentation tasks might be outside the original scope of the TSSTAFL algorithm.

5.1.2.2 Statistical Approach

A two-step procedure recommended in [49] is used to analyze the algorithms' performance. First, a Friedman test is used to rank the algorithms against each other. Then, Holm's post-hoc test is applied using the highest-ranked algorithm as the comparison basis. The null hypothesis of the post-hoc test is that the performance of the two compared models is the same. The null hypothesis is rejected when p < 0.05. Table 18 and Table 19 show the mean test score, the Friedman rank, and the adjusted p-value on each of the datasets for the offline and online algorithms. The highest-ranked algorithm on each dataset is used as the control model and is marked N/A in the adjusted p-value row. The EMG dataset contains six evaluation recordings per participant, where five recordings from each participant are used to construct the test set. The statistical analysis is done on the average scores over each recording in the test set. This is done under the assumption that sufficient time has passed between the recordings so that the evaluations are iid (independent and identically distributed).
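A sketch of this two-step procedure is shown below; using Wilcoxon signed-rank tests for the pairwise post-hoc comparisons is an assumption, while the Holm correction follows the procedure described above:

import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon
from statsmodels.stats.multitest import multipletests

def compare_to_control(scores, control=0):
    """scores: (n_recordings x n_algorithms) array of test scores.
    Friedman test over all algorithms, then pairwise tests against the
    control (best-ranked) algorithm with Holm's correction."""
    stat, p_friedman = friedmanchisquare(*scores.T)
    pvals = [wilcoxon(scores[:, control], scores[:, j]).pvalue
             for j in range(scores.shape[1]) if j != control]
    reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm")
    return p_friedman, p_adj, reject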


5.1.2.3 Offline Model Comparisons

Dataset | Metric | LS-USS - REA | LS-USS - LREA | FLUSS - REA | FLUSS - LREA | TSSTAFL - LREA
UCI | Mean | 0.00687 | 0.01041 | 0.02824 | 0.00931 | 0.01589
UCI | Friedman rank | 1.63 | 2.57 | 4.80 | 2.60 | 3.40
UCI | H0 (adjusted p-value) | N/A | 1 (0.42388) | 0 (0.00000) | 1 (0.37627) | 0 (0.00885)
Dance Artificial | Mean | 0.0257 | 0.01803 | 0.02608 | 0.01735 | 0.03807
Dance Artificial | Friedman rank | 3.27 | 1.92 | 3.28 | 1.88 | 4.55
Dance Artificial | H0 (adjusted p-value) | 0 (0.00001) | 1 (0.89934) | 0 (0.00002) | N/A | 0 (0.00000)
Dance | Mean | 0.05459 | 0.029 | 0.03802 | 0.02969 | 0.03665
Dance | Friedman rank | 4.59 | 2.18 | 3.59 | 1.59 | 3.05
Dance | H0 (adjusted p-value) | 0 (0.00000) | 1 (0.27808) | 0 (0.00068) | N/A | 0 (0.01339)
EMG Artificial | Mean | 0.00251 | 0.00214 | 0.00797 | 0.00718 | 0.02536
EMG Artificial | Friedman rank | 1.93 | 1.37 | 3.53 | 3.17 | 5.00
EMG Artificial | H0 (adjusted p-value) | 1 (0.16512) | N/A | 0 (0.00000) | 0 (0.00002) | 0 (0.00000)
EMG | Mean | 0.00746 | 0.00452 | 0.00768 | 0.00593 | 0.01206
EMG | Friedman rank | 3.30 | 1.00 | 3.60 | 2.10 | 5.00
EMG | H0 (adjusted p-value) | 0 (0.00001) | N/A | 0 (0.00000) | 0 (0.02781) | 0 (0.00000)

Table 18: The mean test score, the Friedman rank, and the adjusted p-value on each of the datasets for the offline algorithms. The highest-ranked algorithm on each dataset is used as the control model and marked N/A. H0 = 1 means that the null hypothesis is accepted; H0 = 0 means that the null hypothesis is rejected.

LS-USS REA is the best performing algorithm on the UCI test set, but it is not significantly better than LS-USS LREA and FLUSS LREA. On both the Dance and Dance Artificial datasets, the best-ranked algorithm is FLUSS LREA, which is significantly better than all algorithms except LS-USS LREA. On the EMG and EMG Artificial datasets, the best performing model is LS-USS LREA, which is significantly better than all algorithms except LS-USS REA on the artificial dataset. In conclusion, LS-USS LREA seems to be the most consistently performing offline segmentation algorithm across all the datasets. On every dataset except the UCI dataset, the offline segmentation algorithms FLUSS and LS-USS got a better mean Regime Score when using the LREA extractor compared to the REA extractor, which indicates that scaling the CAC based on local statistics is useful (and often necessary) when dealing with longer time series data.


5.1.2.4 Online Model Comparisons

Dataset | Metric | LS-USS - LTEA (Ɛ real-time) | LS-USS online - LTEA | FLUSS - LTEA (Ɛ real-time) | FLOSS - LTEA | TSSTAFL - LTEA
UCI | Mean | 45.54 | 114.82 | 47.27 | 87.36 | 51.99
UCI | Friedman rank | 2.50 | 4.20 | 2.56 | 3.12 | 2.60
UCI | H0 (adjusted p-value) | N/A | 0 (0.01294) | 1 | 1 (0.81797) | 1
Dance Artificial | Mean | 27.15 | 64.08 | 48.01 | 55.06 | 109.04
Dance Artificial | Friedman rank | 1.92 | 3.17 | 2.77 | 2.80 | 4.34
Dance Artificial | H0 (adjusted p-value) | N/A | 0 (0.00023) | 0 (0.01078) | 0 (0.01078) | 0 (0.00000)
Dance | Mean | 415.32 | 204.9742 | 281.0156 | 334.65 | 289.2028
Dance | Friedman rank | 3.88 | 2.18 | 2.71 | 3.12 | 3.12
Dance | H0 (adjusted p-value) | 0 (0.00663) | N/A | 1 (0.32897) | 1 (0.24799) | 1 (0.24799)
EMG Artificial | Mean | 6.64 | 16.1 | 26.73 | 86.44 | 97.54
EMG Artificial | Friedman rank | 1.82 | 2.28 | 2.80 | 3.88 | 4.22
EMG Artificial | H0 (adjusted p-value) | N/A | 1 (0.25300) | 0 (0.03202) | 0 (0.00000) | 0 (0.00000)
EMG | Mean | 121.9777 | 164.6831 | 251.408 | 163.7127 | 388.8155
EMG | Friedman rank | 1.50 | 2.50 | 3.70 | 2.55 | 4.75
EMG | H0 (adjusted p-value) | N/A | 1 (0.07146) | 0 (0.00003) | 1 (0.07146) | 0 (0.00000)

Table 19: The mean test score, the Friedman rank, and the adjusted p-value on each of the datasets for the online algorithms. The highest-ranked algorithm on each dataset is used as the control model and marked N/A. H0 = 1 means that the null hypothesis is accepted; H0 = 0 means that the null hypothesis is rejected.

The highest-ranked algorithm on the UCI dataset is LS-USS LTEA, but the difference is only significant against LS-USS online. On the Dance Artificial dataset, LS-USS LTEA is statistically better than all the other models. For the Dance dataset, the highest-ranked algorithm is LS-USS online, but the results on this dataset were poor in general and might be somewhat random. LS-USS LTEA is also the highest-ranked algorithm on both EMG datasets and is significantly better than every algorithm except LS-USS online on the EMG Artificial dataset. The Ɛ real-time LS-USS LTEA model thus seems to be the most consistently performing online segmentation algorithm across all the datasets. Both Ɛ real-time algorithms (LS-USS LTEA and FLUSS LTEA) achieve better scores than their completely online counterparts (LS-USS online and FLOSS) on four out of five datasets, which is expected, as the Ɛ real-time models use arcs pointing both forward and backward in time, while the online models only use forward-pointing arcs.

Which algorithm to use can therefore be seen as a trade-off between computation time and segmentation performance.

5.1.2.5 Limitations

The segmentation outcomes for each of the algorithms are heavily dependent on the hyperparameter search. It would therefore be beneficial to do a more in-depth hyperparameter search that samples the hyperparameters from predefined distributions instead of using a grid search. Some of the datasets used are also somewhat small for an extensive hyperparameter search, in the sense that the validation set does not give a good indication of real-world performance because of its limited size. It would also be useful to do a hyperparameter sensitivity analysis to find out how sensitive LS-USS is to the choice of the different hyperparameters. It is worth noting that LS-USS requires more hyperparameters to be specified than FLUSS due to the autoencoder. There might be an autoencoder implementation that works well for most time series data without needing extensive hyperparameter tuning, so more research into the effects of different autoencoder architectures might alleviate this drawback.

5.2 Improving Depression Detection Using Natural Segments

One of the papers written under the INTROMAT project is called “Motor Activity Based Classification of Depression in Unipolar and Bipolar Patients” [50]. This paper classified non-depressed and depressed participants using various classifiers and oversampling techniques on the Depresjon dataset (Section 4.4). The features used for the classifier are statistical descriptors extracted from fixed 24-hour segments. There is evidence that depression leads to reduced daytime motor activity and increased nighttime activity compared to healthy individuals [11]. A fixed time window of 24 hours does not capture the difference between night and daytime activity, which might mean important information gets lost. Choosing the right segment size for time series classification can be difficult and will often have to be based on domain-specific assumptions. This is where an unsupervised segmentation algorithm can come in handy. Instead of dividing the time series into fixed segments, the segmentation algorithm finds natural segments based on the data itself. The hypothesis is that this will lead to better features, which will improve the classifier's performance. To test this, a comparison is made between using features extracted from fixed-size segments and using features extracted from segments found by the unsupervised segmentation algorithm LS-USS.

5.2.1 Experiments

Since the goal is to compare the effects of different segmentation approaches, the comparison between classification models is not the focus. Thus, only the best-performing model from [50] is implemented. The statistical features calculated from the segments are the mean activity, the standard deviation, and the number of samples with no activity (activity level = 0).


The highest MCC score (0.44) and F1 score (0.68) in [50] were achieved using the SMOTE oversampling technique in conjunction with a Random Forest classifier. The evaluation is done using Leave-One-User-Out validation. For the oversampling, the SMOTE implementation in the Python library imbalanced-learn [51] is used, and for the Random Forest classifier, the implementation in scikit-learn [52] with default parameters is utilized.
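A minimal sketch of this pipeline, assuming per-segment feature vectors and participant IDs as grouping labels; the function and variable names are illustrative, not the original code:

```python
# Sketch: per-segment statistical features -> Leave-One-User-Out CV with SMOTE
# oversampling on the training folds -> Random Forest with default parameters.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, matthews_corrcoef
from sklearn.model_selection import LeaveOneGroupOut

def segment_features(segment):
    """Descriptors used in [50]: mean activity, standard deviation, number of zero-activity samples."""
    segment = np.asarray(segment, dtype=float)
    return [segment.mean(), segment.std(), int(np.sum(segment == 0.0))]

def leave_one_user_out(X, y, groups):
    """X: (n_segments, 3) features, y: depressed/control labels, groups: participant id per segment (numpy arrays)."""
    y_true, y_pred = [], []
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        # Balance the classes on the training fold only, never on the held-out participant.
        X_res, y_res = SMOTE().fit_resample(X[train_idx], y[train_idx])
        clf = RandomForestClassifier().fit(X_res, y_res)
        y_true.extend(y[test_idx])
        y_pred.extend(clf.predict(X[test_idx]))
    return f1_score(y_true, y_pred), matthews_corrcoef(y_true, y_pred)
```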

Each of the following experiments is performed 30 times because of the high variation in performance between runs. The best MCC score for the model implemented based on [50] was 0.47, and the best F1 score was 0.65. These metrics are similar to those found in the original paper [50], indicating that the implementation is correct.

5.2.1.1 Experiment 1

This experiment uses LS-USS combined with the extraction algorithm LTEA instead of fixed 24-hour segments. To make the comparison fair, the rest of the classification pipeline remains the same. To avoid overfitting the data, best-guess parameters are used for LS-USS and the chosen extraction algorithm LTEA. The temporal constraint is set to 12 hours to ensure that arcs do not point to repeated activities on other days. Relatively large changes in activity are more useful for this task, since segmenting out every single activity would be meaningless. Because of this, the subsequence size must be set quite large and is therefore set to 4 hours. The local window used in LTEA to normalize the CAC is set to 6 hours (the result is not very sensitive to this parameter); these settings are also sketched in code after Figure 72. Figure 72 shows an example of the segmentation done by LS-USS on one of the participants. The segments make sense visually as well, as high-activity periods often appear to be separated from low-activity periods.

Figure 72: The top plot shows 13 days of activity data from a participant. The bottom plot shows the CAC output by LS-USS. Notice that in this plot the dashed green lines do not represent ground-truth changepoints but a change of date.
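For concreteness, the best-guess settings above expressed in samples, assuming the one-minute sampling interval of the Depresjon actigraphy data:

```python
# Best-guess LS-USS/LTEA settings from Experiment 1, converted to samples
# (assumes one activity count per minute, as in the Depresjon recordings).
SAMPLES_PER_HOUR = 60

SUBSEQUENCE_LEN     = 4 * SAMPLES_PER_HOUR    # 4-hour subsequences   -> 240 samples
TEMPORAL_CONSTRAINT = 12 * SAMPLES_PER_HOUR   # arcs limited to 12 h  -> 720 samples
LTEA_LOCAL_WINDOW   = 6 * SAMPLES_PER_HOUR    # 6-hour CAC window     -> 360 samples
```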


As shown in Table 20, LS-USS outperforms the fixed segment size of 24 hours on all the performance metrics used. From the boxplots in Figure 73, one can see that the mean F1 and MCC scores obtained with LS-USS segmentation are as high as the maximum scores obtained with the fixed 24-hour segments.

Segmentation | Specificity | Recall | Precision | Accuracy | F1 | MCC
LS-USS | 0.886 | 0.546 | 0.778 | 0.744 | 0.641 | 0.469
24 h fixed | 0.771 | 0.507 | 0.618 | 0.662 | 0.556 | 0.292

Table 20: Classification results. The metrics shown are the mean scores over 30 runs.

Figure 73: Boxplots of the F1 and MCC scores for 30 runs.


5.2.1.2 Experiment 2

As the ideal segment length for this classification problem might not be 24 hours, this experiment compares LS-USS against other fixed segment lengths. Every whole-hour fixed segment length between 1 and 24 hours is tested over 30 runs each. The results are shown in Figure 74.

Figure 74: Performance of different fixed segment lengths (blue) vs. the performance of LS-USS (orange). The MCC scores are means over 30 runs. Light blue shows the interquartile range (IQR) for the fixed segments, while light orange shows the IQR for the LS-USS segmentation.

5.2.2 Discussion

To test whether the improvement seen when using LS-USS is statistically significant, the Wilcoxon signed-rank test is utilized [49]. The test is done on the mean correct-classification percentage for each participant over the 30 runs, comparing LS-USS with the fixed 24-hour segments. The null hypothesis is that the performance of the two approaches is the same; it is rejected when p < 0.05. Here, LS-USS was found to be significantly better than using fixed 24-hour segments. The plot in Figure 74 clearly shows that 24-hour segments are not optimal; the best fixed segment length appears to be around 8 hours. Using LS-USS yields approximately the same performance as the optimal fixed segment length, which indicates that LS-USS finds a near-optimal segmentation. It is also worth pointing out that trying all 24 different fixed segment lengths increases the chance of overfitting to the data, so real-world performance with the fixed segments might be worse.
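A minimal sketch of this test, assuming one mean correct-classification percentage per participant for each pipeline (averaged over the 30 runs):

```python
# Paired Wilcoxon signed-rank test on per-participant mean accuracies:
# one value per participant for LS-USS and one for the fixed 24-hour segments.
from scipy.stats import wilcoxon

def significantly_different(acc_lsuss, acc_fixed24, alpha=0.05):
    stat, p = wilcoxon(acc_lsuss, acc_fixed24)
    return p < alpha, p   # True -> reject the null hypothesis that performance is the same
```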


5.2.2.1 Limitations

The parameters selected for the LS-USS algorithm in these experiments did require some subjective assumptions. It would be interesting to see how changing the different hyperparameters, such as the window size, would affect the segmentation. In this case, the segmentation provided by LS-USS seemed to separate primarily based on daytime and nighttime activity. This experiment was used to showcase the utility of unsupervised segmentation in the context of MHMS. As the main benefit of LS-USS is the way it handles multidimensional data, it would be preferable if the dataset included data from more sensors with a higher sampling rate. Such data proved difficult to obtain due to the sensitive nature of personal data from people suffering from mental health issues.

5.3 Automatic subsequence size using FLUSS-Welch

As mentioned in Section 4.5, the UCR dataset includes a recommended subsequence length for each dataset, manually specified by a human expert. This experiment compares using the recommended subsequence length against a subsequence size found automatically with Welch's method for spectral density estimation.

5.3.1 Experiment

The implementation of the FLUSS algorithm used in this experiment does not use any temporal constraint and utilizes the standard REA algorithm for extracting change points. The results are presented in Table 21.
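A minimal sketch of the automatic estimate, assuming the subsequence length is simply set to the dominant period found with Welch's method (the exact mapping used in FLUSS-Welch may scale or bound this estimate differently):

```python
# Estimate a subsequence length from the dominant spectral peak of the time series.
import numpy as np
from scipy.signal import welch

def welch_subsequence_size(x, fs=1.0, nperseg=1024):
    """Return a subsequence length in samples, derived from the dominant period of x."""
    freqs, psd = welch(x, fs=fs, nperseg=min(nperseg, len(x)))
    freqs, psd = freqs[1:], psd[1:]           # drop the DC component
    dominant_freq = freqs[np.argmax(psd)]     # frequency carrying the most power
    return int(round(fs / dominant_freq))     # dominant period expressed in samples
```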


Dataset name | Regime Score (recommended) | Matches (recommended) | Regime Score (automatic) | Matches (automatic)
Cane | 0.0155 | 1/1 | 0.0137 | 1/1
DutchFactory | 0.0134 | 1/1 | 0.0037 | 1/1
EEGRat | 0.1240 | 1/1 | 0.0900 | 1/1
Fetal2013 | 0.0018 | 2/2 | 0.1627 | 2/2
GrandMalSeizures | 0.2395 | 1/1 | 0.2102 | 1/1
GrandMalSeizures2 | 0.1302 | 1/1 | 0.1129 | 1/1
GreatBarbet | 0.0089 | 2/2 | 0.0048 | 2/2
InsectEPG1 | 0.0022 | 1/1 | 0.0098 | 1/1
InsectEPG2 | 0.0189 | 1/1 | 0.0200 | 1/1
InsectEPG3 | 0.1026 | 1/1 | 0.0300 | 1/1
InsectEPG4 | 0.0016 | 1/1 | 0.0309 | 1/1
NogunGun | 0.0028 | 1/1 | 0.0153 | 1/1
PigInternalBleeding | 0.0037 | 1/1 | 0.0619 | 1/1
Powerdemand | 0.0073 | 1/1 | 0.0027 | 1/1
PulsusParadoxus | 0.0035 | 1/1 | 0.0009 | 1/1
RoboticDogActivityX | 0.0042 | 1/1 | 0.0007 | 1/1
RoboticDogActivityY | 0.0035 | 1/1 | 0.0006 | 1/1
RoboticDogActivityY2 | 0.0082 | 1/1 | 0.0029 | 1/1
SimpleSynthetic | 0.0070 | 2/2 | 0.1250 | 1/2
SuddenCardiacDeath | 0.0067 | 1/1 | 0.0215 | 1/1
SuddenCardiacDeath1 | 0.0084 | 2/2 | 0.0083 | 2/2
TiltABP | 0.0034 | 1/1 | 0.0022 | 1/1
TiltECG | 0.0101 | 1/1 | 0.0129 | 1/1
WalkJogRun | 0.0306 | 2/2 | 0.1530 | 1/2

Table 21: Scores on the UCR dataset using FLUSS with the recommended subsequence size vs. the automatic subsequence size. For each sub-dataset, the configuration with the lower regime score performs best.


Figure 75: Segmentation done on the Smart Cane dataset.

5.3.2 Discussion

Using the automatic subsequence size produces the best performance on 14 of the 24 datasets. The regime scores achieved by the two configurations are not significantly different, with an adjusted p-value of 0.86 when using the Wilcoxon signed-rank test (p < 0.05). The results indicate that the automatic subsequence size yields results comparable to the recommended subsequence size specified by a human expert. The main benefit of the FLUSS-Welch algorithm over regular FLUSS is that it is entirely parameter free.

5.3.2.1 Limitations

On some of the datasets containing more than one changepoint, FLUSS-Welch performs considerably worse. Because of this, a short inspection of some of the datasets where the algorithm seemed to fail was also done. Figure 76 and Figure 77 show that although the CACs look very similar for the automatic and recommended subsequence sizes, both predicted changepoints end up in the same “valley” when using the automatic window size on these datasets. These are not major failures and could be remedied by a larger exclusion zone. I also tried applying FLUSS-Welch to the Dance and UCI datasets, but the resulting periodograms did not yield any useful information. This is most likely because those datasets do not have a clear periodicity, or because the periodicity changes throughout the time series. Implementing a version of FLUSS-Welch that dynamically changes the subsequence length by estimating the periodogram over a local window could help resolve this, with the added benefit of making FLUSS more robust to data where the periodicity of the time series changes drastically over time.
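A minimal sketch of the dynamic variant suggested above, assuming the dominant period is re-estimated over a sliding local window (the window and hop sizes are illustrative choices):

```python
# Re-estimate the dominant period over a sliding local window so the subsequence
# length can follow changes in the periodicity of the time series.
import numpy as np
from scipy.signal import welch

def local_subsequence_sizes(x, window=4096, hop=1024, fs=1.0):
    sizes = []
    for start in range(0, len(x) - window + 1, hop):
        freqs, psd = welch(x[start:start + window], fs=fs, nperseg=min(1024, window))
        freqs, psd = freqs[1:], psd[1:]                        # ignore the DC component
        sizes.append(int(round(fs / freqs[np.argmax(psd)])))   # local dominant period in samples
    return sizes
```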


Welch’s method for automatically finding a good subsequence size could also be beneficial for LS-USS. Such an analysis is not provided in this thesis but could be an interesting direction for future research.

Figure 76: Segmentation done on the Fetal2013 dataset. The top plot shows channel 0. The middle plot shows the segmentation result when using standard FLUSS. The bottom plot shows the segmentation result when using FLUSS-Welch. The CACs are similar, but with FLUSS-Welch both changepoints end up in the same “valley”.

Figure 77: Segmentation done on the WalkJogRun dataset. The top plot shows channel 0. The middle plot shows the segmentation result when using standard FLUSS. The bottom plot shows the segmentation result when using FLUSS-Welch. The CACs are similar, but with FLUSS-Welch both changepoints end up in the same “valley”.


6 Conclusion

This thesis set out to research and develop unsupervised segmentation algorithms, with the main focus on biosignal-based time series data. One contribution is the parameter-free segmentation algorithm FLUSS-Welch, obtained by estimating the window size with Welch’s method. The algorithm delivered promising results on a broad selection of datasets from the UCR Time Series Semantic Segmentation Archive [2].

Another important contribution is the segmentation algorithms LS-USS and LS-USS online. Through extensive testing on both artificial and real-world datasets from various domains and sensors, it was found that LS-USS generally delivers segmentation scores on par with or better than other state-of-the-art algorithms such as FLUSS/FLOSS and TSSTAFL. The LS-USS algorithms have many desirable properties. They can be implemented both online and in an “anytime” fashion. They are domain agnostic and do not need extensive tuning of hyperparameters. They do not make any statistical assumptions about the input data. The LSMP component used in LS-USS also shows that it is possible to calculate a temporally constrained matrix profile from feature vectors by exploiting highly parallelized hardware. The same methods used for constructing the LSMP can be applied to any representation learning algorithm that produces a feature vector, which presents excellent potential for further research. The extraction algorithms LREA and LTEA developed in this thesis also show potential. LREA usually outperformed the more global REA algorithm, which shows that scaling the CAC based on local statistics is highly useful, especially for long time series data. The online extraction algorithm LTEA uses the same CAC scaling as LREA, but instead of extracting the n lowest “valley” points, it utilizes a threshold. In contrast to REA and LREA, LTEA does not need any information about the number of change points to extract, making it applicable to more real-world segmentation problems. To demonstrate a real-world application, the Depresjon dataset was used to investigate whether natural segments found by unsupervised segmentation would increase the robustness of mood state detection (depression and control). Using the LS-USS CPD algorithm in conjunction with the LTEA extraction algorithm improved the previous best results significantly and produced near-optimal segments for classifying depressed and non-depressed participants.

6.1 Future work

As discussed in Section 2.1, one of the most critical areas of future research in CPD is the development of fast online or anytime algorithms. The LS-USS algorithms proposed in this thesis are fast enough for many time series segmentation applications, but for very time-sensitive tasks there would be a clear benefit in trading off computation time against detection performance. This can be implemented by decreasing the resolution of the latent space matrix profile (LSMP), calculating the similarity only between every nth subsequence. For example, by only calculating the LSMP for every 10th subsequence in the latent all-subsequence set F, the segmentation time can be reduced approximately tenfold.

The same concept could be used to make an anytime algorithm. As each similarity score in the LSMP can be calculated independently and in parallel, computing them in temporal order is not necessary. Instead of calculating the matrix profile from start to end, an algorithm could start by calculating the matrix profile at a low resolution and then gradually increase the resolution. If the process is interrupted, the best-resolution LSMP computed so far is used.
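A minimal sketch of the strided idea, assuming the latent all-subsequence set F is available as a matrix of latent vectors; the brute-force distance matrix is for illustration only and ignores the temporal constraint and parallel hardware used for the actual LSMP:

```python
# Compute a reduced-resolution latent space matrix profile by only considering
# every `stride`-th latent vector, cutting the pairwise work by roughly stride^2.
import numpy as np

def strided_lsmp(F, stride=10, exclusion=5):
    """F: (n, d) latent all-subsequence set. Returns (kept indices, matrix profile, profile index)."""
    idx = np.arange(0, len(F), stride)
    sub = F[idx]                                                     # every stride-th latent vector
    d = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=-1)   # pairwise Euclidean distances
    for i in range(len(idx)):                                        # mask trivial matches near the diagonal
        d[i, max(0, i - exclusion):i + exclusion + 1] = np.inf
    return idx, d.min(axis=1), idx[d.argmin(axis=1)]
```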

Figure 78: A proposed modular approach to unsupervised time series segmentation.

One of the main contributions of this thesis is to show the potential of representation learning in conjunction with constructing the matrix profile and CAC. To showcase future research possibilities, a modular approach to time series segmentation is presented in Figure 78, which makes it possible to isolate the different areas of research. The CAC and matrix profile formats must stay fixed, but the method for calculating the matrix profile, either approximately or exactly, can be investigated further. It would also be interesting to look at the effects of using similarity metrics other than Euclidean distance, for example cosine similarity, when constructing the matrix profile. As the CAC has the same format regardless of the other modules, it is possible to design general changepoint extraction algorithms that work for all representation learning techniques. The representation learning module has great potential for future research. In this thesis, only two autoencoder architectures were implemented, but there are plenty of other model architectures to try. Two-dimensional convolutional models might, for example, better capture the temporal and inter-channel relations. Section 2.4.1 also mentioned other representation learning methods such as PCA, as well as NLP techniques such as word2vec. Another common autoencoder architecture used on time series data is the LSTM autoencoder. More recently, transformer-based models have also been used for representation learning on multivariate time series data [53].
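As an illustration of the metric swap mentioned above (not something evaluated in the thesis), a minimal sketch of a cosine-distance alternative to the Euclidean pairwise step:

```python
# Cosine distance between latent vectors as a drop-in alternative to Euclidean
# distance when constructing the (latent space) matrix profile.
import numpy as np

def pairwise_cosine_distance(F):
    """F: (n, d) latent vectors. Returns an (n, n) matrix of cosine distances."""
    unit = F / np.linalg.norm(F, axis=1, keepdims=True)   # L2-normalize each latent vector
    return 1.0 - unit @ unit.T                            # cosine distance = 1 - cosine similarity
```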

The experiments conducted in this thesis have mainly used multidimensional data where the channels come from similar sensors with similar sampling rates and contain numerical data only. It would be interesting to apply the model to time series data containing channels from a broader range of sources, with both numerical and categorical variables. This could have great potential for applications in mental health monitoring systems (MHMS), since much of the data is gathered from varying sources, both wearables and external sensors. The authors of the survey paper “Mental health monitoring with multimodal sensing and machine learning: A survey” [9] suggest that a robust MHMS should use a combination of sensing types. Figure 79 shows the combinations of sensors used in the papers reviewed in the survey.


Figure 79: Sensor co-occurrences. app: application usage, social: social media, EEG: electroencephalogram, screen: smartphone screen state, contacts: phone contact information, calls: phone call logs, location: GPS and cell towers, spo2: blood oxygen saturation level.

Doing an in-depth analysis of how the number of channels affects the performance of LS-USS would also be beneficial. As the algorithms used could be implemented more efficiently, the thesis does not include any extensive comparison of processing times for different time series data. This is also something to look into in future research.


7 References

[1] A. Radulescu, Y. S. Shin, and Y. Niv, “Human Representation Learning,” Annual Review of Neuroscience, vol. 44, no. 1, 2021, doi: 10.1146/annurev-neuro-092920-120559.
[2] “Eamonn Keogh.” https://www.cs.ucr.edu/~eamonn/FLOSS/ (accessed Sep. 24, 2020).
[3] P. Yang, G. Dumont, and J. M. Ansermino, “Adaptive change detection in heart rate trend monitoring in anesthetized children,” IEEE Trans Biomed Eng, vol. 53, no. 11, pp. 2211–2219, Nov. 2006, doi: 10.1109/TBME.2006.877107.
[4] S. Aminikhanghahi and D. J. Cook, “A Survey of Methods for Time Series Change Point Detection,” Knowl Inf Syst, vol. 51, no. 2, pp. 339–367, May 2017, doi: 10.1007/s10115-016-0987-z.
[5] “About INTROMAT – Intromat.” https://intromat.no/about/ (accessed Jan. 27, 2020).
[6] H. A. Whiteford, A. J. Ferrari, L. Degenhardt, V. Feigin, and T. Vos, “The Global Burden of Mental, Neurological and Substance Use Disorders: An Analysis from the Global Burden of Disease Study 2010,” PLoS One, vol. 10, no. 2, Feb. 2015, doi: 10.1371/journal.pone.0116820.
[7] G. V. Polanczyk, G. A. Salum, L. S. Sugaya, A. Caye, and L. A. Rohde, “Annual Research Review: A meta-analysis of the worldwide prevalence of mental disorders in children and adolescents,” Journal of Child Psychology and Psychiatry, vol. 56, no. 3, pp. 345–365, 2015, doi: 10.1111/jcpp.12381.
[8] H. A. Whiteford et al., “Global burden of disease attributable to mental and substance use disorders: findings from the Global Burden of Disease Study 2010,” The Lancet, vol. 382, no. 9904, pp. 1575–1586, Nov. 2013, doi: 10.1016/S0140-6736(13)61611-6.
[9] E. Garcia-Ceja, M. Riegler, T. Nordgreen, P. Jakobsen, K. J. Oedegaard, and J. Tørresen, “Mental health monitoring with multimodal sensing and machine learning: A survey,” Pervasive and Mobile Computing, vol. 51, pp. 1–26, Dec. 2018, doi: 10.1016/j.pmcj.2018.09.003.
[10] “Topic: Fitness & activity tracker,” Statista. https://www.statista.com/topics/4393/fitness-and-activity-tracker/ (accessed Mar. 16, 2020).
[11] C. Burton, B. McKinstry, A. Szentagotai Tătar, A. Serrano-Blanco, C. Pagliari, and M. Wolters, “Activity monitoring in patients with depression: a systematic review,” J Affect Disord, vol. 145, no. 1, pp. 21–28, Feb. 2013, doi: 10.1016/j.jad.2012.07.001.
[12] S. Gharghabi et al., “Domain agnostic online semantic segmentation for multi-dimensional time series,” Data Min Knowl Disc, vol. 33, no. 1, pp. 96–130, Jan. 2019, doi: 10.1007/s10618-018-0589-3.

[13] E. Garcia-Ceja et al., “Depresjon: a motor activity database of depression episodes in unipolar and bipolar patients,” in Proceedings of the 9th ACM Multimedia Systems Conference - MMSys ’18, Amsterdam, Netherlands, 2018, pp. 472–477, doi: 10.1145/3204949.3208125.
[14] Y. Zhu, C.-C. M. Yeh, Z. Zimmerman, K. Kamgar, and E. Keogh, “Matrix Profile XI: SCRIMP++: Time Series Motif Discovery at Interactive Speeds,” in 2018 IEEE International Conference on Data Mining (ICDM), Singapore, Nov. 2018, pp. 837–846, doi: 10.1109/ICDM.2018.00099.
[15] M. Viberg, “Subspace Methods in System Identification,” IFAC Proceedings Volumes, vol. 27, no. 8, pp. 1–12, Jul. 1994, doi: 10.1016/S1474-6670(17)47689-0.
[16] S. Gharghabi et al., “Domain agnostic online semantic segmentation for multi-dimensional time series,” Data Mining and Knowledge Discovery, vol. 33, pp. 1–35, Sep. 2018, doi: 10.1007/s10618-018-0589-3.
[17] E. Keogh and J. Lin, “Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research,” p. 20.
[18] E. Keogh and J. Lin, “Clustering of time-series subsequences is meaningless: implications for previous and future research,” Knowl Inf Syst, vol. 8, no. 2, pp. 154–177, Aug. 2005, doi: 10.1007/s10115-004-0172-7.
[19] T. Rakthanmanon, E. J. Keogh, S. Lonardi, and S. Evans, “Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data,” in 2011 IEEE 11th International Conference on Data Mining, Dec. 2011, pp. 547–556, doi: 10.1109/ICDM.2011.146.
[20] C.-C. M. Yeh et al., “Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View That Includes Motifs, Discords and Shapelets,” Dec. 2016, pp. 1317–1322, doi: 10.1109/ICDM.2016.0179.
[21] T. Rakthanmanon et al., Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping, vol. 2012, 2012, doi: 10.1145/2339530.2339576.
[22] D. Anguita, A. Ghio, L. Oneto, X. Parra, and J. L. Reyes-Ortiz, “A Public Domain Dataset for Human Activity Recognition Using Smartphones,” Computational Intelligence, p. 6, 2013.
[23] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017, doi: 10.1145/3065386.
[24] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” arXiv:1505.04597 [cs], May 2015, accessed Nov. 18, 2020. [Online]. Available: http://arxiv.org/abs/1505.04597


[25] M. Perslev, M. Jensen, S. Darkner, P. J. Jennum, and C. Igel, “U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging,” Advances in Neural Information Processing Systems, vol. 32, pp. 4415–4426, 2019.
[26] F. Isensee, P. F. Jaeger, P. M. Full, I. Wolf, S. Engelhardt, and K. H. Maier-Hein, “Automatic Cardiac Disease Assessment on cine-MRI via Time-Series Segmentation and Domain Specific Features,” in Statistical Atlases and Computational Models of the Heart. ACDC and MMWHS Challenges, Cham, 2018, pp. 120–129, doi: 10.1007/978-3-319-75541-0_13.
[27] M. Sakurada and T. Yairi, “Anomaly Detection Using Autoencoders with Nonlinear Dimensionality Reduction,” in Proceedings of the MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis, Gold Coast, Australia QLD, Australia, Dec. 2014, pp. 4–11, doi: 10.1145/2689746.2689747.
[28] C. Zhou and R. C. Paffenroth, “Anomaly Detection with Robust Deep Autoencoders,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, Aug. 2017, pp. 665–674, doi: 10.1145/3097983.3098052.
[29] C. Zhang et al., “A Deep Neural Network for Unsupervised Anomaly Detection and Diagnosis in Multivariate Time Series Data,” AAAI, vol. 33, no. 01, Art. no. 01, Jul. 2019, doi: 10.1609/aaai.v33i01.33011409.
[30] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, “Distributed Representations of Words and Phrases and their Compositionality,” p. 9.
[31] H. Kim et al., “Representation learning for unsupervised heterogeneous multivariate time series segmentation and its application,” Computers & Industrial Engineering, vol. 130, pp. 272–281, Apr. 2019, doi: 10.1016/j.cie.2019.02.029.
[32] A. Qahtan, S. Wang, B. Alharbi, and X. Zhang, “A PCA-Based Change Detection Framework for Multidimensional Data Streams,” p. 10.
[33] “UCI Machine Learning Repository: EEG Eye State Data Set.” https://archive.ics.uci.edu/ml/datasets/EEG+Eye+State# (accessed Nov. 20, 2020).
[34] “UCI Machine Learning Repository: Human Activity Recognition Using Smartphones Data Set.” https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones (accessed Sep. 25, 2020).
[35] A. Mesaros et al., “Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 2, pp. 379–393, Feb. 2018, doi: 10.1109/TASLP.2017.2778423.


[36] W.-H. Lee, J. Ortiz, B. Ko, and R. Lee, “Time Series Segmentation through Automatic Feature Learning,” arXiv:1801.05394 [cs, stat], Jan. 2018, accessed Jun. 14, 2020. [Online]. Available: http://arxiv.org/abs/1801.05394
[37] W.-H. Lee, J. Ortiz, B. Ko, and R. Lee, “Time Series Segmentation through Automatic Feature Learning,” arXiv:1801.05394 [cs, stat], Jan. 2018, accessed Jun. 14, 2020. [Online]. Available: http://arxiv.org/abs/1801.05394
[38] P. Welch, “The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms,” IEEE Transactions on Audio and Electroacoustics, vol. 15, no. 2, pp. 70–73, Jun. 1967, doi: 10.1109/TAU.1967.1161901.
[39] S. Law, “STUMPY: A Powerful and Scalable Python Library for Time Series Data Mining,” JOSS, vol. 4, no. 39, p. 1504, Jul. 2019, doi: 10.21105/joss.01504.
[40] D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” arXiv:1412.6980 [cs], Jan. 2017, accessed Dec. 09, 2020. [Online]. Available: http://arxiv.org/abs/1412.6980
[41] V. Dumoulin and F. Visin, “A guide to convolution arithmetic for deep learning,” arXiv:1603.07285 [cs, stat], Jan. 2018, accessed Dec. 09, 2020. [Online]. Available: http://arxiv.org/abs/1603.07285
[42] UlysseCoteAllard, UlysseCoteAllard/LongTermEMG. 2021, accessed May 16, 2021. [Online]. Available: https://github.com/UlysseCoteAllard/LongTermEMG
[43] “PyTorch.” https://www.pytorch.org (accessed Apr. 05, 2021).
[44] “SciPy — SciPy v1.5.3 Reference Guide.” https://docs.scipy.org/doc/scipy/reference/index.html (accessed Oct. 26, 2020).
[45] D. St-Onge, “Expressive motion with dancers.” IEEE, Apr. 22, 2018, accessed Sep. 28, 2020. [Online]. Available: https://ieee-dataport.org/documents/expressive-motion-dancers
[46] U. Côté-Allard et al., “Virtual Reality to Study the Gap Between Offline and Real-Time EMG-based Gesture Recognition,” arXiv:1912.09380 [cs, stat], Dec. 2019, accessed Aug. 31, 2020. [Online]. Available: http://arxiv.org/abs/1912.09380
[47] “A Low-Cost, Wireless, 3-D-Printed Custom Armband for sEMG Hand Gesture Recognition,” ResearchGate. https://www.researchgate.net/publication/333988468_A_Low-Cost_Wireless_3-D-Printed_Custom_Armband_for_sEMG_Hand_Gesture_Recognition (accessed Sep. 29, 2020).
[48] E. Garcia-Ceja et al., “Depresjon: a motor activity database of depression episodes in unipolar and bipolar patients,” in Proceedings of the 9th ACM Multimedia Systems Conference - MMSys ’18, Amsterdam, Netherlands, 2018, pp. 472–477, doi: 10.1145/3204949.3208125.


[49] J. Demšar, “Statistical Comparisons of Classifiers over Multiple Data Sets,” p. 30.
[50] E. Garcia-Ceja et al., “Motor Activity Based Classification of Depression in Unipolar and Bipolar Patients,” in 2018 IEEE 31st International Symposium on Computer-Based Medical Systems (CBMS), Karlstad, Jun. 2018, pp. 316–321, doi: 10.1109/CBMS.2018.00062.
[51] “Welcome to imbalanced-learn documentation! — imbalanced-learn 0.7.0 documentation.” https://imbalanced-learn.org/stable/ (accessed Nov. 16, 2020).
[52] “scikit-learn: machine learning in Python — scikit-learn 0.23.2 documentation.” https://scikit-learn.org/stable/ (accessed Oct. 26, 2020).
[53] G. Zerveas, S. Jayaraman, D. Patel, A. Bhamidipaty, and C. Eickhoff, “A Transformer-based Framework for Multivariate Time Series Representation Learning,” arXiv:2010.02803 [cs], Dec. 2020, accessed Dec. 14, 2020. [Online]. Available: http://arxiv.org/abs/2010.02803
