
Journal of Manufacturing and Materials Processing

Article

Pattern Recognition in Multivariate Time Series: Towards an Automated Event Detection Method for Smart Manufacturing Systems

Vadim Kapp 1,2, Marvin Carl May 1, Gisela Lanza 1 and Thorsten Wuest 2,*
1 wbk Institute of Production Science, Karlsruhe Institute of Technology, Kaiserstr. 12, 76131 Karlsruhe, Germany; [email protected] (V.K.); [email protected] (M.C.M.); [email protected] (G.L.)
2 Benjamin M. Statler College of Engineering and Mineral Resources, West Virginia University, Morgantown, WV 26506, USA
* Correspondence: [email protected]; Tel.: +1-304-293-9439

Received: 4 August 2020; Accepted: 2 September 2020; Published: 5 September 2020

J. Manuf. Mater. Process. 2020, 4, 88; doi:10.3390/jmmp4030088; www.mdpi.com/journal/jmmp

Abstract: This paper presents a framework to utilize multivariate time series data to automatically identify reoccurring events, e.g., resembling failure patterns in real-world manufacturing data, by combining selected data mining techniques. The use case revolves around the auxiliary polymer manufacturing process of drying and feeding plastic granulate to extrusion or injection molding machines. The overall framework presented in this paper includes a comparison of two different approaches towards the identification of unique patterns in a real-world industrial data set. The first approach uses a sequential heuristic segmentation and clustering approach; the second branch features a collaborative method with a built-in time dependency structure at its core (TICC). Both alternatives have been facilitated by a standard principal component analysis (PCA) (feature fusion) and a hyperparameter optimization (TPE) approach. The performance of the corresponding approaches was evaluated through established and commonly accepted metrics in the field of (unsupervised) machine learning. The results suggest the identification of several common failure sources (patterns) for the machine. Insights such as these automatically detected events can be harnessed to develop an advanced monitoring method to predict upcoming failures, ultimately reducing unplanned machine downtime in the future.

Keywords: smart manufacturing; Industry 4.0; polymer processing; polymer manufacturing; smart maintenance; unsupervised learning; segmentation; clustering; time series analysis

1. Introduction

The manufacturing industry is currently in the midst of the fourth industrial revolution. This digital transformation towards smart manufacturing systems (SMS) is based on the three pillars of connectivity, virtualization, and data utilization [1]. This development is fueled by the rapid progress in both information technology (IT) and operational technology (OT), which has led to an increasingly connected and automated world: in essence, a merging of both worlds (IT/OT). Technologies such as sensors, cloud computing, and AI/big data analytics lead not only to a dramatic increase in the amount of (manufacturing) data, but also to rapidly developing data processing capabilities, which have raised interest in data mining approaches for automating repetitive processes [2]. Time series data, thereby, are one of the most common information representation variants in a variety of different business areas. Advanced process monitoring, on which we rely regularly in SMS, typically yields multidimensional data to increase its effectiveness. A specific branch of time series analysis deals with the recognition of reoccurring patterns within the data (see Figure 1).


Time series data analysis generally utilizes established pattern recognition methods in order to identify the steady state, anomalies, and characteristic failure patterns. However, the identification of such distinct patterns in multivariate time series represents a highly complex problem due to the interactions (correlation) between variables, the time dependency, and the usually nonlinear nature of the regarded data [3,4]. Furthermore, real-world data are usually noisy and, thus, require pre-processing to become a valuable input for analytical algorithms. Pre-processing enhances the ability of the overall approach to be flexible enough to detect various disturbances, such as missing values, noise, or outliers, within the data set, but also sufficiently restrictive so that not all insignificant fluctuations (e.g., measurement errors) are labeled as irregularities. Standard distance-based similarity metrics, which are often used in related unsupervised learning approaches, can therefore not be applied without adaptation for different types of problems [4,5].

There is a growing interest in the field of pattern recognition, especially for multivariate time series data, due to, on the one hand, the increasing availability of potent algorithms and easy-to-use tools, and on the other hand, the realization of the potentially valuable impact of insights derived from such data sets. Most of the recent work on clustering, however, focuses on grouping different time series into similar batches [6,7]. The paper by [8], for instance, presents a popular approach using fuzzy clustering combined with a dynamic time warping technique, resulting in an enhanced performance when compared to previous methods. There are only a few authors focusing on pattern recognition within a single time series, with which we are regularly confronted in maintenance applications [4].

This paper presents a framework to utilize multivariate data to identify reoccurring patterns in real-world manufacturing data. The objective is to identify failure patterns for new applications in the area of maintenance.

Figure 1. Abstract schematic to illustrate different patterns in a time series.

We analyze the drying process of plastic granulate in an industrial drying hopper equipped with multiple sensors. The number of different failure patterns (sources for defects) is not known beforehand and the sensor readings are subject to natural fluctuations (noise). The overall work presented in this manuscript includes a comparison of two different approaches towards the identification of unique patterns in the data set. One processing path includes the sequential use of common segmentation and clustering algorithms to identify patterns, which might lead to a better respective performance (in both steps of the chain) and, therefore, a better overall performance. The second approach features a collaborative method with a built-in time dependency structure, thereby avoiding a multiplication of losses due to a sequential processing chain. The better performing method is fine-tuned afterwards in terms of its hyperparameters.
The resulting patterns (clusters) that are identified by the framework can then serve as input for advanced monitoring methods predicting upcoming failures and ultimately reducing unplanned machine downtime in the future.

The outline of this paper features a short introduction to the topic of plastic granulate drying and time series clustering in Section 2, followed by the proposed framework in Section 3. Finally, the results are presented in Section 4 and consequently discussed in Section 5.

2. State of the Art

Industry 4.0, machine learning, and artificial intelligence (AI) are popular topics in the current scientific discourse and, therefore, the current state of the art is subject to frequent changes or newly arising theories and topics [1]. Thus, we chose to apply the Grounded Theory Approach, originally developed by Glaser and Strauss in 1967 [9], to gather and subsequently analyze the existing knowledge on the topic of pattern recognition on time series data in the context of smart manufacturing systems (SMS). The approach encourages inductive reasoning for the formulation of the experimental setup and, hence, fosters the development of a practical state-of-the-art framework [10]. The corresponding literature was selected from the results of a search term optimization identifying comparable research approaches and problems. We used established academic research databases, in particular Scopus, Web of Science, IEEE Xplore, and ScienceDirect, to execute the keyword-based literature search process. The most relevant papers were selected according to their degree of relation to the subject of pattern recognition in time series for SMS.

In order to establish an automated event detection method for SMS through an unsupervised machine learning framework, we need to take a closer look at the data and its underlying manufacturing process. Domain knowledge is key to developing successful machine learning applications in a manufacturing context. Data mining process phases, such as feature extraction and the selection of suitable algorithms and machine learning models, benefit from process understanding and the appropriate application of these insights [4]. The following chapter introduces the general drying process of plastic granulates and presents the respective data usable for the analysis.

2.1. Manufacturing Process

Drying hoppers are industrial machines, which are mostly used to dry and remove moisture from non-hygroscopic and slightly hygroscopic plastic pellets. They are especially effective for the process of drying the surface area of non-hygroscopic resin pellets prior to further fabrication [11] to avoid issues further down the process chain, such as voids caused by gasification of moisture in the final parts. The drying process in general is a vital step in the manufacturing chain for diverse types of polymers, since the excess moisture will otherwise react during the melting process. This might result in a loss of molecular weight and subsequently lead to changes in the physical properties, such as reduced tensile and impact strength of the material [12,13]. Most modern drying hoppers are structured similarly, in order to ensure an even temperature distribution while also maintaining an even mass material flow. They feature a vertically cylindric body with a conical hole at the bottom, as depicted in Figure 2. The spreader tubes inside the hopper then inject hot and dry air into the chamber, while the granulate is slowly moving from the top to the bottom valve [11,13]. After the hot air has passed the material, it is cleared, dried, (re-)heated (if necessary) and then again reinjected into the drying chamber.

For a drying process to be successful in achieving the targeted humidity, there are three major factors that need to be considered: drying time, drying temperature, and the dryness of the circulating air [13]. The most intuitive factor might be the drying time, since the circulating air can only take a certain amount of humidity from the material until it is saturated. The specific amount of time that the material needs to stay in the machine in order to achieve a defined humidity can be calculated as a function of air temperature, input dryness of the material, and target humidity. In general, the drying process has a concave form insofar as the percentage humidity reduction is steep at first and flattens during later stages [12,14]. Both air dryness and temperature also play decisive roles for the drying process and are usually coordinated in dependency of one another. By drying the air, the relative humidity is reduced and consequently the moisture holding capacity increases, whereas a higher temperature speeds up the drying process, since it facilitates the diffusion of water from the interior to the surface of the material [11,14]. However, there might be adverse effects of too high temperatures on material quality, depending on the plastic itself, that need to be considered.
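To make the concave drying behavior tangible, the following minimal sketch assumes simple first-order (exponential) drying kinetics; the model form and all parameter values (initial moisture, equilibrium moisture, time constant) are illustrative assumptions, not the actual process model of the hopper.

```python
import numpy as np

# Hypothetical first-order drying model, for intuition only: the residual
# moisture decays exponentially toward the equilibrium value set by the
# dryness of the circulating air. All parameter values are assumptions.
def residual_moisture(t_hours, m0=0.30, m_eq=0.02, tau=2.5):
    """Moisture (% by weight) after t_hours; m0 is the input moisture,
    m_eq the reachable equilibrium, tau the time constant in hours."""
    return m_eq + (m0 - m_eq) * np.exp(-t_hours / tau)

def required_drying_time(target, m0=0.30, m_eq=0.02, tau=2.5):
    """Invert the model: hours needed to reach a target moisture level."""
    return -tau * np.log((target - m_eq) / (m0 - m_eq))

print(round(required_drying_time(0.05), 2))  # steep early, flat later: ~5.6 h
```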

Figure 2. Schematic view of a drying hopper with temperature probe.

Due to the critical implications of a malfunctioning drying hopper system, the corresponding process is usually monitored round the clock. In our use case, a six-zoned temperature probe, which continuously measures the temperature at different locations (heights) of the vertically aligned drying chamber (see Figure 2), is applied for the monitoring process. Doing so facilitates the indirect detection of a variety of different reoccurring disruptions in the drying process, such as over- or under-dried material, heater malfunctions, and many more, since most of those incidents are directly connected to or at least can be derived indirectly from the temperature of the circulating air [13]. This is also the reason why these six temperature zones play a major role in determining the performances of the different pattern recognition approaches presented in this paper.

The data set used for this analysis is based on two separate data sources, two individually controlled, yet identical, drying hopper machines located in the same manufacturing plant. Each drying hopper was monitored over a specific time, 6 months for machine one (M1) and 4 months for machine two (M2), whereby an automatic reading of the sensor output was conducted every 60 s over the entire monitoring period. In addition to the different temperature zones, the data set contains a variety of other operational process related sensor data. After the initial preprocessing of the data set, which involves the removal of missing, damaged, and severely fragmented data objects, the resulting data set consists of 31 (M1) and 37 (M2) different features and 263,479 (M1) and 176,703 (M2) instances. The signals from the sensors are partly continuous, such as the real time temperature measurements of the circulating air or panel, and partly binary for information on the status of different components such as the heater or blower (on/off). Figure 3 illustrates the different events involved by displaying the steady state of the machine (left) in comparison to a typical disruption (right).

Figure 3. Exemplary representation of steady state (left) and failure event (right).
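The pre-processing steps described above (removal of missing, damaged, and severely fragmented data objects on a 60 s sampling grid) could be sketched as follows; the file name and column labels are hypothetical, as the real sensor names of M1/M2 are not published with the paper.

```python
import pandas as pd

# Pre-processing sketch for one machine; "drying_hopper_M1.csv" and the
# "temp_zone_*" labels are assumptions made for illustration only.
df = (pd.read_csv("drying_hopper_M1.csv", parse_dates=["timestamp"])
        .set_index("timestamp")
        .sort_index())

# Enforce the 60 s sampling grid, then drop severely fragmented rows:
df = df.resample("60s").first()
df = df.dropna(thresh=int(0.8 * df.shape[1]))  # rows missing >20% of sensors

# Bridge short, isolated gaps in the continuous temperature channels only:
temp_cols = [c for c in df.columns if c.startswith("temp_zone_")]
df[temp_cols] = df[temp_cols].interpolate(limit=5)
```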

2.2. Analytical Approaches

There is a plethora of scientific work available focused on the topic of time series analysis, in particular within the last decades. Two major themes among the published research that are strongly related to the practical application of those algorithms are (i) fault (anomaly) detection within a time series and (ii) clustering of different time series [4]. Since both of these areas are comparably well developed, evidenced by the large amount of available scientific contributions, some pattern recognition approaches employed this to construct a pipeline-like processing framework to cluster time series data. The works [15–17] are comprehensive reviews and applications focused on this topical area and used segmentation and clustering methods subsequently. The majority of these papers apply correlation analysis, wavelet transformation, and principal component analysis (PCA) based metrics to identify homogeneous segments and to assign the data points to each cluster. Classical distance-based metrics (e.g., Euclidean distance) are still used in various scientific works today, most often to provide a benchmark for the evaluation process [4,5]. A more recent work by [8] presented an approach using fuzzy clustering combined with a dynamic time warping technique to identify the different clusters after the segmentation has been completed.

Another reoccurring theme in the field of pattern recognition in time series is the domain in which those approaches are mostly applied. The majority of data sets that have been used to evaluate theoretical frameworks stem from the medical domain (e.g., ECG and biological signals), the financial sector (e.g., exchange rates and stock trades), and speech recognition applications [3–5]. Significantly fewer works focus on manufacturing applications and the analysis of industrial machines (e.g., gas turbine and water level monitoring), especially in a maintenance context [4]. Most recent contributions to the maintenance discourse focus on the prediction and classification of previously defined failure patterns, thereby solely relying on the preceding groundwork of experts. Hence, the use of time series analysis in SMS itself provides a multitude of opportunities for further research. Additionally, the effectiveness of cutting-edge event detection methods is examined for industrial applications, which later can be used to assist or even replace expert-centered error recognition systems. Most importantly, the adaption and usage of advanced pattern recognition algorithms for SMS serves as a stepping stone into further studies combining domain expertise and novel machine learning algorithms for solving advanced industrial problems, therefore moving towards a more automated data processing chain overall.

To achieve this goal, this paper aims to combine the best of both worlds: the latest findings from the general time series analysis literature as well as complex, real-world industrial data to put those cutting-edge approaches to the test. In doing so, we also use feature fusion [4] and sequential pattern recognition methods [18], which have been successfully applied previously within an industrial context. However, an important distinction to comparable works in this field is that we start without any prior expert knowledge on the characteristics (e.g., type, frequency, and shape) of the failure patterns. Furthermore, we also chose a quantitative method to compare competing time series analysis approaches by implementing various methods in combination with different cost functions, while comparable papers often select promising configurations beforehand without further examination. This rigorous (in-depth) proceeding, however, results in a trade-off so that only a limited number of different algorithms (or rather pathways) could be implemented in the scope of this scientific work, which is admittedly smaller than comparable studies, such as [19,20], produced.

3. Research Methodology

An overview of the novel framework proposed in this paper is illustrated in Figure 4. It is important to note that it includes two separate approaches to pattern recognition that were conducted separately and then compared to identify the superior process given the unique characteristics of the data set. This is intentionally included in the framework, as data sets differ significantly in their nature, and providing options to the user that allow to compare the performance as part of the framework is a promising way of leveraging these differences in data sets today.




Figure 4. Overview of proposed time series processing framework.

3.1. Principal Component Analysis (PCA)

The first step, after generally checking and cleaning the data (pre-processing), involves a PCA to identify irrelevant features to reduce the dimensionality (feature reduction) and therefore the overall complexity of the data set. In doing so, the implemented algorithm successively constructs "artificial variables" (principal components) as linear combinations of the original variables, so that those new vectors are orthogonally aligned to each other. This new representation of the data with linearly independent vectors can subsequently be used, instead of the original variables, as input data for the affiliated analysis processes (segmentation and clustering) [4]. After the data has been cleaned and rearranged, we can start with the first step (of the first pathway) of segmenting the time series.
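As a concrete illustration of this feature fusion step, the following sketch uses scikit-learn's PCA on standardized sensor channels; the 95% explained-variance threshold and the random stand-in data are assumptions, as the paper does not fix these values here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Feature fusion sketch: standardize the sensor channels, then keep enough
# principal components to explain 95% of the variance (assumed threshold).
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 31))            # stand-in for the 31 M1 features

X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.95)                 # float in (0, 1): variance target
X_reduced = pca.fit_transform(X_std)

print(pca.n_components_, round(pca.explained_variance_ratio_.sum(), 3))
```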

3.2. Heuristic Segmentation

Following the first branch of the pattern recognition pathway, we consider the multivariate signal $X = \{x_t\}_{t=1}^{T}$, with $x_t \in \mathbb{R}^m$ being an $m$-dimensional (number of sensors) observation. $T$ stands for the total number of observations and $t$ is the index of the measurements in time. Consequently, the multivariate time series can also be expressed as the following $T \times m$ matrix:

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{T1} & \cdots & x_{Tm} \end{bmatrix}$$

In order to identify sections of the time series with similar trajectories (patterns), we first need to partition the signal into internally homogeneous segments. Ideally, the time series will be automatically split into segments representing (i) regular machine behavior (steady state) on the one side and (ii) segments with a variety of events (including failure events) on the other side as the final output. An event or segment in this case can therefore be formulated as $X_i = \{x_t\}_{t=a+1}^{b}$, with $X_i$ being the $i$-th overall segment of the partition and $(b-a)$ with $1 \le a < b \le T$ denoting the length and location of the interval. For a less formal notation, it can also be denoted as $X_{a..b}$. The points $t$ at which the state of the signal changes are usually referred to as change points or break points and are given as $B = \{b_1, b_2, \ldots, b_n\}$, with $b_1$ being the chronologically first break point in the time series. The set of true break points, which was acquired by manually labeling the time series, is denoted as $B^*$ with its corresponding elements $\{b_1^*, b_2^*, \ldots, b_n^*\}$, while $\hat{B} = \{\hat{b}_1, \hat{b}_2, \ldots, \hat{b}_n\}$ is used to record the approximated break points returned by the algorithm.

To solve this problem, we will apply the heuristic changepoint detection method implemented in a package called ruptures [5], which can be described as a combination of three defining elements: cost function, search method, and (penalty) constraint (see Figure 5).

Figure 5. Overview of heuristic segmentation approach by ruptures based on [5].

Formally, the change point detection (segmentation) can be viewed as a model selection problem, which tries to choose the best possible partition of the time series given a quantitative criterion. This criterion is most often expressed in the form of a cost function and is therefore denoted as $C(B)$, which needs to be minimized during this process of optimization [5]. In fact, the criterion function consists of the sum of the costs of all segments that define the segmentation:

$$C(B) := \sum_{i=1}^{n} c(x_{b_i..b_{i+1}}) \quad (1)$$

For this formulation, we also need to define the dummy indices $b_0$ as the first index and $b_{n+1}$ as the end of the time series. Furthermore, we need to introduce a penalty function for our overall minimization problem, since the algorithm would otherwise always converge to an oversegmentation, where all points are ultimately represented in their own individual segment, thereby minimizing the overall sum of costs [5]. The final optimization problem for the segmentation is therefore given by:

$$\min_{B} \; C(B) + \mathrm{pen}(B) \quad (2)$$

The penalty function (constraint) can also be interpreted as a complexity measure for the segmentation and needs to be adjusted for each heuristic, respectively. In this case, we choose a linear penalty function given by $\mathrm{pen}_{l_0}(B) := \beta |B|$ with $\beta > 0$. Choosing a linear penalty function thereby aligns with successful implementations from recent literature [5,21–23] and with well-known model selection criteria such as Akaike's information criterion and the Bayesian information criterion. After our problem has been defined, we need to choose a search method to calculate the break points of the time series. There are a variety of different solving algorithms available, from which we will explore three approaches (also featured in ruptures) in more depth in the following sections.
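To make the interface of ruptures concrete before the three search methods are detailed in the following subsections, the sketch below runs all three on a synthetic stand-in signal with a linear penalty; the cost model ("l2"), window width, and penalty value are illustrative assumptions, not the tuned settings from the paper.

```python
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(42)
signal = np.concatenate([rng.normal(0, 1, (300, 3)),   # steady state
                         rng.normal(4, 1, (200, 3)),   # event
                         rng.normal(1, 1, (300, 3))])  # new regime

pen = 15.0  # linear penalty pen(B) = beta * |B|, beta chosen ad hoc here

for algo in (rpt.Window(width=80, model="l2"),  # sliding window (Section 3.2.1)
             rpt.BinSeg(model="l2"),            # top-down binary split (3.2.2)
             rpt.BottomUp(model="l2")):         # bottom-up merging (3.2.3)
    bkps = algo.fit(signal).predict(pen=pen)    # returned list ends with T
    print(type(algo).__name__, bkps)

# With manually labeled break points B*, the result can be scored, e.g.:
from ruptures.metrics import precision_recall
print(precision_recall([300, 500, 800], bkps, margin=20))
```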

3.2.1. Sliding Window Segmentation

The core of the sliding window algorithm is to approach the partitioning (segmentation) problem by moving a fixed-size window over the time series. The window basically represents the area of vision, or rather the area of focus, of the algorithm for a given iteration [6]. Thereby, the approach implemented by ruptures does not directly divide the time series in each iteration solely based on the information given in the window, but rather calculates a so-called discrepancy measure for each point in the signal. To do so, the window is split into two parts ("left" and "right") along its center point [5]. The discrepancy measure is defined as follows:

$$d(x_{a..t}, x_{t..b}) = c(x_{a..b}) - c(x_{a..t}) - c(x_{t..b}), \quad 1 \le a < t < b \le T \quad (3)$$

Given a cost function $c(\cdot)$ and an interval $[a, b]$, the discrepancy measure essentially computes the difference in cost between treating the sub-signal $x_{a..b}$ as one segment and splitting the interval into two separate segments ($x_{a..t}$ and $x_{t..b}$) at point $t$. Consequently, the discrepancy function reaches high values when the two sub-windows contain dissimilar segments according to the cost function. After the values of the discrepancy function have been computed for all points of the time series, the algorithm analyzes the discrepancy curve to identify the maximum values (peak search). The peaks of the function correspond to the incidents in which the window was holding two highly dissimilar segments in its sub-windows and therefore yield the break points of the signal [5]. Figure 6 provides a schematic overview of the algorithm to clarify its functionality and objective.

Figure 6. Sliding window approach: discrepancy curve based on [5].
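The discrepancy computation of Equation (3) can be sketched directly; an l2 cost (sum of squared deviations from the segment mean) is assumed here, as the paper does not prescribe a specific cost function at this point.

```python
import numpy as np

def l2_cost(sub):
    """Quadratic segment cost: sum of squared deviations from the mean."""
    return ((sub - sub.mean(axis=0)) ** 2).sum()

def discrepancy_curve(signal, width=80):
    """Evaluate Equation (3) for a window of fixed width centered at each
    admissible point t; the l2 cost model is an assumption made here."""
    half = width // 2
    d = np.full(len(signal), np.nan)
    for t in range(half, len(signal) - half):
        a, b = t - half, t + half
        d[t] = l2_cost(signal[a:b]) - l2_cost(signal[a:t]) - l2_cost(signal[t:b])
    return d  # peaks of this curve correspond to candidate break points
```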

3.2.2. Top-Down Segmentation

The top-down method is a greedy-like approach to partition the time series iteratively. To do so, it considers the whole time series and calculates the total cost for each possible binary segmentation of the signal at first. After the most cost-efficient partitioning has been found, the time series is split at the identified change point and the procedure repeats for each sub-signal, respectively [5,6]. The calculation on which the partitioning decision of each stage of the iterative algorithm is based can be displayed with the following formula:

$$\hat{b}^{(1)} := \operatorname*{argmin}_{1 \le t \le T-1} \; c(x_{0..t}) + c(x_{t..T}) \quad (4)$$

It represents the decision mechanism for the first iteration of the algorithm. However, it can be applied to each sub-signal after the initial time series has been divided. The stopping criterion of the approach depends entirely on the previously defined penalty function of the overall problem (2). Thereby, the cost saving of each partitioning is compared to the penalty of introducing another break point to the existing set of change points. The algorithm terminates when there is no partitioning within the iteration that has a higher cost saving compared to the penalty value [5]. The schematic depiction in Figure 7 provides an overview of the whole process.

Figure 7. Top-down algorithm schema based on [5].
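The initial split of Equation (4) reduces to a one-dimensional search over candidate indices, as the following sketch (again assuming an l2 cost) illustrates.

```python
import numpy as np

def first_split(signal, min_size=2):
    """Equation (4): the first break point chosen by the top-down approach
    is the index t minimizing c(x_{0..t}) + c(x_{t..T}); l2 cost assumed."""
    def cost(sub):  # sum of squared deviations from the segment mean
        return ((sub - sub.mean(axis=0)) ** 2).sum()
    T = len(signal)
    return min(range(min_size, T - min_size + 1),
               key=lambda t: cost(signal[:t]) + cost(signal[t:]))
```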

3.2.3. Bottom-Up Segmentation

The bottom-up segmentation can best be described as the counterpart to the previously discussed top-down approach. Hereby, the start of the algorithm is characterized by the creation of a fine, evenly distributed partitioning (all segments have the same length) of the entire time series. Subsequently, the algorithm merges the most cost-efficient pair of adjacent segments in a greedy-like fashion [5,6]. To clarify the overall process, a schematic overview is depicted in Figure 8.


Figure 8. Bottom-up algorithm schema based on [5].

In contrast to other bottom-up methods covered in the current literature on segmentation [6], the approach we used in this case also harnesses the discrepancy measure defined above (see sliding window) in order to determine the most cost-efficient pairs to merge. During each iteration, all temporary change points are ranked according to the discrepancy measure, which takes into account all of the points in both adjacent segments. Consequently, the break point with the lowest discrepancy measure is deleted from the set of potential break points. This process continues until the increase in costs, due to the removal of a break point, can no longer be compensated by the reduction of the penalty, and the algorithm terminates [5].

After completing the heuristic segmentation, standard clustering approaches such as k-means or Gaussian mixture models can be applied (in combination with DTW) to sort the time series snippets into equal groups [8]. These approaches work similarly to parts of the TICC algorithm and are therefore relevant to the depictions in the following chapter, as indicated by the '*' in Figure 4.

3.3. Toeplitz Inverse-Covariance Clustering (TICC)

In contrast to the heuristic approaches that start by partitioning the signal first and then conduct the clustering, the alternative method we explored to tackle the problem includes the algorithm TICC, which approaches the problem by simultaneously segmenting and clustering the multivariate time series through the use of Markov random fields (MRFs). The reason for using Markov networks is that they generally denote relationships between variables much more strongly than simpler correlation-based models. Thereby, each cluster is defined as a dependency network, which takes into account both the interdependencies of all dimensions (features) of an observation $x_t \in \mathbb{R}^m$ and the dependency to the neighboring points $\{\ldots, x_{t-2}, x_{t-1}, x_{t+1}, x_{t+2}, \ldots\}$. Each cluster network can be described as a multilayer undirected graph, where the vertices represent the random variables and the edges indicate a certain dependency among them [3]. The layers of each network correspond to the number of consecutive observations (instances) and therefore to the receptive field of the model (see Figure 9).

Figure 9. Exemplary MRF structures for two distinct clusters based on [3].

Another important differentiating quality of MRFs is that, under certain criteria, they can be used to mimic a multivariate Gaussian model with respect to the graphical representation of the multivariate distribution. Since this applies in our case, we can use the precision matrix (inverse covariance matrix) as a formal representation of the graphical structure, where a zero (conditional independence) in the matrix corresponds to a missing edge between two variables. The benefits of using the precision matrix $\Theta$ over the covariance matrix are manifold and include computational advantages as a result of its tendency to be sparse. Furthermore, the sparsity of the MRFs also prevents overfitting and therefore increases the robustness of the overall approach [3,24].
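Reading the MRF structure off a precision matrix then amounts to checking which off-diagonal entries are (numerically) non-zero; the sensor names and tolerance in this sketch are assumptions for illustration.

```python
import numpy as np

def mrf_edges(precision, names, tol=1e-8):
    """Read the graph off a precision matrix: a (near-)zero off-diagonal
    entry means conditional independence, i.e., no edge. The tolerance is
    an assumption to absorb numerical noise."""
    return [(names[i], names[j])
            for i in range(len(names))
            for j in range(i + 1, len(names))
            if abs(precision[i, j]) > tol]

# Toy example: zones 1 and 3 are conditionally independent given zone 2.
theta = np.array([[ 2.0, -0.8,  0.0],
                  [-0.8,  2.5, -0.4],
                  [ 0.0, -0.4,  1.8]])
print(mrf_edges(theta, ["zone_1", "zone_2", "zone_3"]))
# -> [('zone_1', 'zone_2'), ('zone_2', 'zone_3')]
```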


3.3.1. Problem Formulation

In addition to the previously established notation, we need to introduce a few modifications for the formulation of the TICC problem. First, we are interested in a short subsequence (with $w \ll T$) rather than just one point in time $x_t$. Therefore, to assign the data points to the corresponding clusters, we need to address those points as $nw$-dimensional vectors $X_t = \{x_{t-w+1}, \ldots, x_t\}$. The number of clusters, which has to be defined beforehand, will be denoted as $K \in \mathbb{N}$, and the time horizon of the short subsequence will be set by the window size $w$.

A challenge of the TICC approach is to address the assignment problem of matching the data points to one of the $K$ clusters and thus determining the assignment sets $P = \{P_1, \ldots, P_K\}$ with $P_i \subset \{1, 2, \ldots, T\}$. Furthermore, the algorithm must update the cluster parameters $\Theta = \{\Theta_1, \ldots, \Theta_K\}$ based on the previously calculated assignment mappings [3]. The overall optimization problem can be formulated as:

$$\operatorname*{argmin}_{\Theta \in \mathcal{T},\, P} \; \sum_{i=1}^{K} \left[ \|\lambda \circ \Theta_i\|_1 + \sum_{X_t \in P_i} \left( -\ell\ell(X_t, \Theta_i) + \beta \, \mathbb{1}\{X_{t-1} \notin P_i\} \right) \right] \quad (5)$$

This entire problem is called Toeplitz inverse covariance-based clustering (TICC). In the formula, $\mathcal{T}$ reflects the set of symmetric block Toeplitz $nw \times nw$ matrices, which adds an additional constraint for the construction of the MRFs. The enforced structure ensures the time-invariance of each cluster, so that the cluster assignments do not depend on gate events. This implies that any edge between two layers $l$ and $l+1$ must also exist for the layers $l+1$ and $l+2$. The first expression, $\|\lambda \circ \Theta_i\|_1$, represents an additional sparsity constraint based on the Hadamard product of the inverse covariance matrix with the regularization parameter $\lambda \in \mathbb{R}^{nw \times nw}$. The second part of the formula, $\ell\ell(X_t, \Theta_i)$, states the core optimization problem of cluster parameter fitting, given assignment set $P_i$ (log likelihood). The last part of the overall problem addresses the desired property of temporal consistency. By adding a penalty $\beta$ each time two consecutive observations are assigned to two different clusters ($X_{t-1} \notin P_i$), the overall algorithm is incentivized to keep those instances to a minimum [3].

The method to handle this complex problem can be described as a variation of the expectation maximization (EM) algorithm, which is commonly used to solve related clustering issues by applying Gaussian mixture models [25].
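A usage sketch based on the TICC authors' reference implementation (github.com/davidhallac/TICC) is shown below; the constructor arguments follow that repository's README and may differ between versions, and the input file name is hypothetical.

```python
# Usage sketch for the reference TICC implementation; each row of the
# (hypothetical) CSV file is one m-dimensional observation x_t.
from TICC_solver import TICC

ticc = TICC(window_size=5,          # w: length of the short subsequences X_t
            number_of_clusters=8,   # K: has to be fixed beforehand
            lambda_parameter=0.11,  # sparsity weight lambda in problem (5)
            beta=600,               # temporal consistency penalty in (5)
            maxIters=100)           # cap on the EM iterations

cluster_assignment, cluster_mrfs = ticc.fit(input_file="hopper_signal.csv")
# cluster_assignment: one label per time step; cluster_mrfs: one block
# Toeplitz precision matrix (MRF) per cluster.
```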

3.3.2. Cluster Assignment (Expectation)

Similar to the expectation step in the common EM algorithm, TICC starts with the assignment of observations to the given clusters. However, to begin this process, we need a prior distribution to which the data points can be assigned. Hence, the overall TICC algorithm is initialized by conducting a k-means clustering to calculate the initial cluster parameters $\Theta$. After the initialization phase, the points are assigned to the most suitable cluster by fixing the values of the precision matrices $\Theta$ [3]. This leaves us with the following combinatorial optimization problem for $P$:

$$\operatorname*{argmin}_{P} \; \sum_{i=1}^{K} \sum_{X_t \in P_i} -\ell\ell(X_t, \Theta_i) + \beta \, \mathbb{1}\{X_{t-1} \notin P_i\} \quad (6)$$

The sparsity constraint is not directly relevant for this step, since the shape of the precision matrix $\Theta$ is automatically fixed with its values. Thus, the resulting constant term of the Hadamard product can be neglected. Each of the short subsequences $X_t$ is therefore primarily assigned to a cluster based on its maximum likelihood under the regard of temporal consistency (minimizing the number of "break points"). Problem (6) is subsequently solved by harnessing the dynamic programming approach of the Viterbi algorithm [26]. The problem is translated into a graphical representation (Figure 10), which represents each decision (per instance) the algorithm can make to assign the data points consecutively [3].

Figure 10. Graphical formulation of the clustering assignment problem based on [3].
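This assignment step can be sketched as a standard Viterbi recursion over a T x K cost lattice; the negative log likelihood matrix is assumed to be precomputed from the current (fixed) cluster parameters.

```python
import numpy as np

def assign_clusters(nll, beta):
    """Solve problem (6) exactly with the Viterbi recursion.

    nll[t, i] -- negative log likelihood of subsequence X_t under cluster i
                 (assumed precomputed from the fixed precision matrices)
    beta      -- penalty for assigning consecutive points to different clusters
    Returns one cluster label per time step."""
    T, K = nll.shape
    cost = nll[0].copy()                  # best total cost ending in cluster i
    back = np.zeros((T, K), dtype=int)    # best predecessor cluster per step
    for t in range(1, T):
        # transition[j, i]: cost of being in j at t-1 and moving to i at t
        transition = cost[:, None] + beta * (1.0 - np.eye(K))
        back[t] = transition.argmin(axis=0)
        cost = nll[t] + transition.min(axis=0)
    labels = np.empty(T, dtype=int)
    labels[-1] = int(cost.argmin())
    for t in range(T - 1, 0, -1):         # trace the minimum cost path back
        labels[t - 1] = back[t, labels[t]]
    return labels

# Toy usage: 3 clusters, strongly preferring cluster 1 in the middle section.
rng = np.random.default_rng(0)
nll = rng.random((100, 3))
nll[30:70, 1] -= 5.0
print(assign_clusters(nll, beta=0.5))
```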

Subsequently, the Viterbi algorithm returns the minimum cost path (Viterbi path) by calculating the total cost values of each vertex recursively while remembering the minimum cost path simultaneously [26].

3.3.3. Updating Cluster Parameters (Maximization)

The maximization step of the EM algorithm concerns the updating of the inverse covariance matrices of the clusters once the assignments of the E-step have been given. For this step, the assignment sets $P$ are frozen and the precision matrices $\Theta_i$ are adjusted based on the corresponding observations. Similar to the assignment problem before, the overall TICC problem (5) can be simplified for this step, since the temporal consistency is strongly tied to the assignment of the data points. As seen before, the whole term can therefore be dropped while solving the newly arising subproblem (7), which is expressed in the following formula [3]:

$$\operatorname*{argmin}_{\Theta \in \mathcal{T}} \; \sum_{i=1}^{K} \left[ \|\lambda \circ \Theta_i\|_1 + \sum_{X_t \in P_i} -\ell\ell(X_t, \Theta_i) \right] \quad (7)$$

Another relevant property of this optimization function is that the updating problem for each cluster can be calculated independently, since there is no reliance on previous or cross-connected results of other clusters as in problem (6). Therefore, all updates can be done in parallel rather than sequentially [3]. In order to bring the overall problem into an easier to handle form, we need to rearrange the previously established formula. The negative log likelihood can be expressed as follows:

$$\sum_{X_t \in P_i} -\ell\ell(X_t, \Theta_i) = -|P_i| \left( \log \det \Theta_i - \operatorname{tr}(S_i \Theta_i) \right) + C \tag{8}$$

Thereby, $|P_i|$ stands for the number of observations assigned to each cluster and $\det$ for the determinant of the matrix $\Theta_i$. The term $\operatorname{tr}(\cdot)$ is the abbreviation for the trace of a quadratic matrix. The included value $S_i$ stands for the empirical covariance of the random sample, which is often used to approximate the true covariance matrix $\Sigma$ of a stochastic problem [27]. The remaining term $C$ represents all other constant (variable independent) terms of the problem, which can also be neglected according to the argumentation above. We can now substitute the log likelihood expression of the initial updating problem (7) with the new representation in (8) and ultimately (after a few adjustments) reach the following identical optimization problem for each cluster (9) [3]:

$$\begin{aligned} \text{minimize} \quad & -\log\det\Theta_i + \operatorname{tr}(S_i\Theta_i) + \frac{1}{|P_i|}\,\|\lambda \circ \Theta_i\|_1 \\ \text{subject to} \quad & \Theta_i \in \mathcal{T} \end{aligned} \tag{9}$$

Note that the coefficient of the sparsity constraint is permanently constant and can therefore be incorporated into the regularization parameter λ by scaling the values accordingly. For further simplification we will hence remove the coefficient and furthermore neglect the indices of the variables and parameters to highlight the independence of each subproblem from the others [3]. Many problems of this form can be solved efficiently by using the alternating direction method of multipliers (ADMM). This algorithm, aimed at solving convex problems, has consistently shown promising results and performs especially well on large-scale tasks [28]. However, in order to make this approach applicable, we need to reformulate our problem once more to align it with the ADMM requirements. Therefore, a consensus variable Z is introduced to allow the optimization function to be split into two (variable-)independent target functions [3].

$$\begin{aligned} \text{minimize} \quad & -\log\det\Theta + \operatorname{tr}(S\Theta) + \|\lambda \circ Z\|_1 \\ \text{subject to} \quad & \Theta = Z, \; Z \in \mathcal{T} \end{aligned} \tag{10}$$
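For illustration, a compact sketch of the resulting ADMM iteration is given below. It uses the closed-form eigendecomposition update for Θ known from graphical-lasso-style problems, substitutes a scalar λ for the elementwise matrix penalty, and omits the projection onto the block-Toeplitz set T that TICC performs in the Z-update, so it should be read as a simplified approximation of the actual solver rather than the authors' implementation.

```python
import numpy as np

def admm_update(S: np.ndarray, lam: float, rho: float = 1.0,
                n_iter: int = 200, tol: float = 1e-6) -> np.ndarray:
    """ADMM sketch for problem (10); TICC's Toeplitz projection is omitted.

    S   -- empirical covariance of the observations assigned to one cluster
    lam -- scalar l1 weight (the paper uses an elementwise matrix lambda)
    """
    n = S.shape[0]
    Z = np.eye(n)
    U = np.zeros_like(S)
    for _ in range(n_iter):
        # Theta-update: closed form via the eigendecomposition of rho*(Z - U) - S
        w, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_diag = (w + np.sqrt(w ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_diag) @ Q.T
        # Z-update: elementwise soft-thresholding; TICC would additionally
        # project Z onto the block-Toeplitz set T here
        A = Theta + U
        Z_new = np.sign(A) * np.maximum(np.abs(A) - lam / rho, 0.0)
        # U-update: dual ascent on the consensus constraint Theta = Z
        U += Theta - Z_new
        # stop once primal and dual residuals are near zero
        if (np.linalg.norm(Theta - Z_new) < tol
                and np.linalg.norm(Z_new - Z) < tol):
            Z = Z_new
            break
        Z = Z_new
    return Z
```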

Similar to the EM algorithm, ADMM approximates the optimal result of the primal optimization problem in an alternating manner, by updating one variable while the others are fixed. For a sufficiently large number of iterations, the solution of the algorithm converges to the optimal solution of the problem (9). In practice, the method is stopped when the residual variables reach values close to zero (the equality constraint is almost met), indicating a nearly feasible solution [3,28]. Those two alternating steps of datapoint (to cluster) assignment and updating the cluster parameters repeat until the cluster assignments become repetitive (stationary) and the overall TICC algorithm terminates [3]. Finally, the high-level outline for the TICC method is provided in the following Algorithm 1.

Algorithm 1: TICC (high-level)
initialization: cluster parameters Θ, cluster assignments P
repeat
    E-step: assign points to clusters → P
    M-step: update cluster parameters → Θ
until stationarity
return Θ, P
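The following Python sketch ties the previous pieces together into the EM-style loop of Algorithm 1. It reuses the illustrative helpers assign_clusters and admm_update from above, assumes standardized (zero-mean) stacked subsequences, and deliberately ignores edge cases such as empty clusters; it is a minimal outline, not the reference implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def negative_log_likelihood(X: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """-ll(X_t, Theta) for zero-mean Gaussians, up to constants (cf. (8))."""
    _, logdet = np.linalg.slogdet(theta)
    quad = np.einsum("ti,ij,tj->t", X, theta, X)   # x_t' Theta x_t per sample
    return 0.5 * (quad - logdet)

def ticc(X: np.ndarray, K: int, beta: float, lam: float, max_iter: int = 100):
    """Minimal EM-style TICC loop following Algorithm 1 (empty clusters unhandled)."""
    labels = KMeans(n_clusters=K, n_init=10).fit_predict(X)   # initialization
    thetas = None
    for _ in range(max_iter):
        # M-step: independent sparse precision estimate per cluster (see admm_update)
        thetas = [admm_update(np.cov(X[labels == i], rowvar=False), lam)
                  for i in range(K)]
        # E-step: Viterbi assignment under the switching penalty beta
        nll = np.column_stack([negative_log_likelihood(X, th) for th in thetas])
        new_labels = assign_clusters(nll, beta)
        if np.array_equal(new_labels, labels):    # stationarity -> terminate
            break
        labels = new_labels
    return thetas, labels
```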

4. Results

In this chapter, the results of the two alternative analytical pathways are presented. For the comparison of the two pathways, an evaluation systematic is introduced that allows for a problem-centric comparison of the results.

The performance evaluation of the individual approaches and their corresponding configuration, consisting of search method and cost function, is implemented by applying a variety of established and commonly accepted evaluation metrics from different fields of machine learning, which are adjusted for the segmentation problem. The first metric is the Hausdorff measure, which measures the greatest temporal distance between a true change point and a predicted change point. It is therefore measuring the accuracy of the approximated break points [5,29]. The second measure is the Rand index, which basically measures the number of agreements between two segmentations. It is commonly used for the evaluation of clustering performances and is also related to the (mathematical) accuracy [5]. The final evaluation metric, which is commonly used in the field of classification, is the F1-Score. It is a combination of the precision (reliability) and recall (completeness) of a classification. For the F1-Score calculation no element is prioritized over another and therefore the total value is given as the harmonic mean of both measure components [5]. The true change points that are required as input to use the evaluation metrics were determined manually leveraging domain experts' input.
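As an illustration of how such an evaluation can be set up, the snippet below computes all three metrics with the ruptures library that accompanies reference [5]; the breakpoint lists are hypothetical placeholders, not values from our dataset.

```python
# pip install ruptures  (library accompanying reference [5])
from ruptures.metrics import hausdorff, randindex, precision_recall

# breakpoints are sorted end indices; by convention the last sample is included
true_bkps = [120, 480, 900, 1000]   # hypothetical expert-annotated change points
pred_bkps = [118, 500, 910, 1000]   # hypothetical detector output

h = hausdorff(true_bkps, pred_bkps)                  # greatest temporal distance
ri = randindex(true_bkps, pred_bkps)                 # agreement of segmentations
p, r = precision_recall(true_bkps, pred_bkps, margin=10)
f1 = 2 * p * r / (p + r)                             # harmonic mean
print(f"Hausdorff={h}, Rand index={ri:.3f}, F1={f1:.3f}")
```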

The first aspect of the experimental results deals with the problem of choosing a proper cost function in order to identify homogeneous segments. Therefore, a variety of different cost functions have been applied to the dataset by using the sliding window approach discussed earlier. The results are displayed in Figure 11.

Figure 11. Comparison of exemplary cost functions.

Since both the Rand index and the F1-score are measures in the range [0, 1], the Hausdorff metric is scaled accordingly. This is done by measuring the pairwise percentage difference of a given cost function in comparison to the worst result of the whole set of cost functions. Therefore, the approach with the worst Hausdorff metric always scores a zero, while the other scores display the percentage superiority of each approach in comparison to that. For the analytics problem used in this paper, we compared four different cost functions in total (a sketch of this comparison follows the list):
• least absolute deviation (L1),
• least squared deviation (L2),
• linear model change based on a piecewise linear regression (Linear),
• function to detect shifts in the means and scale of a Gaussian time series (Gaussian).
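A minimal sketch of this comparison using the ruptures library is shown below; the synthetic signal, the window width, and the fixed number of breakpoints are illustrative assumptions, with the model strings mapping to the library's L1, L2, linear, and Gaussian ("normal") cost functions.

```python
import ruptures as rpt

# synthetic multivariate signal as a stand-in for the dryer data
signal, true_bkps = rpt.pw_constant(2000, 3, n_bkps=4, noise_std=1.0)

# the four cost models of Figure 11, mapped to their ruptures names
models = {"L1": "l1", "L2": "l2", "Linear": "linear", "Gaussian": "normal"}

predictions = {}
for name, model in models.items():
    algo = rpt.Window(width=50, model=model).fit(signal)
    predictions[name] = algo.predict(n_bkps=len(true_bkps) - 1)  # fixed K
```

Each predicted breakpoint list can then be scored against the annotated change points with the metrics shown earlier.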


The findings displayed in Figure 11 show mixed results. While the L2 function scores superior in the field of temporal distance and recall, it is inferior in regard of the precision and ultimately in the total F1-Score. The L1 cost function on the other hand scores poorly in terms of the Hausdorff metric (L2 is relatively more than 60% better) but dominates the other approach in terms of precision and consequently scores best in terms of the F1-Score. The two remaining cost functions remain unremarkable, since they consistently score significantly worse than the leading approaches for each metric category, except for the recall, where all of the approaches are close. In the end, the classical L2 cost function was chosen as benchmark for the comparison of the pathway methods based on its superiority in two out of three measures (Hausdorff and Rand index). It was also used in the following analysis and discussion of the results.

The second area of investigation is the comparison of the different searching methods with each other. The TICC algorithm was also included into this comparison by dropping the cluster labels from the results and therefore turning the multi-cluster output into a binary clustering ("steady mode" and "event"), as sketched below. The results are presented in Figure 12.
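A possible way to perform this binarization is sketched here: the per-sample cluster labels are collapsed into plain change-point indices, so that the TICC output can be scored with the same segmentation metrics as the heuristic methods; the helper name is our own.

```python
import numpy as np

def labels_to_bkps(labels: np.ndarray) -> list:
    """Drop cluster identities: per-sample labels -> binary segmentation breaks."""
    changes = np.flatnonzero(np.diff(labels)) + 1    # indices where the label switches
    return changes.tolist() + [len(labels)]          # include T (ruptures convention)

# e.g. labels_to_bkps(np.array([0, 0, 2, 2, 2, 1, 1])) returns [2, 5, 7]
```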

Figure 12. Comparison of segmentation approaches.

The most striking result of the experiment is the superiority of the TICC approach in comparison to all heuristic methods. Even the best heuristic approach, namely the sliding window, which has constantly outperformed the other two approaches except for the recall, is surpassed by the TICC algorithm remarkably.

The pattern recognition approach provided by TICC identified a total of 32 events (disturbances of steady state), which in turn were broken down into four different clusters (patterns). Furthermore, it revealed the existence of not one but two steady states of the machine, which could not be identified by solely monitoring the temperature zones. The remaining three clusters were identified as actual disturbances of the drying process. An overview over all major events is given in the following Figure 13. The pattern in the top left of the diagram represents a section of the steady state (cluster A) of the machine and depicts the natural fluctuations, especially of the sixth temperature zone (material input).

Figure 13. Graphical depiction of most frequent clusters.



The first disturbance of this natural state can be seen in the bottom left of the diagram and will be referred to as cluster C or "canyon". The disturbance occurred in total 10 times during the whole monitoring period with a mean duration of about 3000 min, which corresponds to approximately 46 h from start to finish. The shortest duration of a cluster C event was 1530 min (25.5 h) and the longest duration was 3210 min (53.5 h). The second recorded disturbance can be seen in the bottom right and is referred to as cluster D or "shark fin". It appeared in total six times and was more fluctuant in terms of its durations, varying from approximately 1500 min (25 h) to 7300 min (121 h). The duration with the highest density (3 of 6) was thereby 1500 min (median). The last common disturbance can be seen in the top right of the diagram and is referred to as cluster B or "random spike". This disturbance appears less systematic than the events presented before since its durations range from 50 min to 400 min, which is notably shorter than the other common disturbance durations identified. The shape of the events of cluster B, which appeared a total of nine times, also can be seen to be less homogeneous in comparison to clusters C and D. However, in addition to the correct assignments the algorithm has made automatically, there have been some events which raised doubt in terms of their affiliation to the assigned failure clusters. Those irregularities are presented in the following Figure 14.

Figure 14. Graphical depiction of rare events (steady state for scale).

The top left of the diagram shows the steady state as a comparison. The other three events each show a characteristic pattern but were still all assigned to the cluster C ("canyon"). However, in a later step of the manual analysis process they have been found to be significantly different from the corresponding events in the cluster C and were therefore declared as related subclusters. Each of the subclusters occurred a total of two times within the entire dataset.

5. Discussion

This paper presents a functional framework to automatically analyze multivariate time series data in order to identify reoccurring patterns within a manufacturing setting. In doing so, we have tested two different pathways to approach the pattern recognition problem in time series applications and thereby displayed their strengths and weaknesses. As the data sets and their requirements vary significantly even within the manufacturing domain, the option to quickly apply and evaluate different approaches is a key feature towards a robust framework. However, to truly reflect the full wealth of different data sets and time series, additional alternatives might have to be included in the future. Nevertheless, the process and evaluation enable the principle scalability of the presented framework.

We have shown that a two-step approach of using a segmentation before clustering the data appears to be inferior to a collaborative method. The obtained results from our manufacturing time series data set have shown that even the first step of the sequential approach (heuristic segmentation) has already returned a worse performance for all possible searching methods when compared to the TICC approach in this case. This can be observed in Figure 12 of the results, since the TICC algorithm scored significantly higher in almost all statistical evaluation metrics. The only indicator where TICC scored below the alternative BU and TD approaches was the recall metric. This can, however, be easily explained in this context by the low precision scores of both heuristic methods, since they have a tendency to oversegment the data. A rather fine partitioning consequently increases the probability that the true change points will be among the predicted set, but also increases the number of false (unnecessary) break points, reducing the precision drastically. The better performance of the TICC approach could also not be explained through an improper cost function, since we have vetted various homogeneity metrics in order to identify the most suitable cost function for the given dataset (see Figure 11). All in all, the findings indicate that the second processing pathway is the superior approach in this application case.

We have also shown that the TICC-featuring pathway is capable of returning strong results, since all of the three major events in the time series have been found and furthermore assigned to different clusters. Even though it was not able to recognize the events displayed in Figure 14, this can be explained through the low frequency of those disturbances and therefore a lack of significance in order to define a new cluster based on only a few available examples. A further dissection of the clusters C and D suggests that those clusters correspond to the real-world activity of shutting the machine down for the weekend. National holidays were also taken into consideration and made it possible to explain a significant number of events that took place over the course of a day or two. Furthermore, the differences between the clusters C and D might be the result of the difference between a rapid machine shutdown, where the material is removed immediately, and a cooling process, where the material remained in the machine. This could also indicate an improper shutdown, where the material was supposed to be removed for the weekend break but was forgotten inside the dryer. The events assigned to cluster B (random spikes) could, however, not be tied to related real-world events at this point and most likely correspond to random disturbances caused, for example, by an improper handling of the machine or related short-term issues. However, the insights from the analysis can now be used to investigate the root cause and then return with a correct label for these events in the future after further consultation with domain experts. Like most analytics projects, the continuous cycle is key for a long-term sustainable solution and providing real value to the application area.

All in all, our results show that the multivariate clustering approach displayed in the second pathway is able to return strong results for the time series data set in this case. It is safe to say that the new framework can identify common events, which in some cases correspond with relevant sources of failure. Therefore, these insights can subsequently be used to implement a near real-time warning system (or as input for a more sophisticated predictive maintenance system) that is capable of not only identifying and correctly clustering disturbances, but also tying them to a real-world activity.

6. Conclusions and Future Work

This paper presents a framework to utilize multivariate time series data to automatically identify reoccurring failure patterns in a real-world smart manufacturing system. To do so, the mixed binary (on/off) and continuous (temperature) data provided by the monitoring system of an industrial drying hopper was analyzed following two different processing pathways. The first approach was built up using a subsequent segmentation and clustering approach, the second branch featured a collaborative method at its core (TICC). Both pathways have been facilitated by a standard PCA analysis (feature fusion) and a hyperparameter optimization (TPE) approach.

The second procedure featuring TICC returned consistently superior results in comparison to all heuristic segmentation methods for the available time series data set. It was therefore expanded and finally enabled the recognition of the three major (most frequent) failure patterns in the time series ("canyon", "shark fin", and "random spike"). Furthermore, it was also able to recognize all disruptions of the steady state but failed to identify all of the less frequent failure patterns as stand-alone clusters. Besides evaluating the statistical accuracy, we furthermore leveraged domain expertise to verify the results and were, e.g., able to identify one event (canyon) as a shutdown process that aligns with weekends and holidays.

Nevertheless, the identified clusters can be used subsequently to enhance the monitoring process of the drying machine even further and to establish a predictive maintenance system in order to foresee potential upcoming machine downtimes. This framework can also be applied to other (related) problems in the field of preventive maintenance and pattern recognition in time series in general. Industrial monitoring and maintenance systems can therefore be extended by adding a failure detection component to enhance the machine restoration process (speed) further. Such algorithms can also be used to automatically analyze large amounts of data generated by the machinery pool, thereby reducing the manual workload necessary for this process.

The presented framework can contribute to reducing unnecessary machine downtimes in the future, improve the troubleshooting process in case a failure occurs, and consequently increase the effective machine running time and overall productivity of the plant. An additional advantage from a managerial point of view is the possibility to automate the process and the improved transparency of the production process.

However, there are also ethical implications that need to be considered aside from the overall impact of the presented research findings on the business process and workflow. On the one hand, the increase in efficiency is not only favorable for the business objectives overall due to the potential cost reduction, but also for society in general given the potential environmental (e.g., reduction in CO2 emissions and overall waste of energy) and working atmosphere (e.g., reduction of unnecessary stress from unplanned machine failure) benefits. On the other hand, those techniques may also present an opportunity for abuse when other types of time series data, such as personal data of workers, are analyzed. For instance, biometric and behavioral data can be monitored and analyzed automatically, potentially leading to issues with regard to privacy and other negative implications for workers. Thus, it is necessary to investigate policy implications hindering inappropriate application of such time series analytics conflicting with individuals' privacy rights.

The presented process and analytical model are influenced by design choices and are thus, to a certain extent, limited. Since the whole performance measuring process (correct number of clusters) is dependent on the manual determination of the true change points and clusters (or subclusters), it reflects a certain subjectivity and dependence on domain knowledge and metadata. Furthermore, the overall clustering approach is initialized through a distance-based algorithm (k-means), which makes it biased in terms of its priors and the subsequent estimation of the Gaussian mixture distributions. The effectiveness of the proposed framework can also not be guaranteed for the entirety of industrial domains since it was only tested on one specific dataset from a particular field. In order to address these challenges, future research should focus on follow-up studies to improve the framework in respect to the algorithmic initialization and study the applicability of the underlying method on similar use cases in the industrial domain. Additionally, comparing the methodology to other approaches in further studies using a comparable data set will increase the transparency and understanding of its value.

Author Contributions: Conceptualization, V.K., T.W.; data curation, V.K.; formal analysis, V.K.; funding acquisition, G.L., T.W.; methodology, M.C.M., T.W.; project administration, T.W.; supervision, M.C.M., G.L., T.W.; writing—original draft, V.K.; writing—review and editing, M.C.M., G.L., T.W. All authors have read and agreed to the published version of the manuscript.

Funding: This work was supported by the J. Wayne & Kathy Richards Faculty Fellowship in Engineering at West Virginia University and the DIGIMAN4.0 project ("DIGItal MANufacturing Technologies for Zero-defect Industry 4.0 Production", http://www.digiman4-0.mek.dtu.dk/). DIGIMAN4.0 is a European Training Network supported by Horizon 2020, the EU Framework Programme for Research and Innovation (Project ID: 814225).

Acknowledgments: The authors express their gratitude to the anonymous reviewers for their valuable comments that helped us to improve the paper significantly.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Sinha, A.; Bernardes, E.; Calderon, R.; Wuest, T. Digital Supply Networks: Transform Your Supply Chain and Gain Competitive Advantage with Disruptive Technology and Reimagined Processes; McGraw-Hill Education: New York, NY, USA, 2020.
2. Özkoç, E.E. Clustering of Time-Series Data. In Data Mining—Methods, Applications and Systems; IntechOpen: London, UK, 2020. [CrossRef]
3. Hallac, D.; Vare, S.; Boyd, S.; Leskovec, J. Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'17), Association for Computing Machinery, New York, NY, USA, 23–27 August 2017; pp. 215–223. [CrossRef]

4. Fontes, C.H.; Pereira, O. Pattern recognition in multivariate time series—A case study applied to fault detection in a gas turbine. Eng. Appl. Artif. Intell. 2016, 49, 10–18. [CrossRef]
5. Truong, C.; Oudre, L.; Vayatis, N. Selective Review of Offline Change Point Detection Methods. Signal Process. 2020, 167, 107299. [CrossRef]
6. Keogh, E.; Chu, S.; Hart, D.; Pazzani, M. An online algorithm for segmenting time series. In Proceedings of the 2001 IEEE International Conference on Data Mining, San Jose, CA, USA, 29 November–2 December 2001; pp. 289–296. [CrossRef]
7. Aghabozorgi, S.; Shirkhorshidi, A.S.; Wah, T.Y. Time-series clustering—A decade review. Inf. Syst. 2015, 53, 16–38. [CrossRef]
8. Izakian, H.; Pedrycz, W.; Jamal, I. Fuzzy clustering of time series data using dynamic time warping distance. Eng. Appl. Artif. Intell. 2015, 39, 235–244. [CrossRef]
9. Glaser, B.; Strauss, A. The Discovery of Grounded Theory: Strategies for Qualitative Research; Aldine: Chicago, IL, USA, 1967.
10. Wolfswinkel, J.; Furtmueller, E.; Wilderom, C. Using Grounded Theory as a Method for Rigorously Reviewing Literature. Eur. J. Inf. Syst. 2013, 22, 45–55. [CrossRef]
11. Mujumdar, A.; Hasan, M. Drying of Polymers; Taylor & Francis Group, LLC: Abingdon, UK, 2016. [CrossRef]
12. Munjal, S.; Kao, C. Mathematical model and experimental investigation of polycarbonate pellet drying. Polym. Eng. Sci. 1990, 30, 1352–1360. [CrossRef]
13. Rosato, D.V. Processing plastic material. In Extruding Plastics; Springer: Boston, MA, USA, 1998. [CrossRef]
14. National Research Council (USA). Committee on Polymer Science and Engineering, Polymer Science and Engineering: The Shifting Research Frontiers; National Academies Press: Washington, DC, USA, 1994. [CrossRef]
15. Keogh, E.J.; Kasetty, S. On the need for time series data mining benchmarks: A survey and empirical demonstration. In Proceedings of the International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD), Edmonton, AB, Canada, 23–26 July 2002; pp. 23–26.
16. Liao, T.W. Clustering of time series data—A survey. Pattern Recognit. 2005, 38, 1857–1874. [CrossRef]
17. Fu, T. A review on time series data mining. Eng. Appl. Artif. Intell. 2011, 24, 164–181. [CrossRef]
18. Spiegel, S.; Gaebler, J.; Lommatzsch, A.; De Luca, E.; Albayrak, S. Pattern recognition and classification for multivariate time series. In Proceedings of the Fifth International Workshop on Knowledge Discovery from Sensor Data (SensorKDD'11), Association for Computing Machinery, San Diego, CA, USA, 21 August 2011; pp. 34–42. [CrossRef]
19. Garg, R.; Aggarwal, H.; Centobelli, P.; Cerchione, R. Extracting Knowledge from Big Data for Sustainability: A Comparison of Machine Learning Techniques. Sustainability 2019, 11, 6669. [CrossRef]
20. Sfetsos, A.; Siriopoulos, C. Time series forecasting with a hybrid clustering scheme and pattern recognition. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2004, 34, 399–405. [CrossRef]
21. Harchaoui, Z.; Lévy-Leduc, C. Multiple Change-Point Estimation with a Total Variation Penalty. J. Am. Stat. Assoc. 2010, 105, 1480–1493. [CrossRef]
22. Haynes, K.; Fearnhead, P.; Eckley, I.A. A computationally efficient nonparametric approach for changepoint detection. Stat. Comput. 2017, 27, 1293–1305. [CrossRef] [PubMed]
23. Killick, R.; Fearnhead, P.; Eckley, I.A. Optimal Detection of Changepoints with a Linear Computational Cost. J. Am. Stat. Assoc. 2012, 107, 1590–1598. [CrossRef]
24. Idé, T.; Khandelwal, A.; Kalagnanam, J. Sparse Gaussian Markov Random Field Mixtures for Anomaly Detection. In Proceedings of the 2016 IEEE 16th International Conference on Data Mining (ICDM), Barcelona, Spain, 12–15 December 2016; pp. 955–960. [CrossRef]
25. Singh, R.; Pal, B.C.; Jabr, R.A. Statistical Representation of Distribution System Loads Using Gaussian Mixture Model. IEEE Trans. Power Syst. 2010, 25, 29–37. [CrossRef]
26. Forney, G.D. The Viterbi algorithm. Proc. IEEE 1973, 61, 268–278. [CrossRef]
27. Vershynin, R. How Close is the Sample Covariance Matrix to the Actual Covariance Matrix? J. Theor. Probab. 2012, 25, 655–686. [CrossRef]

28. Chang, T.; Hong, M.; Liao, W.; Wang, X. Asynchronous Distributed ADMM for Large-Scale Optimization—Part I: Algorithm and Convergence Analysis. IEEE Trans. Signal Process. 2016, 64, 3118–3130. [CrossRef]
29. Karimi, D.; Salcudean, S.E. Reducing the Hausdorff Distance in Medical Image Segmentation with Convolutional Neural Networks. IEEE Trans. Med. Imaging 2020, 39, 499–513. [CrossRef] [PubMed]

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).