
Dual Blind Denoising Autoencoders for Industrial Process Data Filtering

Saúl Langarica, Student Member, IEEE, and Felipe Núñez, Member, IEEE

This work was supported by ANID under grant ANID PIA ACT192013. S. Langarica and F. Núñez are with the Department of Electrical Engineering, Pontificia Universidad Católica de Chile, Av. Vicuña Mackenna 4860, Santiago, Chile 7820436. E-mail: [email protected], [email protected].

Abstract—In an industrial internet setting, ensuring trustworthiness of process data is a must when data-driven algorithms operate in the upper layers of the control system. Unfortunately, the commonplace in an industrial setting is to find time series heavily corrupted by noise and outliers. Typical methods for cleaning the data include the use of smoothing filters or model-based observers. In this work, a purely data-driven learning-based approach is proposed, based on a combination of convolutional and recurrent neural networks in an autoencoder configuration. Results show that the proposed technique outperforms classical methods in both a simulated example and an application using real process data from an industrial facility.

Index Terms—Autoencoders, Process control.

I. INTRODUCTION

The incorporation of industrial Internet of Things technologies to modern industrial facilities allows the real-time acquisition of an enormous amount of process data, typically in the form of time series, which represents an opportunity to improve performance by using data-driven algorithms for supervision, modeling and control [1]–[3].

Data-driven techniques, such as statistical or machine learning algorithms, are capable of dealing with the multivariate and intricate nature of industrial processes; however, they rely on the consistency and integrity of the data to work properly [3]. This imposes a limitation on the online application of these algorithms in real facilities, since process data is often highly corrupted with outliers and noise, caused by multiple factors such as environmental disturbances, human interventions, and faulty sensors. Consequently, there is a lack of data-driven applications operating online, and the vast majority of successful implementations use offline preprocessed data, simulations, or generate the database in a controlled environment, as in pilot-scale deployments in laboratories [4].

A typical approach for dealing with corrupted process data is the use of smoothing filters, like simple discrete low-pass filters, the Savitzky-Golay (SG) filter [5] or exponential moving average (EMA) filters [6]. The main drawback of these techniques is their univariate nature; hence, the redundancy and correlations among variables typically present in industrial processes are not exploited for denoising. Multivariate denoisers, which can exploit cross-correlations between signals, are a natural improvement over univariate filters. Approaches like Kalman [7] or particle filters [8] are the flagship techniques; however, they require the selection of a suitable model and a-priori estimation of parameters, such as the covariance matrices in the Kalman filter, which are critical for a good performance.

A different approach for denoising is the use of transforms, like wavelets [9] or Gabor [10], which exploit statistical properties of the noise so the coefficients can be thresholded in the transformed domain to preserve only the high-valued coefficients, and then, by applying the inverse transform, obtain a cleaner signal. A limitation of these approaches is the difficulty of knowing a priori the best basis for representing the signal; and without knowledge of the noise nature, as is the case in real process data, it is hard to determine where to threshold the transformed data.

Learning-based denoising algorithms, like principal component analysis (PCA) [11], kernel PCA [12] or dictionary learning [13], solve some of the problems that fixed transforms have, by learning a suitable representation of the data in a transformed space. These approaches are also multivariate in nature, so the spatial correlations between signals are exploited; nevertheless, these algorithms were designed for static data, e.g., images, and hence important information from temporal correlations is not exploited at all.

Recently, denoising autoencoders (DAEs) [14], [15] have emerged as a suitable learning-based denoising technique that is multivariate in nature and is capable of learning complex nonlinear structures and relationships between variables. This represents a great advantage over traditional learning-based techniques when dealing with highly nonlinear data. Originally, DAEs emerged for image denoising, but the use of recurrent neural networks has allowed their application to denoising dynamical data, such as audio and video [16], [17]. However, unlike PCA, dictionary learning or fixed-transform techniques, DAEs are not blind, in the sense that for learning to denoise a signal, the clean version of the signal (the target) has to be known beforehand. In addition, information about the characteristics of the noise affecting the process is required to create realistic training examples. This is a great limitation for the use of DAEs in real-world applications where the clean version of the signal and the noise characteristics are unknown.

In this work, we propose the use of a dual blind denoising autoencoder (DBDAE) for multivariate time series denoising, which preserves all the advantages of DAEs and eliminates the necessity of knowing beforehand the noise characteristics and the clean version of the signals. The network structure is designed to exploit both spatial and temporal correlations of the input data by combining recurrent and convolutional encoder networks. This dual encoding allows the network to reconstruct missing or faulty signals, adding robustness to online applications.

Because of the predictive capabilities of the network, the phase delay is minimal compared to traditional techniques like low-pass filters, which is a critical advantage for real-time applications built on top, such as feedback controllers. Consequently, the main contributions of this work are: i) the formulation of a new blind denoising technique based on DAEs that eliminates the necessity of knowing a-priori information of the signals; and ii) a novel autoencoder architecture that enables the network to exploit both temporal and spatial correlations, which allows it to reconstruct faulty signals in addition to the denoising capabilities.

The rest of this paper is organized as follows. Preliminaries of data filtering in industrial processes are given in Section II. In Section III the basics of autoencoders and DAEs are presented. Denoising and reconstruction based on the proposed DBDAE is presented and applied to a simulated dynamical system in Section IV. Section V shows an implementation of the DBDAE in a real industrial process. Finally, conclusions and directions for future research are presented in Section VI.

II. PRELIMINARIES

A. Notation and Basic Definitions

In this work, R denotes the real numbers, Z≥0 the non-negative integers, R^n the Euclidean space of dimension n, and R^{n×m} the set of n × m matrices with real coefficients. For a, b ∈ Z≥0 we use [a; b] to denote their closed interval in Z. For a vector v ∈ R^n, v^i denotes its ith component. For a matrix A ∈ R^{n×m}, A_i denotes its ith column and A^i its ith row. For an n-dimensional real-valued sequence α : Z≥0 → R^n, α(t) denotes its tth element, and α_[a;b] denotes its restriction to the interval [a; b], i.e., a sub-sequence. For a sub-sequence α_[a;b], M(α_[a;b]) ∈ R^{n×(b−a+1)} is a matrix whose ith column is equal to α(a+i−1), with i ∈ [1; b−a+1]. The same notation applies for an n-dimensional finite-length sequence α : [0; T̄] → R^n, with the understanding that for a sub-sequence α_[a;b], [a; b] ⊆ [0; T̄] must hold. Given an N-dimensional sequence α, we define its T-depth window as a matrix-valued sequence β : Z≥0 → R^{N×T}, where β(t) = M(α_[t−T+1;t]).
B. Typical approaches for process data filtering

When dealing with real industrial time series, denoising methods can be classified into model-based and data-driven. In practice, however, model-based denoisers like Kalman or particle filters cannot be applied in the vast majority of cases, since obtaining an accurate model of these processes is many times unfeasible. On the other hand, commonly used data-driven techniques, while much simpler, are still effective; hence, they are preferred in real implementations.

According to [18], the most used data-driven methods for denoising industrial time series are low-pass digital filters, such as the EMA, and the SG filter. Although the EMA filter is quite simple, large delays can be introduced to the denoised signal as its convergence is exponential. For a given sequence y, with y(t) ∈ R, the EMA filter constructs the smoothed version as

ŷ(t) = α ŷ(t − 1) + (1 − α) y(t),   (1)

where ŷ(t) is the tth smoothed output of the EMA filter and α ∈ [0, 1] is the filter weight determining the importance given to the past output.

The SG filter outperforms common low-pass filters in preserving useful high-frequency information and in preventing extra delays [18]. The SG filter calculates the smoothed output ŷ(t) using a local discrete convolution over the 2M + 1 samples of the sub-sequence y_[t−M;t+M], where the convolution coefficients are derived from a least-squares polynomial smoothing.

Because EMA and SG filters are the most used denoising techniques in industry, they will be used as a baseline for comparison with our DBDAE network.
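For concreteness, a minimal sketch of both baselines is shown below, using NumPy and SciPy's savgol_filter; the test signal and parameter values are illustrative only.

```python
import numpy as np
from scipy.signal import savgol_filter

def ema(y, alpha=0.9):
    """Exponential moving average, eq. (1): y_hat(t) = alpha*y_hat(t-1) + (1-alpha)*y(t)."""
    y_hat = np.empty(len(y))
    y_hat[0] = y[0]
    for t in range(1, len(y)):
        y_hat[t] = alpha * y_hat[t - 1] + (1 - alpha) * y[t]
    return y_hat

t = np.linspace(0, 10, 1000)
y = np.sin(t) + 0.2 * np.random.randn(t.size)   # noisy test signal

y_ema = ema(y, alpha=0.9)
# SG filter: local least-squares polynomial fit over 2M+1 = 21 samples
y_sg = savgol_filter(y, window_length=21, polyorder=3)
```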

III. AUTOENCODERS

A. Background

Autoencoders (AEs) [19] are unsupervised neural networks trained to reconstruct their inputs at the output layer, passing through an intermediate layer normally of lower dimension. AEs can be regarded as a nonlinear generalization of PCA aiming to encode input data in an intermediate lower-dimensional representation, which preserves most of the information in the inputs. This intermediate representation is known as the latent space of the network.

Formally, for a given sequence y, the AE maps an input vector y(t) ∈ R^n to a latent representation z ∈ R^m, with m < n. This mapping is done by a function f_{θE}, which in the simplest case is a linear layer with σ as an arbitrary activation function, namely,

z = f_{θE}(y(t)) = σ(W_{θE} y(t) + b_{θE}),   (2)

where W_{θE} ∈ R^{m×n} and b_{θE} ∈ R^m are trainable parameters of the network. In more complex approaches, f_{θE} can be chosen to be any type of layer, even a stack of multiple layers. After the encoding, the latent vector is mapped back to the input space by a second function g_{θD}, known as the decoder:

ŷ(t) = g_{θD}(z) = σ(W_{θD} z + b_{θD}),   (3)

where W_{θD} ∈ R^{n×m} and b_{θD} ∈ R^n. In training, the parameters of the AE are found by solving the following optimization problem:

θ* = argmin_θ ||ŷ(t) − y(t)||²,   (4)

where θ accounts for all the trainable parameters. Figure 1 illustrates a simple AE network.

Fig. 1. Simple AE architecture where f_{θE} encodes the input data to a latent representation z and then g_{θD} decodes z to reconstruct the input.
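As a concrete illustration of (2)–(4), the following is a minimal single-layer AE in PyTorch; the dimensions n = 8 and m = 3 and the sigmoid activations are arbitrary choices for this sketch.

```python
import torch
import torch.nn as nn

class SimpleAE(nn.Module):
    def __init__(self, n, m):
        super().__init__()
        self.encoder = nn.Linear(n, m)   # f_{theta_E}, eq. (2)
        self.decoder = nn.Linear(m, n)   # g_{theta_D}, eq. (3)

    def forward(self, y):
        z = torch.sigmoid(self.encoder(y))       # latent representation
        y_hat = torch.sigmoid(self.decoder(z))   # reconstruction
        return y_hat, z

ae = SimpleAE(n=8, m=3)
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
y = torch.rand(64, 8)                    # a batch of input vectors
y_hat, _ = ae(y)
loss = ((y_hat - y) ** 2).mean()         # reconstruction error, eq. (4)
opt.zero_grad()
loss.backward()
opt.step()
```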

B. Denoising Autoencoder

When the input of the AE is corrupted with noise and the target output is clean, the resulting latent space is more robust and learns richer features [20]. Additionally, the network also learns how to denoise corrupted inputs. This has led to many denoising applications in static data such as images [21] and, with the use of recurrent AEs [16], this technique has also been effectively applied to dynamic data [17]. However, DAEs work under the assumption that the clean version of the input data is available, as well as information about the noise. As mentioned before, this limits the applicability of DAEs to real-world data where, in general, none of these requirements are fulfilled.

IV. PROCESS DATA FILTERING BASED ON DUAL BLIND DENOISING AUTOENCODERS

Blind denoising autoencoders (BDAEs) were first introduced in [22]. This particular type of AE enables denoising a signal with only a few assumptions on the noise and without access to the clean version of the signal, which makes them an appealing alternative for denoising real-world process data. However, in [22] this technique was applied only to static data. In the following, we propose a technique based on BDAEs for denoising multivariate time series, built on a dual convolutional-recurrent architecture; the proposed technique is also capable of masking faulty sensors, making it ideal for online processing of real industrial data.

A. General setup

Consider an underlying dynamical system generating data, which is sampled periodically by a set of sensors with period τ. From the sensors' perspective, the plant can be modeled as a discrete-time dynamical system given by

x(n + 1) = F_n(x(n), w(n)),   (5)

where n denotes the time step; x(n) ∈ R^W, with W the order of the system, denotes the current value of the internal state; x(n + 1) is the future value of the state; w(n) denotes state-space noise; and F_n is generally a smooth nonlinear mapping governing the dynamics of the system. The measurement (observation) process is given by

ỹ(n) = C_n(x(n), v(n)),   (6)

where ỹ(n) denotes the set of variables measured by the sensors, C_n is the output mapping, and v(n) represents measurement noise. For a noise-free condition, we denote the measured variables at time step n as

y(n) = C_n(x(n), 0).   (7)

Under this setup, the measurement process generates a sequence ỹ : Z≥0 → R^N, where without loss of generality N ≠ W; therefore,

ỹ(n) = [ỹ¹(n), ỹ²(n), ..., ỹᴺ(n)]ᵀ ∈ R^N,   (8)

where each ỹⁱ(n) represents a real-valued measured variable of the underlying system at time instant n.
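To make the setup concrete, the snippet below simulates a toy instance of (5)–(6); the maps F and C, the dimensions, and the noise levels are arbitrary placeholders, not the process considered later in the paper.

```python
import numpy as np

def F(x, w):
    # An arbitrary smooth nonlinear dynamics, eq. (5)
    return 0.9 * x + 0.1 * np.tanh(x) + w

def C(x, v):
    # Output map, eq. (6): here N = 2 measurements of a scalar state (W = 1)
    return np.array([x, x ** 2]) + v

rng = np.random.default_rng(0)
steps, x = 500, 0.5
y_tilde = np.zeros((2, steps))
for n in range(steps):
    x = F(x, rng.normal(scale=0.01))                       # state noise w(n)
    y_tilde[:, n] = C(x, rng.normal(scale=0.05, size=2))   # meas. noise v(n)
```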
B. Problem formulation

Let Ỹ be the T-depth window of ỹ. Hence,

Ỹ(n) = [ỹ(n−T+1), ỹ(n−T+2), ..., ỹ(n)] ∈ R^{N×T}.   (9)

In this setting, Ỹ_k(n) ∈ R^N is a column vector containing the N measurements at time instant n − T + k, and Ỹ^k(n) ∈ R^T is a row vector containing the last T samples of variable k. The AE will use Ỹ both for denoising and for reconstruction of faulty signals. Since the AE is itself a dynamical system that processes the data sequentially (due to the use of recurrent neural networks) to accomplish these two tasks, two timescales should be considered in the derivations. The process time refers to the time scale at which (5) evolves and is indexed by n. On the other hand, the AE's time refers to the internal time of the AE, which uses the elements of Ỹ(n) that were obtained at the process time scale but are processed at the computer processor's speed. These elements will be indexed by j.

Since the objective of the AE is to operate in real time, the AE's timescale has to be fast enough to process the measurements in Ỹ(n) before another set of measurements captured at the process timescale comes in. In this setup, we have two dynamical systems at work: the slow system (the process) and the fast system (the AE), which at each process time step gets triggered and iterates itself a larger number of steps that depends on the size of Ỹ(n) and the network architecture.

The problem to be solved by the DBDAE is two-fold:

1) Denoise the current measurement ỹ(n) given only the noisy measurements contained in Ỹ(n). That is to say, approximate as much as possible

y(n) = C_n(x(n), 0).   (10)

Unlike DAEs, in this case we do not have access to y(n); therefore, the denoising of input signals must be done in a blind manner.

2) Reconstruct faulty signals. In this case Ỹ(n) contains a row Ỹ^k(n) ∈ R^T of sensor measurements that is labeled as faulty, possibly by a previous fault-detection algorithm:

Ỹ^k(n) = [l, l, ..., l] ∈ R^T.   (11)

Here l is an arbitrary label indicating that sensor k is faulty. The DBDAE has to learn to recognize this label and exploit both temporal and spatial correlations among the other non-faulty measurements to approximate as much as possible

y(n) = C_n(x(n), 0).   (12)

To recover y(n) from Ỹ(n), possibly masking one or more signals in case of faulty sensors, the DBDAE has to deal mainly with three problems. The first one is to obtain rich features in the latent space, from which reconstruction and denoising can be performed concurrently. The second is to find a mechanism to enhance the network to reconstruct faulty input signals. The final one is to find a criterion to constrain the autoencoder to learn only valuable information from the inputs, leaving out noise information.
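The T-depth window (9) is simply a sliding slice of the measurement sequence; a NumPy sketch follows (names and sizes are illustrative):

```python
import numpy as np

def t_depth_window(y_tilde, n, T):
    """Y~(n) = [y~(n-T+1), ..., y~(n)] in R^{N x T}, eq. (9)."""
    return y_tilde[:, n - T + 1 : n + 1]

N, length, T = 4, 1000, 50
y_tilde = np.random.randn(N, length)     # N noisy sensor channels
Y = t_depth_window(y_tilde, n=200, T=T)  # shape (N, T)
col = Y[:, 9]     # Y~_10(n): all N sensors at time instant n - T + 10
row = Y[2, :]     # Y~^3(n): last T samples of sensor k = 3
```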

1) Obtaining rich latent features: Obtaining rich features for multivariate time series is a challenging task, since not only spatial information matters, as in static data like images, but also time-dependent information needs to be taken into account. Therefore, to obtain rich features, both temporal correlations (within-sensor information) and spatial correlations (between-sensor information) must be present. To this end, we propose to use a dual autoencoder structure with two encoders, a convolutional and a recurrent encoder, and one recurrent decoder. In this case, at each process time step n, Ỹ(n) is processed independently in each encoder and then, by adding their encodings, rich latent features can be obtained.

The recurrent encoder generates its latent representation h_R^{L_R}, a Q_{L_R}-dimensional finite-length sequence of length T, by feeding the network iteratively with the elements of Ỹ(n) and then recursively feeding back the internal state of the network, namely,

h_R^{L_R}(j) = f_RE(Ỹ_j(n), h_R^{L_R}(j−1)),   (13)

where f_RE represents the recurrent network, with L_R layers and Q_l neurons in each layer l, which in our case is composed of Gated Recurrent Unit (GRU) layers [23]. The operations at each GRU layer l are given by

z^l(j) = σ(W_z^l p^l(j) + U_z^l h_R^l(j−1) + b_z^l)
r^l(j) = σ(W_r^l p^l(j) + U_r^l h_R^l(j−1) + b_r^l)
n^l(j) = tanh(W_n^l p^l(j) + r^l(j) ◦ (U_n^l h_R^l(j−1) + b_n^l))
h_R^l(j) = (1 − z^l(j)) ◦ n^l(j) + z^l(j) ◦ h_R^l(j−1),   (14)

where h_R^l(j) ∈ R^{Q_l} is the hidden state of layer l at AE's time j; r^l, z^l and n^l are Q_l-dimensional finite-length sequences representing the values of the reset, update and new gates of layer l, respectively; σ and tanh are the sigmoid and hyperbolic tangent activation functions; ◦ is the Hadamard product; and p^l(j) is the input of layer l at AE's time j.
For the first layer (l = 1) the input p^1(j) is given by the input of the network, Ỹ_j(n), and for the subsequent layers the input is given by h_R^{l−1}(j), the current hidden state of the previous layer.

Finally, the latent representation of the recurrent encoder is h_R^{L_R}(T) ∈ R^{Q_{L_R}}, which is the last hidden state of the last layer. The final output of the recurrent encoder is obtained by projecting h_R^{L_R}(T) with a linear layer S1 to obtain the dimension of the latent space, Q, namely,

h_R(n) = W_{S1} h_R^{L_R}(T) + b_{S1},   (15)

where W_{S1} ∈ R^{Q×Q_{L_R}} and b_{S1} ∈ R^Q are trainable parameters of S1. It should be noted that h_R is indexed by n, the process time, despite its calculation involving T iterations of the underlying recurrent network at a faster time scale. This latent vector is forced to contain information about the whole input sequence and, therefore, is able to capture the temporal information in the signals.
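A minimal PyTorch sketch of the recurrent encoder (13)–(15) follows; the layer sizes are illustrative, and the stacked-GRU recursions of (14) are handled internally by nn.GRU.

```python
import torch
import torch.nn as nn

class RecurrentEncoder(nn.Module):
    def __init__(self, N, Q_LR, L_R, Q):
        super().__init__()
        # f_RE: stacked GRU layers implementing eqs. (13)-(14)
        self.gru = nn.GRU(input_size=N, hidden_size=Q_LR,
                          num_layers=L_R, batch_first=True)
        self.S1 = nn.Linear(Q_LR, Q)    # projection to latent dim., eq. (15)

    def forward(self, Y):               # Y: (batch, T, N), columns of Y~(n)
        _, h = self.gru(Y)              # h: (L_R, batch, Q_LR)
        return self.S1(h[-1])           # h_R(n) from the last hidden state

enc_r = RecurrentEncoder(N=8, Q_LR=64, L_R=2, Q=32)
h_R = enc_r(torch.randn(16, 50, 8))     # latent vectors, shape (16, 32)
```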

In the case of the convolutional encoder, one-dimensional convolutions are applied to the same input Ỹ(n). Each of the N different measurement series Ỹ^i(n) that compose the input at time step n is modeled as a different channel of the convolution, and by doing so, spatial correlations can be captured. The convolutional encoder is given by

h_C^{L_C}(n) = f_C(Ỹ(n)),   (16)

where f_C represents the convolutional encoder network with L_C layers, where the output of each layer l is composed of K_l concatenated output channels C_{out_k}^l(n) ∈ R^{z_l}, k ∈ [1; K_l], given I input channels. Each of the resulting output channels for layer l at process time step n is given by

C_{out_k}^l(n) = b_k^l + Σ_{i=1}^{I} W_k^l(i) ⋆ p_i^l(n),   (17)

where b_k^l is a particular bias vector for channel k at layer l, W_k^l(i) is the ith channel of the I-channel kernel corresponding to output channel k, p_i^l(n) is the ith channel of the input of layer l, and ⋆ is the cross-correlation operator. For the first layer (l = 1) the input p_i^1(n) is given by the sequence of measurements of the ith sensor, Ỹ^i(n), while for the subsequent layers (l > 1) p_i^l(n) is given by an output channel of the previous layer, C_{out_i}^{l−1}(n). It is worth mentioning that the dimension of each output channel, z_l, depends both on the size of the input channels and on the parameters of the convolution, such as the padding, kernel size, stride and dilation.

Finally, after concatenating all the output channels of the last layer L_C, the convolutional encoder produces its own latent representation h_C^{L_C}(n) ∈ R^{z_{L_C}×K_{L_C}}, which is projected by two linear layers to match the requirements of the latent space dimension. The first linear layer S2 transforms h_C^{L_C}(n) to a vector ĥ_C(n) ∈ R^{K_{L_C}}, and the second linear layer S3 transforms this resulting vector to the required latent space dimension Q:

ĥ_C(n) = (h_C^{L_C}(n))ᵀ W_{S2} + b_{S2}
h_C(n) = W_{S3} ĥ_C(n) + b_{S3},   (18)

where W_{S2} ∈ R^{z_{L_C}}, b_{S2} ∈ R^{K_{L_C}}, W_{S3} ∈ R^{Q×K_{L_C}} and b_{S3} ∈ R^Q are trainable parameters of S2 and S3.

The final latent representation of the input at time step n is just the sum of the latent representations of both encoders:

h_E(n) = h_R(n) + h_C(n).   (19)
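A companion sketch of the convolutional branch (16)–(18) and the latent fusion (19); the number of layers, kernel sizes and channel counts are arbitrary choices here, not those of the paper.

```python
import torch
import torch.nn as nn

class ConvEncoder(nn.Module):
    def __init__(self, N, K, z_LC, Q):
        super().__init__()
        # f_C: 1-D convolutions over the N sensor channels, eqs. (16)-(17)
        self.conv = nn.Sequential(
            nn.Conv1d(N, K, kernel_size=5), nn.ReLU(),
            nn.Conv1d(K, K, kernel_size=5), nn.ReLU(),
        )
        self.S2 = nn.Linear(z_LC, 1)    # collapse the time axis, eq. (18)
        self.S3 = nn.Linear(K, Q)       # project to latent dimension Q

    def forward(self, Y):               # Y: (batch, N, T)
        c = self.conv(Y)                # (batch, K, z_LC)
        h_hat = self.S2(c).squeeze(-1)  # (batch, K)
        return self.S3(h_hat)           # h_C(n) in R^Q

T = 50
enc_c = ConvEncoder(N=8, K=16, z_LC=T - 8, Q=32)  # z_LC = T - 8 for two k=5 convs
h_C = enc_c(torch.randn(16, 8, T))
h_R = torch.randn(16, 32)    # stands in for the recurrent branch output
h_E = h_R + h_C              # latent fusion, eq. (19)
```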

For the decoding phase, we first project h_E(n) with a linear layer S4 to extract useful features for decoding:

ĥ_E(n) = W_{S4} h_E(n) + b_{S4},   (20)

where W_{S4} ∈ R^{Q×Q} and b_{S4} ∈ R^Q. Then, another recurrent GRU network f_RD, with L_D layers and Q̄_l neurons in each layer l, receives the resulting vector from this projection as its initial state and begins to decode the information. The decoder hidden state, a Q̄_{L_D}-dimensional finite-length sequence h_D^{L_D} of length T, is generated as

h_D^{L_D}(j) = f_RD(p_j(n), h_D^{L_D}(j−1)),   (21)

where p_j(n) ∈ R^N is the input for each decoding step of the network. This input can be either the delayed target Ỹ_{j−1}(n) or the previous network estimate Ŷ_{j−1}(n). The network estimates are calculated by projecting h_D^{L_D}(j) using a linear layer S5:

Ŷ_j(n) = W_{S5} h_D^{L_D}(j) + b_{S5},   (22)

where W_{S5} ∈ R^{N×Q̄_{L_D}} and b_{S5} ∈ R^N are trainable parameters. Hence, the final output of the DBDAE for process time n corresponds to Ŷ_T(n). It should be noted that the decoder starts once h_E(n) is available, which involves T previous iterations of the encoder at the AE's processing rate. Therefore, care should be taken when interpreting the local time of the decoder.

Even though we have access to the entire decoder's target, Ỹ(n), during training and testing, and exposure bias [24] is not a problem, we found that gradually giving the network its own predictions Ŷ_j(n) yields better results. This approach is called scheduled sampling [25] and consists in giving the network the target at j − 1 as input for predicting the target at j in early stages of the training phase and gradually, with an increasing probability p, feeding the network its own prediction at j − 1 for predicting the target at j. This gradual shift from using the target as the input in each step to using the network's own predictions improves the stability of the network, especially when the decoded sequence is long. On the other hand, a value p = 1, i.e., the network relying solely on its own predictions, forces h_E(n) to retain all the information of the signals, since the entire sequence must be decoded from it. In our case p was increased linearly as

p = min(1, k + cE),   (23)

where k and c are parameters and E is the number of training iterations.

Finally, the loss function used for training is the reconstruction error between the autoencoder's prediction for the whole sequence, Ŷ(n), and the noisy measurements Ỹ(n):

Loss = (1/N)(1/T) Σ_{k=1}^{N} Σ_{j=1}^{T} (Ŷ_j^k(n) − Ỹ_j^k(n))²   (24)

An illustration of the DBDAE is shown in Figure 2.
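A sketch of the decoder (20)–(22) with scheduled sampling; the initial input token and the detach on the fed-back estimate are choices of this sketch rather than details specified in the paper.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    def __init__(self, N, Q, Q_D):
        super().__init__()
        self.S4 = nn.Linear(Q, Q_D)     # feature projection, eq. (20)
        self.cell = nn.GRUCell(N, Q_D)  # f_RD, eq. (21)
        self.S5 = nn.Linear(Q_D, N)     # output projection, eq. (22)

    def forward(self, h_E, target, p):  # target Y~(n): (batch, T, N)
        h = self.S4(h_E)                # initial decoder state
        y_prev = target[:, 0]           # first decoding input (sketch choice)
        outputs = []
        for j in range(target.size(1)):
            h = self.cell(y_prev, h)
            y_hat = self.S5(h)
            outputs.append(y_hat)
            # Scheduled sampling: with probability p feed back the network's
            # own estimate; otherwise use the delayed (noisy) target.
            own = torch.rand(1).item() < p
            y_prev = y_hat.detach() if own else target[:, j]
        return torch.stack(outputs, dim=1)   # Y_hat(n): (batch, T, N)

dec = Decoder(N=8, Q=32, Q_D=64)
h_E = torch.randn(16, 32)               # latent vectors from the encoders
target = torch.randn(16, 50, 8)         # the noisy window Y~(n)
Y_hat = dec(h_E, target, p=0.5)
loss = ((Y_hat - target) ** 2).mean()   # reconstruction loss, eq. (24)
```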
2) Enhancing the network to reconstruct faulty input signals: To reconstruct faulty signals, in addition to requiring a latent space with rich latent features, a simple yet effective method is proposed, which consists in randomly choosing an input measurement sequence Ỹ^i(n) in each training iteration and replacing all its values by an arbitrary label l that will indicate that the signal is faulty:

Ỹ^i(n) = [l, l, ..., l] ∈ R^T,   (25)

where i ∈ [1; N] is randomly chosen in each iteration. Then, similar to common denoising autoencoders, we enforce the network to reconstruct the original input without corrupting any of its values. Additionally, we can weight the loss function (usually MSE for autoencoders) to emphasize the reconstruction of the faulty signal. It is worth mentioning that, for stability reasons, and similar to the scheduled sampling technique, at the beginning of the training phase the probability of choosing one of the signals as faulty is small and it is then gradually increased.
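A minimal sketch of the masking step (25); the label value l = −1 is an arbitrary choice for illustration.

```python
import numpy as np

def mask_random_sensor(Y, label=-1.0, rng=np.random.default_rng()):
    """Replace one randomly chosen row of Y~(n) by the fault label, eq. (25)."""
    Y_masked = Y.copy()
    i = rng.integers(Y.shape[0])    # sensor index, i in [1; N]
    Y_masked[i, :] = label          # Y~^i(n) = [l, l, ..., l]
    return Y_masked, i

Y = np.random.randn(8, 50)          # an (N, T) window
Y_masked, i = mask_random_sensor(Y)
# The network is then trained to output the original, unmasked Y, optionally
# with a larger loss weight on row i.
```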

Fig. 2. Dual blind denoising architecture composed of both a recurrent and a convolutional encoder that combine their resulting output features to form the DBDAE's latent vectors. The recurrent decoder is able to decode this latent vector and reconstruct a clean version of the input signals.

3) Blind denoising by constraining reconstruction: Similar to related works that use AEs for signal reconstruction, a great effort is put into obtaining rich features that contain the most important information of the signals, since the richer the features, the better the reconstruction. However, unlike AEs, we do not want a perfect reconstruction of the input, because that would mean reconstructing noise as well. Therefore, a criterion is needed to determine how rich the latent space features must be to capture only the important information of the signals, leaving out noise.

Intuitively, this could be achieved by using a small latent space dimension, so that the network is forced to retain only the most important information of the signal. However, it is not clear what the dimension should be. This of course depends on the application: if an excessively small latent space dimension is selected, the network will not be able to learn complex patterns and reconstruction will be poor. On the other side, if an excessively big dimension is selected, the network will learn to reconstruct noise and, again, the objective will not be achieved.

A possible option is to enforce sparsity in the network to improve denoising performance, as pointed out in [26]. This can be achieved by adding an L1 regularization term to the loss function [27]; however, it is unclear how strong the regularization should be. Again, if regularization is too strong, the network will not be able to learn complex patterns, and if too weak, the network will learn to reconstruct noise.

In order to overcome these uncertainties, we propose a novel heuristic to restrict the network to reconstruct only the input data, leaving out noise, while retaining complex patterns. This method is based on analyzing the evolution of the latent space structure during training using PCA.

PCA [11] is a linear technique that uses an orthogonal projection of a set of vectors into a set of linearly uncorrelated vectors in the directions where variance is maximized. It is usually used as a dimensionality reduction technique; however, it has many other applications [28]. In this case, PCA is used to transform and reduce the dimensionality of the DBDAE latent space during training. The idea behind this is that, even though PCA is a linear technique, it is able to represent to some extent the orderly structure of the latent space of the DBDAE with few components. After a few initial training iterations of possibly chaotic behaviour (due to the random initialization of the network weights), the accuracy and explained variance of the principal components will gradually change as the DBDAE learns the dynamics of the system. However, when the DBDAE begins to overfit the data and starts learning noise as well, the orderly structure of the latent space suddenly becomes disordered due to the random nature of noise. This is reflected in a sudden increase in the reconstruction error of the principal components and a sudden fall of their explained variance. Therefore, the point at which the DBDAE begins to learn the noise can be identified when the derivative of the reconstruction error is maximum, or equivalently, when the derivative of the explained variance is minimum. At this point, training must be stopped to obtain the clean version of the signals.

The complete algorithm for training the network is presented in Algorithm 1.
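The criterion can be implemented with scikit-learn's PCA; below is a schematic sketch where H collects the latent vectors of an epoch and err stores the per-epoch reconstruction error (the values shown are illustrative, not experimental results).

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_reconstruction_error(H, P):
    """Error when projecting latent vectors H (n_samples x Q) onto P components."""
    pca = PCA(n_components=P)
    H_rec = pca.inverse_transform(pca.fit_transform(H))
    return np.mean((H - H_rec) ** 2)

# After each epoch e (past the warm-up W), save err[e]; training is stopped
# (or the saved checkpoint restored) at the epoch where the error rises fastest.
err = np.array([0.10, 0.11, 0.12, 0.20, 0.35, 0.38])   # illustrative values
stop_epoch = int(np.argmax(np.diff(err))) + 1           # max. error derivative
```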

C. Simulated example

As a proof of concept, we applied the DBDAE to a simulated industrial process, the well-known quadruple tank configuration [29], which is a nonlinear multi-input multi-output system. Inputs are the voltages applied to the pumps, which vary in the range 0 to 10 volts, and the outputs are the four levels of the tanks, which vary between 0 and 50 meters. We excited the system with different inputs, such as PRBS, sinusoidal and stairs signals; then we added different types of noise to the inputs and finally we trained the network to denoise the signals. We tested the DBDAE in the two tasks it was designed for: blind denoising and reconstruction of faulty measurements. Results are presented in the following.

1) Blind denoising: For blind denoising, we corrupted the signals with different types of noise, namely, white noise, pink noise, brownian noise, impulsive noise, and a combination of all of them, meaning that different sensors suffer from different types of noise. As mentioned before, as a performance baseline we used both an EMA and a SG filter.

For training the DBDAEs, we used the procedure described in the previous section. Figures 3 and 4 show the PCA reconstruction error and its derivative for different numbers of principal components while training the autoencoder. According to the PCA-based heuristic, the point where the network is optimal for denoising the signal is where the derivative of the PCA reconstruction error is maximum, i.e., when the DBDAE training error is 10. Figure 5 confirms that the root mean square error (RMSE) between the output of the DBDAE and the ground truth (noise-free signal) is minimum at the same point where the PCA-based heuristic indicates that training must stop.

Fig. 3. PCA reconstruction error of the DBDAE latent space for different numbers of principal components, when denoising the quadruple tank system.

Fig. 4. Derivative of the PCA reconstruction error of the DBDAE latent space for different numbers of principal components, when denoising the quadruple tank system. The derivative is maximum when the training error is 10; at this point the network is able to denoise and reconstruct the input signals without reconstructing noise. The red dot indicates where the training procedure must be stopped.

Fig. 5. RMSE of the DBDAE output with respect to the clean (true) signal. The minimum is achieved when the DBDAE training error is 10, which matches the value given by the PCA reconstruction error method.

Denoising results in terms of RMSE for the different types of noise are shown in Table I. The DBDAE outperforms the EMA and SG filters in all cases but impulsive noise, where the SG filter shows slightly better performance.

To illustrate the application of the PCA heuristic in a linear system, the procedure was repeated for the linearized version of the quadruple tank system. As expected, the linear characteristics of the system are reflected in the latent space as well. Figure 6 presents the derivative of the PCA reconstruction error; a clear maximum is observed when the training error is 7. Results indicate that the PCA-based heuristic can be applied to linear systems as well.
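For reference, one way to synthesize the noise types used in this experiment (a sketch; the 1/f spectral-shaping construction for pink noise and all scales are our own choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
y_clean = np.sin(np.linspace(0, 20, T))            # a clean test signal

white = rng.normal(scale=0.5, size=T)
brownian = np.cumsum(rng.normal(scale=0.05, size=T))   # integrated white noise

# Pink (1/f) noise via spectral shaping of white noise
spec = np.fft.rfft(rng.normal(size=T))
f = np.fft.rfftfreq(T)
f[0] = f[1]                                        # avoid division by zero
pink = np.fft.irfft(spec / np.sqrt(f), n=T)

impulsive = np.zeros(T)                            # sparse large spikes
idx = rng.choice(T, size=10, replace=False)
impulsive[idx] = rng.normal(scale=3.0, size=10)

y_noisy = y_clean + white                          # e.g., white-noise corruption
```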

TABLE I
COMPARISON OF DENOISING RESULTS IN TERMS OF THE AVERAGE RMSE [m] FOR ALL THE TANK LEVELS.

Technique           White Noise   Pink Noise   Brownian Noise   Combination   Impulsive Noise
Original            2.05          1.98         1.88             1.52          1.52
Autoencoder         0.40          1.47         1.72             1.36          0.67
Low Pass F.         1.25          2.01         2.27             1.84          1.24
Savitsky-Golay F.   0.66          1.73         1.88             1.43          0.51

Fig. 6. Derivative of the PCA reconstruction error of the DBDAE latent space for different numbers of principal components, when denoising the linearized version of the quadruple tank system. The derivative is maximum when the training error is 7. The red dot indicates where the training procedure must be stopped.

Algorithm 1 Dual Blind Denoising Autoencoder Training Procedure

1: Given Net: the network; E: number of epochs; µ, σ: normalization parameters extracted from Ỹ; T: depth of the window; M: length of the training set; l: the label value to mask a faulty sensor; s0, cs: offset and slope of the scheduled sampling probability; f0, cf: offset and slope of the sensor fault probability; P: number of PCA components to preserve; W: number of warming iterations before applying the criterion, to account for the random initialization of the weights.
2: procedure TRAINING
3:   Initialize network weights θ randomly
4:   Normalize the training set by Ỹ = (Ỹ − µ)/σ
5:   f = f0
6:   s = s0
7:   Initialize list pcaE where the PCA reconstruction error of the P selected components for each epoch will be saved
8:   Initialize list checkpointsL where all the network's weights will be saved
9:   Initialize list hL where all hidden states for each epoch will be saved
10:  for e in E epochs do
11:    for n in [T, M] do
12:      Select from Ỹ a sliding window Ỹ(n)
13:      Select two random numbers i, j ∈ [0, 1]
14:      F = True if i < f else False (simulate a faulty sensor)
15:      SS = True if j < s else False (use scheduled sampling when predicting)
16:      if F is True then
17:        k = randomly selected sensor index ∈ [1, N]
18:        Ỹ^k(n) = [l, l, ..., l] ∈ R^T (labeled as faulty)
19:      end if
20:      Set network gradients to zero
21:      Ŷ(n), h(n) = Net(Ỹ(n))
22:      Save h(n) in hL
23:      J(θ) = (1/N)(1/T) Σ_{k=1}^{N} Σ_{j=1}^{T} (Ŷ_j^k(n) − Ỹ_j^k(n))²
24:      Compute backpropagation
25:      Gradient descent: θ = θ − ∇_θ J(θ)
26:    end for
27:    if e > W then
28:      Calculate the PCA reconstruction error derivative preserving P components of PCA(hL) and save it in pcaDE
29:    end if
30:    Clear hL
31:    Save Net weights in checkpointsL
32:    f = min(1, f0 + cf·e)
33:    s = min(1, s0 + cs·e)
34:  end for
35:  weights = checkpointsL(argmax(pcaDE))
36:  Load weights to Net
37: end procedure

2) Reconstruction: The network was also evaluated for reconstruction of faulty sensors, when receiving noisy measurements. Figure 7 shows the reconstruction of one of the outputs when its respective sensor is faulty. It can be seen that the DBDAE is able to reconstruct the signal, from which it can be inferred that the network is capable of learning the dynamics and exploiting both temporal and spatial information while denoising. Table II summarizes the reconstruction results for all the process variables, under different types of noise.

Fig. 7. DBDAE reconstruction of a faulty signal. The signal labeled as "Noisy" represents the input signal had the sensor been operating correctly.

TABLE II
RECONSTRUCTION RESULTS IN TERMS OF RMSE FOR ALL THE PROCESS VARIABLES UNDER DIFFERENT TYPES OF NOISE. ui: PROCESS INPUTS IN VOLTS; yi: PROCESS OUTPUTS IN METERS.

Type of noise     u1     u2     y1     y2     y3     y4
White Noise       0.99   1.16   1.18   1.37   1.33   1.31
Pink Noise        1.69   1.59   1.94   2.02   1.60   1.77
Brownian Noise    1.81   1.70   1.71   1.93   1.65   1.62
Combination       1.29   1.34   1.37   1.49   1.30   1.48
Impulsive noise   1.03   1.05   1.40   1.54   1.44   1.42

Fig. 8. Process and instrumentation diagram of the thickener under study. Time series are available for the following variables: feeding and discharge rates, flocculant addition rate, feeding and discharge density, and internal states of the thickener (mud, interface and clarity levels).

Fig. 9. Derivative of the PCA reconstruction error for different numbers of principal components when training the DBDAE for denoising the thickener's signals. Two maxima are observed; however, the first should be discarded since it is due to the random initialization of the weights.

V. APPLICATION TO AN INDUSTRIAL PASTE THICKENER

A. Thickening process

Thickening is the primary method in mineral processing for producing high-density tailings slurries. The most common method generally involves a large thickener tank (see Figure 8) with a slow-turning raking system. Typically, the tailings slurry is added to the tank after the ore extraction process, along with a sedimentation-promoting polymer known as flocculant, which increases the sedimentation rate to produce thickened material discharged as underflow. In this context, the main control objectives are: 1) to stabilize the solids content in the underflow; 2) to improve the clarity of the overflow water; and 3) to reduce the flocculant consumption [3].

Due to its complexity and highly nonlinear dynamics, deriving a first-principles-based mathematical model is very challenging. Therefore, an appealing approach is to use data-driven modeling techniques. However, the sensors in charge of providing data are exposed to strong disturbances and noise, and an effective online preprocessing technique is needed. Hence, the thickener is an interesting real process on which to test the DBDAE.

B. Blind denoising

The DBDAE was trained with 12 months of real operational data from the industrial thickener to learn how to denoise 8 different variables. The PCA-based heuristic was used for restricting training of the network. Afterwards, the DBDAE was tested online as the preprocessing tool for a model predictive control implementation.

Figure 9 shows the PCA reconstruction error for different training losses. As mentioned before, training should be stopped when the derivative of the reconstruction error is maximized, after an initial transient due to the random initialization of weights. The warming parameter in Algorithm 1 accounts for this phenomenon. Figures 10 and 11 present the denoising results for the input flow and input solids concentration, two of the noisiest measurements present in the thickener operation. It can be appreciated that the DBDAE delivers a clean signal that follows the dynamics and does not present major lag.

Fig. 10. Denoising of the input flow variable using the DBDAE.

Fig. 11. Denoising of the input solids concentration variable using the DBDAE.

C. Reconstruction

Reconstruction, or sensor masking, relies on an algorithm capable of detecting and labeling a faulty sensor. In the thickener implementation, an already existing statistical algorithm was used, which was artificially modified to label healthy signals as faulty, in order to evaluate the DBDAE. Figure 12 shows the results when reconstructing one of the process output signals after this signal was labeled as faulty. It can be seen that the reconstructed signal not only closely mimics the target signal, but also is a denoised version of it.

From this real implementation example, it can be seen that the DBDAE is able to both denoise time series from a real industrial process and reconstruct faulty signals on the fly.

Fig. 12. DBDAE reconstruction of the output solids concentration variable. The signal labeled as "Noisy" represents the input signal to the DBDAE had the sensor been operating correctly.

VI. CONCLUSION

In this work, a novel autoencoder architecture coined Dual Blind Denoising Autoencoder (DBDAE) was presented as an alternative for denoising and reconstructing time series from an underlying dynamical system. The dual architecture is based on two encoders: a recurrent network for exploiting temporal information and a convolutional network for exploiting spatial correlations. A PCA-based heuristic is proposed for constraining learning so that denoising can be done in a blind manner, without knowledge of the real clean target signal. Experimental results show that the DBDAE outperforms classical methods such as the EMA and SG filters in denoising tasks. A real implementation in an industrial thickener illustrates its capabilities when faced with real process data.

REFERENCES

[1] Y. Jiang, J. Fan, T. Chai, J. Li, and F. L. Lewis, "Data-driven flotation industrial process operational optimal control based on reinforcement learning," IEEE Trans. Ind. Informat., vol. 14, no. 5, pp. 1974–1989, May 2018.
[2] S. Langarica, C. Rüffelmacher, and F. Núñez, "An industrial internet application for real-time fault diagnosis in industrial motors," IEEE Trans. Autom. Sci. Eng., vol. 17, no. 1, pp. 284–295, 2020.
[3] F. Núñez, S. Langarica, P. Díaz, M. Torres, and J. C. Salas, "Neural network-based model predictive control of a paste thickener over an industrial internet platform," IEEE Trans. Ind. Informat., vol. 16, no. 4, pp. 2859–2867, 2020.
[4] Y. Zhang, Y. Yang, S. X. Ding, and L. Li, "Data-driven design and optimization of feedback control systems for industrial applications," IEEE Trans. Ind. Electron., vol. 61, no. 11, pp. 6409–6417, Nov 2014.
[5] A. Savitzky and M. J. E. Golay, "Smoothing and differentiation of data by simplified least squares procedures," Analytical Chemistry, vol. 36, no. 8, pp. 1627–1639, 1964.
[6] B. Alexander, T. Ivan, and B. Denis, "Analysis of noisy signal restoration quality with exponential moving average filter," in 2016 International Siberian Conference on Control and Communications (SIBCON), May 2016, pp. 1–4.
[7] H. D. Hesar and M. Mohebbi, "ECG denoising using marginalized particle extended Kalman filter with an automatic particle weighting strategy," IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 3, pp. 635–644, May 2017.
[8] H. Kumar and A. Mishra, "Feedback particle filter based image denoiser," in 2017 International Conference on Recent Innovations in Signal Processing and Embedded Systems (RISE), Oct 2017, pp. 320–325.
[9] D. L. Donoho, "De-noising by soft-thresholding," IEEE Transactions on Inf. Theory, vol. 41, no. 3, pp. 613–627, May 1995.
[10] N. Nezamoddini-Kachouie and P. Fieguth, "A Gabor based technique for image denoising," in Canadian Conference on Electrical and Computer Engineering, 2005, May 2005, pp. 980–983.
[11] S. Wold, K. Esbensen, and P. Geladi, "Principal component analysis," Chemometrics and Intelligent Laboratory Systems, vol. 2, no. 1, pp. 37–52, 1987.
[12] J. Im, D. W. Apley, and G. C. Runger, "Tangent hyperplane kernel principal component analysis for denoising," IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 4, pp. 644–656, April 2012.
[13] L. Shao, R. Yan, X. Li, and Y. Liu, "From heuristic optimization to dictionary learning: A review and comprehensive comparison of image denoising algorithms," IEEE Transactions on Cybernetics, vol. 44, no. 7, pp. 1001–1013, July 2014.
[14] W. Yu and C. Zhao, "Robust monitoring and fault isolation of nonlinear industrial processes using denoising autoencoder and elastic net," IEEE Trans. Control Syst. Technol., pp. 1–9, 2019.
[15] Z. Sun and H. Sun, "Stacked denoising autoencoder with density-grid based clustering method for detecting outlier of wind turbine components," IEEE Access, vol. 7, pp. 13078–13091, 2019.
[16] C. R. A. Chaitanya, A. S. Kaplanyan, C. Schied, M. Salvi, A. Lefohn, D. Nowrouzezahrai, and T. Aila, "Interactive reconstruction of Monte Carlo image sequences using a recurrent denoising autoencoder," ACM Trans. Graph., vol. 36, no. 4, pp. 98:1–98:12, Jul. 2017.
[17] E. Marchi, F. Vesperini, S. Squartini, and B. Schuller, "Deep recurrent neural network-based autoencoders for acoustic novelty detection," Intell. Neuroscience, vol. 2017, pp. 4–, Jan. 2017.
[18] S. Xu, B. Lu, M. Baldea, T. F. Edgar, W. Wojsznis, T. Blevins, and M. Nixon, "Data cleaning in the process industries," Reviews in Chemical Engineering, vol. 31, no. 5, pp. 453–490, 2015.
[19] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504–507, 2006.
[20] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, "Extracting and composing robust features with denoising autoencoders," in Proceedings of the 25th International Conference on Machine Learning, ser. ICML '08, 2008, pp. 1096–1103.
[21] A. Creswell and A. A. Bharath, "Denoising adversarial autoencoders," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 4, pp. 968–984, April 2019.
[22] A. Majumdar, "Blind denoising autoencoder," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 1, pp. 312–317, Jan 2019.
[23] K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder–decoder for statistical machine translation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Oct. 2014, pp. 1724–1734.
[24] M. Ranzato, S. Chopra, M. Auli, and W. Zaremba, "Sequence level training with recurrent neural networks," arXiv e-prints, p. arXiv:1511.06732, Nov 2015.
[25] S. Bengio, O. Vinyals, N. Jaitly, and N. Shazeer, "Scheduled sampling for sequence prediction with recurrent neural networks," arXiv e-prints, p. arXiv:1506.03099, Jun 2015.
[26] B. Wen, S. Ravishankar, and Y. Bresler, "Structured overcomplete sparsifying transform learning with convergence guarantees and applications," International Journal of Computer Vision, vol. 114, no. 2, pp. 137–167, Sep 2015.
[27] K. Tan, W. Li, Y. Huang, and J. Yang, "A regularization imaging method for forward-looking scanning radar via joint l1-l2 norm constraint," in 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), July 2017, pp. 2314–2317.
[28] X. Deng, X. Tian, S. Chen, and C. J. Harris, "Nonlinear process fault diagnosis based on serial principal component analysis," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 3, pp. 560–572, March 2018.
[29] K. H. Johansson, "The quadruple-tank process: a multivariable laboratory process with an adjustable zero," IEEE Trans. Control Syst. Technol., vol. 8, no. 3, pp. 456–465, May 2000.