Analysis of Process Data with Singular Spectrum Methods

ANALYSIS OF PROCESS DATA WITH SINGULAR SPECTRUM METHODS by Marlize Barkhuizen Thesis presented in partial fulfilment of the requirements for the Degree of Masters of Science in Engineering (Chemical Engineering) in the Department of Process Engineering at the University of Stellenbosch Supervised by: Prof C Aldrich STELLENBOSCH SOUTH AFRICA DECEMBER 2003 DECLARATION I, the undersigned, hereby declare that the work contained in this thesis is my own original work and that I have not previously in its entirety or in part submitted it at any university for a degree. Marlize Barkhuizen October 2003 SYNOPSIS The analysis of process data obtained from chemical and metallurgical engineering systems is a crucial aspect of the operating of any process, as information extracted from the data is used for control purposes, decision making and forecasting. Singular spectrum analysis (SSA) is a relatively new technique that can be used to decompose time series into their constituent components, after which a variety of further analyses can be applied to the data. The objectives of this study were to investigate the abilities of SSA regarding the filtering of data and the subsequent modelling of the filtered data, to explore the methods available to perform nonlinear SSA and finally to explore the possibilities of Monte Carlo SSA to characterize and identify process systems from observed time series data. Although the literature indicated the widespread application of SSA in other research fields, no previous application of singular spectrum analysis to time series obtained from chemical engineering processes could be found. SSA appeared to have a multitude of applications that could be of great benefit in the analysis of data from process systems. The first indication of this was in the filtering or noise-removal abilities of SSA. A number of case studies were filtered by various techniques related to SSA, after which a number of neural network modelling strategies were applied to the data. It was consistently found that the models built on data that have been prefiltered with SSA outperformed the other models. The effectiveness of localized SSA and auto-associative neural networks in performing nonlinear SSA were compared. Both techniques succeeded in extracting a number of nonlinear components from the data that could not be identified from linear SSA. However, it was found that localized SSA was a more reliable approach, as the auto-associative neural networks would not train for some of the data or extracted nonsensical components for other series. Lastly a number of time series were analysed using Monte Carlo SSA. It was found that, as is the case with all other characterization techniques, Monte Carlo SSA could not succeed in correctly classifying all the series investigated. For this reason several tests were used for the classification of the real process data. In the light of these findings, it was concluded that singular spectrum analysis could be a valuable tool in the analysis of chemical and metallurgical process data. i OPSOMMING Die analise van chemise en metallurgiese prosesdata wat verkry is vanaf chemiese of metallurgiese ingenieursstelsels is ‘n baie belangrike aspek in die bedryf van enige proses, aangesien die inligting wat van die data onttrek word vir prosesbeheer, besluitneming of die bou van prosesmodelle gebruik kan word. Singuliere spektrale analise is ‘n relatief nuwe tegniek wat gebruik kan word om tydreekse in hul onderliggende komponente te ontbind. Die doelwitte van hierdie studie was om ‘n omvattende literatuuroorsig oor die ontwikkeling van die tegniek en die toepassing daarvan te doen, beide in die ingenieursindustrie en in ander navorsingsvelde, die navors van die moontlikhede van SSA aangaande die verwydering van geraas uit die data en die gevolglike modellering van die skoon data te ondersoek, ‘n ondersoek te doen na sommige van die beskikbare tegnieke vir nie-lineêre SSA en laastens ‘n studie te maak van die potensiaal van Monte Carlo SSA vir die karakterisering en identifikasie van data verkry vanaf prosesstelsels. Ten spyte van aanduidings in die literatuur dat SSA wydverspreid toegepas word in ander navorsingsvelde, kon geen vorige toepassings gevind word van SSA op chemiese prosesse nie. Dit wil voorkom asof die chemiese nywerhede groot baat kan vind by SSA van prosesdata. Die eerste aanduiding van hierdie voordele was in die vermoë van SSA om geraas te verwyder uit tydreekse. ‘n Aantal tipiese gevalle is ondersoek deur van verskeie benaderings tot SSA gebruik te maak. Nadat die geraas uit die tydreekse van die toetsgevalle verwyder is, is neurale netwerke gebruik om die prosesse te modelleer. Daar is herhaaldelik gevind dat die modelle wat gebou is op data wat eers deur SSA skoongemaak is, beter presteer as die wat slegs op die onverwerkte data gepas is. Die effektiwiteit van lokale SSA en auto-assosiatiewe neurale netwerke om nie- lineêre SSA toe te pas is ook vergelyk. Albei tegnieke het daarin geslaag om nie- lineêre hoofkomponente van die data te onttrek wat nie geïdentifiseer kon word deur die lineêre benadering nie. Daar is egter gevind dat lokale SSA ‘n meer betroubare tegniek is, aangesien die auto- assosiatiewe neurale netwerke nie op sommige van die datastelle wou leer nie en vir ander tydreekse sinnelose hoofkomponente onttrek het. Laastens is ‘n aantal tydreekse geanaliseer met behulp van Monte Carlo SSA. Soos met alle ander karakteriseringstegnieke, kon Monte Carlo SSA nie daarin slaag om al die tydreekse wat ondersoek is korrek te identifiseer nie. Om hierdie rede is ‘n kombinasie van toetse gebruik om die onbekende tydreekse te klassifiseer. In die lig van al hierdie bevindinge, is die gevolgtrekking gemaak dat singuliere spektrale analise ‘n waardevolle hulpmiddel kan wees in die analise van chemiese en metallurgiese prosesdata. ii This is not the end. It is not even the beginning of the end. But it is the end of the beginning. Winston Churchill iii ACKNOWLEDGEMENTS My heavenly Father for giving me the abilities, strength and opportunities to come this far. My study leader, Prof Chris Aldrich, for his patient guidance, continued support and invaluable input in my work. My parents for believing in me and always having a word of support, encouragement and praise, as the need arose. My friends for showing interest, giving advice and ‘just being there’, even though my work was largely Greek to them. Fred for promising everything would turn out all right (it did). Mintek for their continued financial support and interest in my work. Acknowledgements iv TABLE OF CONTENTS TABLE OF CONTENTS v LIST OF FIGURES viii LIST OF TABLES xv 1. INTRODUCTION 1 2. LITERATURE REVIEW 3 2.1 ORIGINS OF SINGULAR SPECTRUM ANALYSIS ........................................................................... 3 2.2 SSA COMPARED TO OTHER SPECTRAL TIME SERIES ANALYSIS TECHNIQUES ............................. 4 2.2.1 Fourier analysis 4 2.2.2 Wavelet analysis 4 2.2.3 Maximum Entropy Method (MEM) 4 2.2.4 Multi-taper method (MTM) 5 2.3 TRENDS ANALYSIS WITH SSA................................................................................................. 5 2.3.1 Climatology and paleoclimatology 5 2.3.2 Biosciences 8 2.3.3 Economics 10 2.3.4 Geophysics 11 2.3.5 Engineering applications 12 2.3.6 Solar physics 14 2.4 OTHER APPLICATIONS OF SSA............................................................................................. 15 2.4.1 Process modelling and forecasting 15 2.4.2 Change point detection 16 3. METHODOLOGY 18 3.1 BASIC SINGULAR SPECTRUM ANALYSIS ........................................................................... 18 3.1.1 Decomposition stage 19 3.1.2 Reconstruction stage 22 3.1.3 Literature review on embedding and window length 23 3.1.4 Noise reduction 24 3.1.5 Interpretation of eigenelements 25 3.1.6 Prediction 25 3.2 MULTICHANNEL SINGULAR SPECTRUM ANALYSIS ............................................................. 25 3.3 MONTE CARLO SSA........................................................................................................ 26 3.3.1 Monte Carlo in the literature 28 3.3.2 General approach to Monte Carlo SSA 30 3.3.3 Generation of surrogate data 31 3.3.4 Choosing a test statistic 32 3.3.5 Classes of hypothesis 33 Table of Contents v 3.4 NONLINEAR SINGULAR SPECTRUM ANALYSIS .................................................................... 33 3.4.1 Nonlinear Principal Component Analysis 33 3.4.2 Localized Principal Component Analysis 36 3.5 OTHER APPROACHES TO SSA.......................................................................................... 38 3.5.1 Multi-scale singular spectrum analysis 38 3.5.2 Random lag singular spectrum analysis 38 3.5.3 Approximate projectors 38 4. FILTERING OF DATA WITH SSA 39 4.1 IDENTIFICATION OF PROCESS SYSTEM ................................................................................... 39 4.2 FLOW BETWEEN TWO NONINTERACTING TANKS IN SERIES ...................................................... 42 4.2.1 Generation of series and performing SSA 42 4.2.2 Modelling of data series 44 4.2.3 Influence of window length 45 4.2.4 Reconstructed attractor 47 4.3 CARBON-IN-LEACH PROCESS................................................................................................ 48 4.3.1 Background 48 4.3.2 Grouping of eigenvalues 50 4.3.3 Filtering of data 52 4.4 BASE METAL FLOTATION PLANT – ROUGHER, CLEANER AND SCAVENGER CIRCUITS ................. 54 4.4.1 Background 54 4.4.2 Embedding of plant data 55 4.4.3 Modelling of the cleaner,

Load more