Recurrent Gaussian Processes and Robust Dynamical Modeling

UNIVERSIDADE FEDERAL DO CEARÁ
CENTRO DE TECNOLOGIA
DEPARTAMENTO DE ENGENHARIA DE TELEINFORMÁTICA
PROGRAMA DE PÓS-GRADUAÇÃO EM ENGENHARIA DE TELEINFORMÁTICA
DOUTORADO EM ENGENHARIA DE TELEINFORMÁTICA

CÉSAR LINCOLN CAVALCANTE MATTOS

RECURRENT GAUSSIAN PROCESSES AND ROBUST DYNAMICAL MODELING

FORTALEZA
2017

Thesis presented to the Doctoral Course in Teleinformatics Engineering of the Graduate Program in Teleinformatics Engineering of the Center of Technology at Universidade Federal do Ceará, in partial fulfillment of the requirements for the degree of Doctor in Teleinformatics Engineering.

Area of Concentration: Signals and systems
Advisor: Prof. Dr. Guilherme de Alencar Barreto

International Cataloging-in-Publication Data
Universidade Federal do Ceará
University Library
Generated automatically by the Catalog module from data provided by the author

M39r Mattos, César Lincoln.
Recurrent Gaussian Processes and Robust Dynamical Modeling / César Lincoln Mattos. – 2017.
189 leaves : color illustrations.
Thesis (doctorate) – Universidade Federal do Ceará, Centro de Tecnologia, Programa de Pós-Graduação em Engenharia de Teleinformática, Fortaleza, 2017.
Advisor: Prof. Dr. Guilherme de Alencar Barreto.
1. Gaussian processes. 2. Dynamical modeling. 3. Nonlinear system identification. 4. Robust learning. 5. Stochastic learning. I. Title.
CDD 621.38

A thesis presented to the PhD course in Teleinformatics Engineering of the Graduate Program in Teleinformatics Engineering of the Center of Technology at the Federal University of Ceará in fulfillment of the requirement for the degree of Doctor of Philosophy in Teleinformatics Engineering. Area of Concentration: Signals and systems

Approved on: August 25, 2017

EXAMINING COMMITTEE

Prof. Dr. Guilherme de Alencar Barreto (Supervisor)
Universidade Federal do Ceará (UFC)

Prof. Dr. Antônio Marcos Nogueira Lima
Universidade Federal de Campina Grande (UFCG)

Prof. Dr. Oswaldo Luiz do Valle Costa
Universidade de São Paulo (USP)

Prof. Dr. José Ailton Alencar Andrade
Universidade Federal do Ceará (UFC)

For my beloved family, who always support me: Fernando Lincoln, Carmen and Fernanda.

ACKNOWLEDGEMENTS

I would like to thank my supervisor, Prof. Guilherme de Alencar Barreto, for his long-standing support, partnership and academic guidance. Guilherme strikes a fine balance between focused supervision and the freedom to explore and experiment. His patience, encouragement and dedication throughout the whole PhD journey were invaluable.

I thank my many friends in the PPGETI for all the conversations, on and off campus, about the most diverse topics. You were very important in providing a suitable environment throughout this period. I especially thank José Daniel Alencar and Amauri Holanda, with whom I had many valuable discussions.

I am very grateful to Prof. Neil Lawrence for kindly receiving me in his research group at the University of Sheffield. Neil has been a great intellectual inspiration and provided a significant boost to my studies. The way he works with his group and handles machine learning research will always have an influence on me.

I was very lucky to be around Neil's ML group in SITraN for six months. I learned so much from the incredible people I met there. All the experiences we shared were critical for my adaptation and will be treasured. Thanks a lot, everyone! A special thanks to my co-authors, Andreas Damianou and Zhenwen Dai, who dedicated a lot of time to helping me with my research (and are great guys!).

I acknowledge the financial support of FUNCAP (for my PhD scholarship) and CAPES (for my time abroad via the PDSE program).
Finally, this thesis is dedicated to my family, Fernando Lincoln, Carmen and Fernanda, whose support was fundamental to the completion of this work.

"It is not knowledge, but the act of learning, not possession but the act of getting there, which grants the greatest enjoyment." (Carl Friedrich Gauss)

RESUMO

The study of dynamical systems is widespread across several areas of knowledge. Sequential data are constantly generated by diverse phenomena, most of which cannot be explained by equations derived from known physical laws and structures. In this context, this thesis addresses the task of nonlinear system identification, through which models are obtained directly from sequential observations. More specifically, we address challenging scenarios, such as learning temporal relations from noisy data, data containing discrepant values (outliers), and large datasets. At the interface between statistics, computer science, data analysis and engineering lies the machine learning community, which provides powerful tools for finding patterns in data and making predictions. In that sense, we follow methods based on Gaussian Processes (GPs), a practical probabilistic approach to learning in kernel machines. Building on recent advances in general GP-based modeling, we introduce new contributions to the dynamical modeling exercise. We thus propose the new family of Recurrent Gaussian Process (RGP) models and extend their concept to handle requirements of robustness to outliers and scalable stochastic learning. The hierarchical and latent (unobserved) structure of these models imposes non-analytical expressions, which are resolved by deriving new variational algorithms that perform approximate deterministic inference as an optimization problem. The presented solutions allow uncertainty to be propagated in both training and testing, with a focus on free simulation. We evaluate the proposed methods in detail on artificial and real benchmarks from the system identification field, as well as on other tasks involving dynamical data. The results obtained indicate that our propositions are competitive with models available in the literature, even in the scenarios that present the aforementioned complications, and that GP-based dynamical modeling is a promising research area.

Keywords: Gaussian Processes. Dynamical modeling. Nonlinear system identification. Robust learning. Stochastic learning.

ABSTRACT

The study of dynamical systems is widespread across several areas of knowledge. Sequential data is generated constantly by different phenomena, most of which cannot be explained by equations derived from known physical laws and structures. In this context, this thesis aims to tackle the task of nonlinear system identification, which builds models directly from sequential measurements. More specifically, we approach challenging scenarios, such as learning temporal relations from noisy data, data containing discrepant values (outliers) and large datasets. At the interface between statistics, computer science, data analysis and engineering lies the machine learning community, which brings powerful tools to find patterns in data and make predictions. In that sense, we follow methods based on Gaussian Processes (GPs), a principled, practical, probabilistic approach to learning in kernel machines. We aim to exploit recent advances in general GP modeling to bring new contributions to the dynamical modeling exercise. Thus, we propose the novel family of Recurrent Gaussian Process (RGP) models and extend their concept to handle outlier-robust requirements and scalable stochastic learning.
The hierarchical latent (non-observed) structure of those models imposes intractabilities in the form of non-analytical expressions, which are handled with the derivation of new variational algorithms to perform approximate deterministic inference as an optimization problem. The presented solutions enable uncertainty propagation in both training and testing, with a focus on free simulation. We comprehensively evaluate the introduced methods with both artificial and real system identification benchmarks, as well as other related dynamical settings. The obtained results indicate that our propositions are competitive when compared to models available in the literature within the aforementioned complicated setups and that GP-based dynamical modeling is a promising area of research.

Keywords: Gaussian Processes. Dynamical modeling. Nonlinear system identification. Robust learning. Stochastic learning.

LIST OF FIGURES

Figure 1 – Block diagrams of common evaluation methodologies for system identification.
Figure 2 – Examples of GP samples with different covariance functions.
Figure 3 – Graphical model detailing the relations between the variables in a standard GP model.
Figure 4 – Samples from the GP model before and after the observations.
Figure 5 – Illustration of the Bayesian model selection procedure.
Figure 6 – Simplified graphical models for the standard and augmented sparse GPs.
Figure 7 – Comparison of standard GP and variational sparse GP regression.
Figure 8 – Simplified graphical model for the GP-LVM.
Figure 9 – Graphical model for the Deep GP.
Figure 10 – Graphical model for the GP-NARX.
Figure 11 – Graphical model for the GP-SSM.
Figure 12 – RGP graphical model with H hidden
