Université de Toulon

Habilitation à Diriger les Recherches (HDR)

Discipline: Computer Science and Applied Mathematics

presented by

Faicel CHAMROUKHI

Maître de Conférences, UMR CNRS LSIS

Statistical learning of latent data models for complex data analysis

Publicly defended on December 7, 2015

JURY

Geoffrey McLachlan, Professor, University of Queensland, Australian Academy of Science Fellow: Rapporteur
Christophe Ambroise, Professeur, Université d'Evry: Rapporteur
Younès Bennani, Professeur, Université Paris Nord: Rapporteur
Stéphane Derrode, Professeur, Ecole Centrale de Lyon: Rapporteur
Mohamed Nadif, Professeur, Université Paris 5: Examinateur
Christophe Biernacki, Professeur, Université Lille 1, INRIA: Examinateur
Hervé Glotin, Professeur, Université de Toulon: Examinateur

Acknowledgements

First, I would like to address all my thanks and express my gratitude to Professor Geoff McLachlan, Professor Christophe Ambroise, Professor Younès Bennani and Professor Stéphane Derrode for having given me the honor of reviewing my habilitation and for the quality of their reports. A very special mention to Geoff; I was also very honored by your visit this year and greatly appreciated all the time you spent in Toulon. You inspire me a lot to continue along this path in science.

I would also like to address my very special thanks to Professor Christophe Biernacki, Professor Mohamed Nadif and Professor Hervé Glotin for agreeing to be part of my habilitation committee. A special warm mention to Professor Hervé Glotin for these past four years of collaboration in Toulon.

I would also like to address my special warm thanks to my colleagues of the DYNI team and the computer science department, with whom it is a great pleasure to work. I also extend my warm thanks to all my colleagues of the LSIS lab and the faculty of science, with whom I have worked during these past four years, since I was recruited in Toulon.

This year I am on CNRS research leave, and I would like to warmly thank my new colleagues at the mathematics laboratory of Lille 1 and at INRIA-Modal for their warm welcome. A particular mention to Professor Christophe Biernacki for giving me the honor of joining his team and for his support.

I am also grateful to all the colleagues with whom I have collaborated during my research since the beginning of my PhD, and I would like to address my thanks and express my gratitude to all of them at the Heudiasyc lab of UTC Compiègne, the Grettia group of IFSTTAR-Marne, the LiSSi lab of Paris 12, the LIPN lab of Paris 13, and the LIPADE lab of Paris 5. A particular mention to my PhD advisors Doctor Allou Samé, Doctor Patrice Aknin, and Professor Gérard Govaert.

I was a teaching assistant at Université Paris 13 during my first four academic years, and I take this opportunity to warmly thank all my colleagues at the computer science department with whom I have collaborated.

I am also grateful to my former PhD and MSc students Dr. Marius Bartcus, Dr. Dorra Trabelsi, Dr. Rakia Jaziri, MSc. Ahmed Hosni, MSc. Céline Rabouy, MSc. Ahmad Tay and MSc. Hiba Badri. Thank you all for your collaboration, and I wish you every success in your careers and in your lives.

Finally, I address my thanks to my friends and my family.

This manuscript was finished in August, in Hyères.

Faicel Chamroukhi
Lille, November 26, 2015

Contents

1 Introduction  1
    1.1 Contributions during my thesis (2007-2010)  2
    1.2 Contributions after my thesis (2011-2015)  2
        1.2.1 Latent data models for non-stationary multivariate temporal data  2
        1.2.2 Functional data analysis  2
        1.2.3 Bayesian regularization of mixtures for functional data analysis  3
        1.2.4 Bayesian non-parametric parsimonious mixtures for multivariate data  3
        1.2.5 Non-normal mixtures of experts  4
        1.2.6 Applications  4

2 Latent data models for temporal data segmentation  5
    2.1 Introduction  7
        2.1.1 Personal contribution  8
        2.1.2 Problem statement  9
    2.2 Regression with hidden logistic process  9
        2.2.1 The model  9
        2.2.2 Maximum likelihood estimation via a dedicated EM  10
        2.2.3 Experiments  12
        2.2.4 Conclusion  13
        2.2.5 Multiple hidden process regression for joint segmentation of multivariate time series  13
    2.3 Multiple hidden logistic process regression  13
        2.3.1 The model  14
        2.3.2 Maximum likelihood estimation via a dedicated EM  14
        2.3.3 Application on human activity time series  15
        2.3.4 Conclusion  15
    2.4 Multiple hidden Markov model regression  16
        2.4.1 The model  17
        2.4.2 Maximum likelihood estimation via a dedicated EM  17
        2.4.3 Application on human activity time series  17
        2.4.4 Conclusion  18

3 Latent data models for functional data analysis  19
    3.1 Introduction  21
        3.1.1 Personal contribution  22
        3.1.2 Mixture modeling framework for functional data  22
    3.2 Mixture of piecewise regressions  23
        3.2.1 The model  23
        3.2.2 Maximum likelihood estimation via a dedicated EM  24
        3.2.3 Maximum classification likelihood estimation via a dedicated CEM  25
        3.2.4 Experiments  26
        3.2.5 Conclusion  28
    3.3 Mixture of hidden Markov model regressions  30
        3.3.1 The model  30
        3.3.2 Maximum likelihood estimation via a dedicated EM  31
        3.3.3 Experiments  32
        3.3.4 Conclusion  33
    3.4 Mixture of hidden logistic process regressions  34
        3.4.1 The model  34
        3.4.2 Maximum likelihood estimation via a dedicated EM algorithm  35
        3.4.3 Experiments  36
        3.4.4 Conclusion  37
    3.5 Functional discriminant analysis  37
        3.5.1 Functional linear discriminant analysis  38
        3.5.2 Functional mixture discriminant analysis  38
        3.5.3 Experiments  39
        3.5.4 Conclusion  40

4 Bayesian regularization of mixtures for functional data  43
    4.1 Introduction  45
        4.1.1 Personal contribution  46
        4.1.2 Regression mixtures  46
    4.2 Regularized regression mixtures for functional data  47
        4.2.1 Introduction  47
        4.2.2 Regularized maximum likelihood estimation via a robust EM-like algorithm  49
        4.2.3 Experiments  51
        4.2.4 Conclusion  53
    4.3 Bayesian mixtures of spatial spline regressions  54
        4.3.1 Bayesian inference by Markov Chain Monte Carlo (MCMC) sampling  54
        4.3.2 Mixtures of spatial spline regressions with mixed-effects  55
        4.3.3 Bayesian spatial spline regression with mixed-effects  57
        4.3.4 Bayesian mixture of spatial spline regressions with mixed-effects  59
        4.3.5 Experiments  61
        4.3.6 Conclusion  62

5 Bayesian non-parametric parsimonious mixtures for multivariate data  63
    5.1 Introduction  65
        5.1.1 Personal contribution  67
    5.2 Finite model-based clustering  67
        5.2.1 Bayesian model-based clustering  68
        5.2.2 Parsimonious Gaussian mixture models  68
    5.3 Dirichlet Process Parsimonious Mixtures  69
        5.3.1 Dirichlet Process Parsimonious Mixtures  69
        5.3.2 Chinese Restaurant Process parsimonious mixtures  71
        5.3.3 Bayesian inference via Gibbs sampling  72
        5.3.4 Bayesian model comparison via Bayes factors  73
        5.3.5 Experiments  74
    5.4 Conclusion  77

6 Non-normal mixtures of experts  79
    6.1 Introduction  81
        6.1.1 Personal contribution  82
        6.1.2 Mixture of experts for continuous data  83
        6.1.3 The normal mixture of experts model and its MLE  83
    6.2 The skew-normal mixture of experts model  84
        6.2.1 The model  84
        6.2.2 Maximum likelihood estimation via the ECM algorithm  85
    6.3 The t mixture of experts model  88
        6.3.1 The model  88
        6.3.2 Maximum likelihood estimation  89
        6.3.3 MLE via the EM algorithm  89
        6.3.4 MLE via the ECM algorithm  91
    6.4 The skew t mixture of experts model  91
        6.4.1 The model  91
        6.4.2 Identifiability of the STMoE model  92
        6.4.3 Maximum likelihood estimation via the ECM algorithm  92
    6.5 Prediction, clustering and model selection with the non-normal MoE  94
    6.6 Experiments  96
    6.7 Conclusion  98

7 Conclusion and perspectives  99
    7.1 Conclusion  99
    7.2 Perspectives  99
        7.2.1 Advanced mixtures for complex data (My ongoing CNRS research leave project)  99
        7.2.2 LEarning from biG cOmplex FunctIonal daTa - LegoFit (2015 - an ANR proposal)  100
        7.2.3 Non-normal mixture modeling  100
        7.2.4 Feature selection in model-based clustering  101
        7.2.5 Bayesian latent variable models for sparse representations  101
        7.2.6 Unsupervised learning of feature hierarchies: Deep learning  101

8 Personal bibliography  103
    8.1 Monograph and editorials  103
    8.2 Journal papers  103
        8.2.1 Publications (9)  103
        8.2.2 Submitted papers (6)  104
        8.2.3 Papers in preparation (2)  104
    8.3 International conference papers  104
    8.4 Invited talks in international conferences  106
    8.5 Francophone conferences  106
    8.6 Theses  106
    8.7 Award and distinctions  106
    8.8 Invited and contributed seminars  107

Bibliography 120


Chapter 1

Introduction

The problem of complex data analysis is a central topic of modern statistical science as well as of computer and information sciences, and is connected to both the theoretical and applied parts of these sciences. The analysis of complex data in general implies the development of statistical models and autonomous algorithms that aim at acquiring knowledge from raw data for analysis, interpretation, and for making accurate decisions and predictions on future data. Such analysis by learning models from raw data requires, from a theoretical point of view, models which rely on a well-established statistical background, as well as, from a practical point of view, the derivation of efficient algorithmic tools to address problems regarding the data complexity, including heterogeneity, missing information, high dimensionality, dynamical structure, and big volume. To ensure such reliability of the models and algorithms used for the analysis, it is important to understand the processes generating the data. From a statistical learning perspective, this in general arises in generative learning approaches. Generative model-based approaches are indeed well-established statistical models that make explicit the processes generating the data, and for which the computational part, regarding the development of dedicated efficient inference algorithms, has received and is still receiving a lot of investigation, particularly in the computer science field as well as in statistics. They are well-suited to many contexts, in particular the unsupervised context, when the supervision (e.g., expert information required for the analysis in the broad sense) is missing, hidden or difficult to obtain, and are useful for many applications, including clustering and classification of heterogeneous data.

Latent data models, including (Bayesian) mixture model-based approaches and their Markovian extensions, are among the most popular and successful generative unsupervised learning approaches. They are widely used, in particular, in cluster analysis for automatically finding clusters, in discriminant analysis when the classes are dispersed, in non-linear regression when the response exhibits a non-stationary behavior conditional on predictors, and in segmentation for data arranged in sequences, etc. Fitting such models is the core of the analysis task and has led to an important body of research on deriving efficient algorithmic tools such as expectation-maximization (EM) algorithms or Markov Chain Monte Carlo (MCMC) sampling techniques in the Bayesian framework. Thanks to their flexibility and their sound statistical background, these successful latent data models, generally used in multivariate analysis in which the analyzed data are composed of individuals described by vectors, have received, and are still receiving, growing attention for adapting them to the framework of functional data analysis, in which the individuals are described by functions (e.g., curves, surfaces) rather than simple vectors. In many areas of application, including signal and image processing, functional imaging, handwritten text recognition, bio-informatics, etc., the analyzed data are indeed often available in the form of discretized values of functions or curves (e.g., time series, waveforms) and surfaces (e.g., 2D-images, spatio-temporal data), which makes them very structured and for which classical multivariate analyses are not suitable. This "functional" aspect of the data adds additional difficulties compared to the case of a classical multivariate (non-functional) analysis.
These modeling questions and methodological issues, as well as the related practical and computational issues, are at the core of my research. This manuscript synthesizes the research activities I conducted on the subject after my PhD thesis, defended in December 2010 at the Université de Technologie de Compiègne. My research activities are at the interface between applied mathematics (statistics) and computer science, with a special interest in statistical signal processing, and lie primarily in the multidisciplinary area of statistical learning and analysis of complex data. The data complexity aspect refers to temporal non-stationary data, high-dimensional multivariate and functional data, spatially structured data with possibly random effects, and non-normal, skewed and noisy (with atypical observations) data. They are more precisely structured around the following five research themes, summarized after a brief account of my thesis research. My publications are given in my personal bibliography provided in Chapter 8, as well as in the long French version of my CV: http://chamroukhi.univ-tln.fr/CV/FChamroukhi_CV_fr.pdf or its short English version http://chamroukhi.univ-tln.fr/CV/FChamroukhi_CV_en.pdf, which contain my other academic activities.

1.1 Contributions during my thesis (2007-2010)

Curve modeling and classification using hidden process regression

My thesis research mainly resulted in a contribution to the approximation and segmentation of temporal non-stationary data. This contribution addressed two main issues of the state-of-the-art approaches related to the subject, which provide either non-smooth approximations, or smooth approximations at a very high optimization cost: it provides smooth approximations thanks to the regression model with a hidden logistic process (RHLP), which allows controlling the smoothness in the underlying process governing the data, with a very reasonable complexity thanks to efficient maximum-likelihood fitting via the expectation-maximization (EM) algorithm. I also showed that this model presents a well-principled alternative for addressing the classical problem of nonlinear regression, by showing that the resulting RHLP regression function is asymptotically identical to the one obtained by the classical nonlinear regression model. The developments have also been of great interest and were successfully applied to, in particular, the diagnosis and the monitoring of the French high-speed railway system in a collaboration with the SNCF.

1.2 Contributions after my thesis (2011-2015)

1.2.1 Latent data models for non-stationary multivariate temporal data

This first theme of research after my thesis focuses on the modeling and joint segmentation of non-stationary multivariate time series which present regime changes. The main part of this research, initiated in 2010, was conducted in the framework of the PhD thesis of Dorra Trabelsi, defended in 2013 at Paris 12 University - LISSI Lab, which I co-supervised with Pr. Latifa Oukhellou, Pr. Yacine Amirat and Dr. Samer Mohammed. Motivated by a problem of recognition of human activities from acceleration data from on-body sensors in the framework of assistive robotics, I reformulated the problem, generally addressed from a supervised perspective, into the methodological one of joint segmentation of multiple time series with hidden process regression. I proposed a new unsupervised generative modeling based on two latent data models. The multiple hidden Markov model regression (MHMMR) naturally allows recovering the underlying states (i.e., activities) governing the data via the Markov chain, and provides a better fit for these structured data thanks to the conditional regression density for each state, compared to standard Markovian modeling and other unsupervised and supervised techniques. Furthermore, I proposed the multiple regression model with hidden logistic process (MRHLP), which tackles the problem from the same generative point of view, but with better theoretical modeling capabilities as in the univariate RHLP, that is, particularly the possibility of better and explicitly addressing possible smoothness in the time series, and with no restrictions on the state modeling, that is, a state residence time which is not necessarily geometrically distributed as in the Markovian case. For the inference of the proposed models, I developed dedicated EM algorithms for MLE, which were successfully applied to the real-world problem of human activity recognition and provided very interesting results.

1.2.2 Functional data analysis

Beyond the standard learning problem for the analysis of data which are univariate or multivariate variables, this research direction, which I have mainly conducted since the end of my thesis and pursued, under additional modeling considerations, until now, focuses on functional data analysis (FDA). The key tenet of the modeling and analyses I perform in this research is that the unit of information (individual) is a function (e.g., curves, time series, surfaces). This is a learning problem for structured data analysis in which I essentially seek to provide solutions to the problems of classification, supervised or not (e.g., segmentation, clustering), of such complex data. I considered namely complex functional data arising as time series presenting regime changes and organized in groups. The considered functional data exhibit a hidden complex structure in two respects, that is, on the one hand, the dynamical behavior within the individuals generated by regime changes, and, on the other hand, a grouping aspect between individuals, that is, a clustering property. Moreover, I also considered modeling data presenting random effects. I proposed latent data models, particularly dedicated functional mixture models, to accommodate the complexity of the data density. The models rely on hierarchical mixtures with regression components governed by hidden processes and were applied to the classification and the segmentation of functional data. More specifically, the developed models are: the mixture of piecewise regressions (MixPWR), the mixture of hidden Markov model regressions (MixHMMR), and the mixture of regressions with hidden logistic process (MixRHLP), in which the classification is performed directly in the space of curves. I also showed that the probabilistic formulation of such functional models, naturally optimized with EM-type algorithms, generalizes alternatives based on the optimization of distortion criteria (i.e., K-means based functional methods). The models, first applied to clustering, are, thanks to their generative formulation, naturally used in discriminant analysis for functional data, particularly when the classes are dispersed, by proposing functional (mixture) discriminant analyses (FMDA) where the classification is performed directly in the space of curves and accounts for the structure of the curves.

1.2.3 Bayesian regularization of mixtures for functional data analysis

In this third axis, which I initiated in 2013, I consider the subject of fitting latent data models for FDA, and I aim at answering the two following main questions: i) how to automatically and simultaneously fit the model and its structure, and ii) how to overcome issues encountered in the ML point estimation framework, regarding possible singularities and degeneracies of the ML estimator. More precisely, in a first stage, I was interested in constructing fully unsupervised regression mixture models for univariate functional data, that is, where both the number of mixture components and the parameters are unknown and have to be inferred from the data. I handled this problem by proposing a penalized maximum likelihood estimation framework carried out via a robust EM-like algorithm, which namely encourages sparse structures. Both the parameters and the model structure are simultaneously inferred from the data as the learning proceeds. As such, the proposal constitutes an alternative to conventional approaches to mixture-model based functional data clustering, where model selection is performed in a two-step strategy by selecting a model from several pre-estimated candidate models. The effectiveness of the algorithm has been shown in applications to functional data classification from various fields. Then, since 2014, I have considered the problem of fitting latent data models for FDA, treated in the two previous directions for some of the models considered here; the angle of the approach is different though, since here I work entirely in the Bayesian framework, that is, maximum a posteriori (MAP) estimation via Bayesian sampling techniques, in particular Markov Chain Monte Carlo, to sample directly from the posterior distribution of the parameters. Furthermore, here I considered the problem of spatial functional data analysis, where the data are surfaces, rather than univariate or multivariate curves as before. I introduced a Bayesian spatial spline regression model with mixed-effects (BSSR) for modeling spatial functional data. The model accommodates both a common mean behavior for the data, through a fixed-effects part, and inter-individual variability, thanks to a random-effects part. Then, in order to model populations of spatial functional data arising from heterogeneous groups, I proposed a Bayesian mixture of spatial spline regressions with mixed-effects (BMSSR) for density estimation and model-based surface clustering. The two models, through their Bayesian formulation, allow integrating possible prior knowledge on the data structure and constitute a Bayesian alternative to the recent mixture of spatial spline regressions model estimated in a maximum likelihood framework via the EM algorithm. The Bayesian model inference is performed by Gibbs sampling and is applied to surface approximation as well as to a problem of handwritten digit recognition using the MNIST data set. The obtained results highlight the potential benefit of the proposed Bayesian approaches.

1.2.4 Bayesian non-parametric parsimonious mixtures for multivariate data

I initiated this research direction in 2012 with the beginning of the PhD thesis of Marius Bartcus, for whom I was the principal supervisor, in collaboration with Pr. Hervé Glotin. The PhD defense is scheduled for October 26th, 2015. In this research theme, I investigated the problem of fitting mixtures and model-based clustering from a Bayesian point of view, where the aim is to deal with limitations of the previously and classically studied MLE-based approaches. I studied the case of Bayesian mixture fitting by investigating two sub-axes. The first one corresponds to the investigation of Bayesian finite mixtures and their inference using mainly MCMC sampling, with a particular focus on finite parsimonious mixtures, which offer great modeling flexibility. The second one addresses the problem from a non-parametric perspective by investigating the Dirichlet process mixture derivation for Bayesian mixtures, which can be interpreted as infinite mixture models. I then introduced Bayesian non-parametric parsimonious mixture models by relying on general flexible priors, that is, Dirichlet processes, or equivalently the Chinese Restaurant Process. The developed DPPM models provide a flexible framework for modeling different data structures as well as a well-principled alternative to tackle the problem of model selection. The derived Gibbs sampling techniques to infer the models, and the Bayes factors used for model selection and comparison, perform well on several real data sets and have in particular shown very interesting results on a challenging problem of unsupervised bioacoustic signal decomposition.

1.2.5 Non-normal mixtures of experts

The models previously developed in my research, as well as those classically used in learning for the analysis of continuous data, are usually, or at least very often, based on the normal hypothesis regarding the distribution of the data or of a group of the data. In this research direction, which I initiated very recently in May 2015, I attempt to overcome the well-known limitations of modeling with the normal distribution in terms of unsuitability to heavy-tailed data, skewed data, and data possibly affected by outliers. I particularly focused on the problem of mixture modeling in such situations, including for model-based clustering and for fitting non-linear regression. I investigated the framework of non-normal mixture models for density estimation and clustering of regression data, particularly mixtures of experts (MoE), a popular framework for modeling heterogeneity in data in the computer science field, in particular in machine learning, as well as in statistics. I proposed three non-normal and robust derivations of the standard normal mixture of experts model, which can deal with possibly skewed, heavy-tailed data and are robust to atypical data. The proposed models are the skew-normal MoE (SNMoE), the robust t MoE (TMoE) and the skew-t MoE (STMoE). I developed dedicated expectation-maximization (EM) type algorithms for ML fitting of the models. The results obtained on simulations and real-world data are very interesting and confirm that the proposed MoE models are well-suited and robust generalizations of the standard normal MoE.

1.2.6 Applications

The contributed models have led to the development of numerous codes and were applied in at least the following structuring projects and applications.

SwitchRdf project (2007-2010): This project constituted the applicative context of my PhD research and was a collaboration between IFSTTAR and the Heudiasyc Lab, where I did my PhD, and the SNCF (the French railway company). The objectives were the diagnosis and the monitoring of the high-speed railway switches based on temporal data of switch operations.

Human activity recognition (2010-2013): This application concerns the problem of human activity recognition, which is central for understanding and predicting human behavior, in particular with a view to assistive services to humans, such as health monitoring, well-being, security, etc.
It was conducted with the LISSI Lab/Université Paris-Est Créteil (Paris 12), mainly in the framework of the PhD thesis of Dorra Trabelsi. The aim was the analysis of multidimensional raw acceleration data measured using body-worn sensors.

Bio-acoustic signal decomposition (2012-2015): SABIOD - Scaled Acoustic BIODiversity is a CNRS MASTODONS project coordinated by LSIS, started in 2012, which calls for learning techniques to analyze bioacoustic data, mainly songs of marine mammals (e.g., whales). The idea was to apply unsupervised Bayesian learning techniques to the decomposition of whale song signals in order to discover possible call units, which can be considered as a kind of whale alphabet.

Other projects: I was also a member of, and contributed to, other projects in our team, that is, the ANR (French research council) project COGNILEGO: from Pixels to Semantics (2011-2014) and the DGA-RAPID project PHRASE: Augmented Reality and Autonomous Perception (2012-2015). I also initiated the ANR proposal "LegoFit", submitted to the ANR in collaboration with Paris 13, Paris 5, IFSTTAR and AIRBUS (see the perspectives section). Additional information is available in my CV, in French (http://chamroukhi.univ-tln.fr/CV/FChamroukhi_CV_fr.pdf) or in English (short version, http://chamroukhi.univ-tln.fr/CV/FChamroukhi_CV_en.pdf).

Chapter 2

Latent data models for temporal data segmentation

Contents
    2.1 Introduction  7
        2.1.1 Personal contribution  8
        2.1.2 Problem statement  9
    2.2 Regression with hidden logistic process  9
        2.2.1 The model  9
        2.2.2 Maximum likelihood estimation via a dedicated EM  10
        2.2.3 Experiments  12
        2.2.4 Conclusion  13
        2.2.5 Multiple hidden process regression for joint segmentation of multivariate time series  13
    2.3 Multiple hidden logistic process regression  13
        2.3.1 The model  14
        2.3.2 Maximum likelihood estimation via a dedicated EM  14
        2.3.3 Application on human activity time series  15
        2.3.4 Conclusion  15
    2.4 Multiple hidden Markov model regression  16
        2.4.1 The model  17
        2.4.2 Maximum likelihood estimation via a dedicated EM  17
        2.4.3 Application on human activity time series  17
        2.4.4 Conclusion  18

Related PhD thesis:

[1] D. Trabelsi. Contribution à la reconnaissance non-intrusive d'activités humaines. Ph.D. thesis, Université Paris-Est Créteil, Laboratoire Images, Signaux et Systèmes Intelligents (LiSSi), June 2013.

Related journal papers:

[1] F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin. Time series modeling by a regression approach based on a latent process. Neural Networks, 22(5-6):593–602, 2009d. URL http://chamroukhi.univ-tln.fr/papers/Chamroukhi_Neural_Networks_2009.pdf
[2] F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin. A hidden process regression model for functional data description. Application to curve discrimination. Neurocomputing, 73(7-9):1210–1221, 2010. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_neucomp_2010.pdf
[3] F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin. Modèle à processus latent et algorithme EM pour la régression non linéaire. Revue des Nouvelles Technologies de l'Information (RNTI), S1:15–32, Jan 2011c. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_same_govaert_aknin_rnti.pdf
[4] F. Chamroukhi, D. Trabelsi, S. Mohammed, L. Oukhellou, and Y. Amirat. Joint segmentation of multivariate time series with hidden process regression for human activity recognition. Neurocomputing, 120:633–644, November 2013b. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_et_al_neucomp2013b.pdf
[5] D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. An unsupervised approach for automatic activity recognition based on Hidden Markov Model Regression. IEEE Transactions on Automation Science and Engineering, 3(10):829–335, 2013. URL http://arxiv.org/pdf/1312.6965.pdf
[6] F. Attal, M. Dedabrishvili, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. Physical human activity recognition using wearable sensors. Sensors, 2015. URL http://chamroukhi.univ-tln.fr/papers/Sensors-2015.pdf. Submitted.

Some related conference papers:

[1] F. Chamroukhi and H. Glotin. Mixture model-based functional discriminant analysis for curve classification. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), June 2012b.
[2] F. Chamroukhi, H. Glotin, and C. Rabouy. Functional Mixture Discriminant Analysis with hidden process regression for curve classification. In Proceedings of the XXth European Symposium on Artificial Neural Networks (ESANN), pages 281–286, Bruges, Belgium, April 2012b.
[3] F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin. Classification automatique de données temporelles en classes ordonnées. In Actes des 44èmes Journées de Statistique, Bruxelles, Belgique, Mai 2012c.

[4] D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. Supervised and unsupervised classification approaches for human activity recognition using body-mounted sensors. In Proceedings of the XXth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pages 417–422, Bruges, Belgium, April 2012.
[5] D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. Activity Recognition Using Hidden Markov Models. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, September 2011.

This first research theme concerns the modeling and segmentation of complex temporal data, univariate and multivariate, and directly follows some of the work I developed during my PhD thesis. This research axis can be organized into two sub-axes, which are developed in what follows. The first deals with latent process regression models for univariate time series and mainly resulted in three methodological articles [J-1] [J-2] [J-3]. The second is dedicated to latent data models for the joint segmentation of multivariate time series. This main part, initiated in 2010, was conducted in the framework of the PhD thesis of Dorra Trabelsi, defended in 2013 at Paris 12 University - LISSI Lab, which I co-supervised with Pr. Latifa Oukhellou, Pr. Yacine Amirat and Dr. Samer Mohammed. This work resulted in two methodological articles [J-7] [J-6], and the preprint [J-15] also issued from it.

2.1 Introduction

In many application domains of data analysis, the data to be analyzed are presented as time series (called signals, curves, etc. in other communities). Time series analysis is a popular problem with a broad literature and is studied by several scientific communities, including statistics, (statistical) signal processing, economics, as well as statistical learning. In this study, we particularly focus on complex non-stationary time series that present non-linearities through various regime changes. When analyzing such data, very often of large size, it is often necessary to approximate them in order to extract relevant knowledge, such as a relevant feature representation, a simplified model summarizing the data for prediction, a segmentation for further categorization, etc. In such a context of time series with regime changes, the problem of time series analysis is in general reformulated as a time series segmentation problem, possibly via models, parametric or not.

The general problem of time series segmentation has attracted great interest from different communities, including statistics, detection, signal processing, and machine learning. Earlier contributions on the subject were made from a statistical point of view; one can cite, for example, the following references, among many others (McGee and Carleton, 1970; Rabiner and Juang, 1986; Brailovsky and Kempner, 1992; Basseville and Nikiforov, 1993; Eamonn Keogh and Pazzani, 1993; Fridman, 1993; Fearnhead, 2006; Fearnhead and Liu, 2007; Dobigeon et al., 2007). The problem can be stated as a detection problem via hypothesis testing, as in Basseville and Nikiforov (1993). This in general requires a detection threshold to reject the null hypothesis. In addition, hypothesis testing is often used in a binary setting; the problem of multiple hypothesis testing, which is the case in this multi-class segmentation problem, is less common, although one can take the hypotheses to be tested in pairs independently.

The time series modeling for segmentation can also be handled via piecewise regression (PWR), which goes back to McGee and Carleton (1970), and which partitions a time series into several regimes (segments) where each regime is assumed to be a noisy constant function, or a polynomial function as in Brailovsky and Kempner (1992), each regime being characterized by its mean and possibly its own noise variance in the case of a heteroskedastic PWR model, and being activated in a time range defined by its boundary transition points. In such a model, which is perhaps the most frequently used one in time series segmentation, at least from a statistical inference point of view, the problem of time series segmentation becomes the one of estimating the regression parameters and the change points. Thanks to the additivity of the error criterion, the usual tool for parameter estimation of the PWR model is dynamic programming (Bellman, 1961; Stone, 1961), which provides an optimal segmentation. However, dynamic programming may be computationally expensive, especially for time series with a large number of observations.

The other alternative, which I mainly consider in this analysis, is the framework of statistical learning of latent data models. Recall that latent data models, which go back to Spearman (1904) with factor analysis, are statistical models that aim at representing the distribution of the observed data in terms of a number of latent (hidden, unknown, missing) variables.
Mixture models (Titterington et al., 1985; McLachlan and Peel, 2000; Frühwirth-Schnatter, 2006) and hidden Markov models (HMMs) (Rabiner and Juang, 1986; Rabiner, 1989; Frühwirth-Schnatter, 2006) are two well-known and widely used examples of such models. Indeed, in this framework of regime-changing time series, it is natural to think that the observed time series is generated by an underlying stochastic process with several, possibly parametric, states. The problem of time series modeling and segmentation therefore becomes the one of recovering the underlying process and inferring the statistical parameters of each of its states. The classical model in that case is the well-known hidden Markov model (HMM) (Rabiner and Juang, 1986; Rabiner, 1989), which assumes that the observed time series is generated by a hidden state sequence following a Markov chain; for a time series segmentation problem, each regime might be associated with a state. A classical assumption for such analysis is to consider that, conditional on each state, the observation has a Gaussian density with constant or vectorial mean. Another way, for time series with more structured regimes, is to use HMM regression (HMMR) (Fridman, 1993), which is a formulation of the standard HMM in a regression context. The parameters of such HMM-based models are estimated using the well-adapted expectation-maximization algorithm (Dempster et al., 1977), known as the Baum-Welch algorithm (Baum et al., 1970) in the context of HMMs. The problem of the analysis of time series with multiple change points based on Markov processes has also been considered from a Bayesian perspective, by using MCMC sampling as in Fearnhead (2006) and sequential MCMC for online change-point detection as in Fearnhead and Liu (2007). Another Bayesian sampling approach is the one based on a hierarchical Bayesian model for the joint segmentation of multidimensional astronomical time series (Dobigeon et al., 2007). These approaches were out of the scope of my work at the time I developed the proposed method, since I mainly focused on the frequentist paradigm of inference; moreover, as they use MCMC sampling, they can be limiting in terms of computational time compared to the optimization methods I developed using deterministic EM algorithms. In addition, the previously described approaches particularly concern abrupt change-point detection. Hence, if we are in a situation with nonlinear regimes whose changes may be smooth and/or abrupt, such models, particularly the piecewise regression model and the HMM-based model, are not very appropriate, in particular for providing a regular approximation of the time series. In such a context, the aim of the analysis might be two-fold, that is, i) to build a dedicated generative model, possibly parametric, to capture the dynamical behavior of the data, mainly through detecting the temporal regime locations, while ii) providing a precise approximation to preserve a relevant data representation. So, in summary, I focused on approaches that address these issues by naturally thinking of a curve or a time series as generated by a process with several states. Conventional solutions to find these states are generally subject to limitations in the control of the transitions between these states, leading to a non-smooth approximation. One can force the resulting approximation to be regular, but this leads to combinatorial optimization problems for the optimal choice of the positions of the regime transitions. Relaxing the regularity conditions leads to efficient dynamic programming algorithms, but also to non-smooth approximations. In what I proposed, I particularly relied on generative parametric latent data modeling, given the flexibility and easy interpretation of the generative framework, which helps understanding the underlying processes generating the data. In addition, a generative model is in general directly usable for classification and clustering purposes.

2.1.1 Personal contribution

My personal contribution consists in the study of generic generative regression models for both curve approximation and segmentation. I also studied the implementation of efficient estimation algorithms, applied them to real data, and evaluated them against the alternatives. By tackling the problem from a statistical generative modeling perspective, the regression model that incorporates a discrete hidden logistic process (RHLP), presented in [J-1] for example, addresses these two issues regarding accurate regular curve approximation and segmentation. The RHLP, which represents a dynamical mixture model, allows for activating, simultaneously and preferentially, time-varying polynomial regression models with both smooth and abrupt regime changes. Then, as shown in [J-3], the RHLP model, by producing a smooth approximation, can be an alternative for solving the classical problem of nonlinear regression. I proposed two variants of efficient model inference algorithms, by monotonically maximizing the observed-data (respectively complete-data) likelihood with a dedicated EM algorithm as in [J-1] (respectively a classification EM (CEM) algorithm [C-1]). The proposal provides results clearly better than those of standard methods, on real-world data arising mainly from the diagnosis of complex systems [J-1] [C-13] [C-14] [C-16] [C-17] [C-18] and from an energy application for fuel cell lifetime estimation [C-15]. Then, I studied the problem of modeling and joint segmentation of multivariate temporal data and proposed two multiple hidden process regression models. I extended the previously developed univariate RHLP model to the multivariate case to develop a multiple RHLP (MRHLP) model, as in [J-6]. I also investigated the use of HMMs in such a regression context and proposed, when the main aim is the segmentation, the multiple hidden Markov model regression (MHMMR), as in [J-7], which is better adapted to these structured time series thanks to the conditional regression density, compared namely to standard Markovian modeling. Both the MRHLP and the MHMMR models are naturally tailored to recover the underlying states governing these non-stationary time series via efficient EM algorithms.


The proposed models were successfully applied to a real-world problem of human activity recognition in an assistive robotics context, and both provide better results compared to standard unsupervised and supervised techniques for activity recognition, as shown in [J-6] [J-7] [J-15] [C-10] [C-12].

The remainder of this chapter is organized as follows. In Section 2.2, I first present the first theme regarding the approximation and the segmentation of univariate time series, mainly by developing the regression model with hidden logistic process (RHLP). Then, I present the second theme, which focuses on the modeling and joint segmentation of multiple time series with regime changes, by developing two latent data models, that is, the multiple regression model with hidden logistic process (MRHLP) presented in Section 2.3 and the multiple hidden Markov model regression (MHMMR) presented in Section 2.4.

2.1.2 Problem statement

The developed models are based on (multiple) regression with hidden processes. The aim of regression is to explore the relationship of an observed random variable Y given a covariate vector X ∈ R^p via conditional density functions for Y|X = x of the form f(y|x), rather than only exploring the unconditional distribution of Y. For time series, the covariate x in general relates to the sampling time t, which we will consider hereafter. We are interested in parametric (non-)linear regression functions f(y|x) = µ(x; β). Let y = (y_1, ..., y_n) be a time series composed of n univariate observations y_i ∈ R (i = 1, ..., n) observed at the time points t = (t_1, ..., t_n).

2.2 Regression with hidden logistic process

2.2.1 The model

The regression model with a hidden logistic process (RHLP) assumes that the observed time series is governed by a K-state hidden process z = (z_1, ..., z_n), where the categorical random variable z_i ∈ {1, ..., K} represents the unknown (hidden) label of the state (or class), here associated with a regime, generating the ith observation y_i. Each regime is modeled by a Gaussian polynomial regression model. Thus, the ith observation of the time series is modeled as

    y_i = \beta_{z_i}^T x_i + \sigma_{z_i} \epsilon_i; \quad \epsilon_i \sim \mathcal{N}(0, 1), \quad (i = 1, \dots, n)    (2.1)

where \beta_{z_i} ∈ R^{p+1} is the regression coefficient vector of the polynomial regression model characterizing regime z_i, p being the polynomial degree, x_i = (1, t_i, ..., t_i^p)^T ∈ R^{p+1} is the time-dependent covariate vector at time t_i, the \epsilon_i are independent standard zero-mean Gaussian variables representing an additive noise, and \sigma_{z_i}^2 is the associated noise variance. Under the RHLP, the process Z = (Z_1, ..., Z_n) is assumed to be logistic, that is, the hidden variable Z_i, which allows for the switching from one regression model to another, given t_i, is generated independently according to the multinomial distribution M(1, \pi_1(t_i; w), ..., \pi_K(t_i; w)), where the component probabilities have a logistic form:

    \pi_k(t_i; w) = \mathbb{P}(Z_i = k \mid t_i; w) = \frac{\exp(w_k^T v_i)}{\sum_{\ell=1}^{K} \exp(w_\ell^T v_i)},    (2.2)

where v_i = (1, t_i, ..., t_i^u)^T ∈ R^{u+1} is a time-dependent covariate vector, w_k ∈ R^{u+1} is the coefficient vector associated with v_i, and w = (w_1^T, ..., w_{K-1}^T)^T ∈ R^{(K-1)×(u+1)} is the parameter vector of the logistic model, with w_K being the null vector. This modeling with the logistic distribution allows activating, simultaneously and preferentially, several regimes during time, and hence switching from one regime to another over time. The relevance and flexibility of the logistic distribution for modeling the dynamical behavior within a time series, through accurately capturing smooth or abrupt regime transitions, is discussed in Section 4.1 of [J-1]. In particular, if the goal is to segment the curves into contiguous segments, one just needs to use linear logistic functions, that is, to set the value u of the dimension of w_k to 1, which leads to linear logistic regression; this will be assumed hereafter. From the above, the observation y_i at each time point t_i is distributed according to the following dynamical conditional mixture density:

    f(y_i \mid t_i; \theta) = \sum_{k=1}^{K} \pi_k(t_i; w) \, \mathcal{N}(y_i; \beta_k^T x_i, \sigma_k^2),    (2.3)

where \theta = (w^T, \beta_1^T, ..., \beta_K^T, \sigma_1^2, ..., \sigma_K^2)^T is the unknown parameter vector to be estimated. In the RHLP model (2.3), both the mixing proportions and the component parameters are time-varying, contrary to, for example, standard switching regression models or mixtures of regressions (Quandt, 1972; Quandt and Ramsey, 1978; Veaux, 1989). It can be seen as a mixture of experts (ME) (Jordan and Jacobs, 1994) in which the logistic weights are time-dependent, that is, the particular covariate used for the mixing proportions is the time.
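To make the model concrete, the following is a minimal Python sketch (assuming NumPy and SciPy; all function names are mine, not taken from the manuscript or any accompanying code) of the polynomial design matrices, the logistic gating probabilities (2.2), and the RHLP conditional mixture density (2.3). Here w is stored as a (K, u+1) array whose last row is fixed to zero, matching the constraint that w_K is the null vector.

    import numpy as np
    from scipy.stats import norm

    def design_matrices(t, p, u):
        """Polynomial design matrices: X for the regression part (degree p),
        V for the logistic gating part (degree u)."""
        X = np.vander(t, p + 1, increasing=True)      # (n, p+1), columns 1, t, ..., t^p
        V = np.vander(t, u + 1, increasing=True)      # (n, u+1)
        return X, V

    def logistic_proportions(V, w):
        """pi_k(t_i; w) of Eq. (2.2); w has shape (K, u+1), last row fixed to zero."""
        eta = V @ w.T                                 # (n, K) linear predictors
        eta -= eta.max(axis=1, keepdims=True)         # numerical stability
        e = np.exp(eta)
        return e / e.sum(axis=1, keepdims=True)

    def rhlp_density(y, X, V, w, beta, sigma2):
        """Mixture density f(y_i | t_i; theta) of Eq. (2.3) for every sample i.
        Also returns the gating probabilities and component densities for reuse."""
        pi = logistic_proportions(V, w)               # (n, K)
        mu = X @ beta.T                               # (n, K) component means beta_k^T x_i
        comp = norm.pdf(y[:, None], loc=mu, scale=np.sqrt(sigma2)[None, :])
        return (pi * comp).sum(axis=1), pi, comp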

2.2.2 Maximum likelihood estimation via a dedicated EM

The parameter vector \theta is estimated by monotonically maximizing the observed-data likelihood. Under the RHLP model, the maximization of the log-likelihood of \theta, which is given by:

    \log L(\theta; y, t) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k(t_i; w) \, \mathcal{N}(y_i; \beta_k^T x_i, \sigma_k^2)    (2.4)

cannot be performed in closed form since the data are incomplete, that is, the labels (z_1, ..., z_n) indicating from which component each observation of the time series originates are unknown. However, in this latent data framework, the expectation-maximization (EM) algorithm (Dempster et al., 1977; McLachlan and Krishnan, 2008) is particularly adapted to this task. By artificially completing the observed data with the indicator variables z_{ik} such that z_{ik} = 1 if z_i = k (i.e., when y_i is generated by the kth regression model), and 0 otherwise, the EM algorithm monotonically maximizes \log L(\theta; y, t) (2.4) iteratively by alternating between the two following steps until convergence (see for example [J-1] for more details).
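As a small complement to the sketch above (same hypothetical helper names), the observed-data log-likelihood (2.4) can be evaluated directly from the mixture density:

    def rhlp_loglik(y, X, V, w, beta, sigma2):
        """Observed-data log-likelihood (2.4): sum over i of log f(y_i | t_i; theta)."""
        f, _, _ = rhlp_density(y, X, V, w, beta, sigma2)
        return np.sum(np.log(f))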

The E-step computes the expected complete-data log-likelihood (also often called the Q-function), given the observations (t, y) and the current parameter value \theta^{(q)}, q being the current iteration:

    Q(\theta, \theta^{(q)}) = \mathbb{E}\big[\log L_c(\theta; y, t, z) \mid y, t; \theta^{(q)}\big] = \sum_{i=1}^{n} \sum_{k=1}^{K} \tau_{ik}^{(q)} \log\big[\pi_k(t_i; w) \, \mathcal{N}(y_i; \beta_k^T x_i, \sigma_k^2)\big],    (2.5)

which simply requires the computation of the posterior probability \tau_{ik}^{(q)} that y_i (i = 1, ..., n) originates from component k (k = 1, ..., K):

    \tau_{ik}^{(q)} = \mathbb{P}(Z_i = k \mid y_i, t_i; \theta^{(q)}) = \frac{\pi_k(t_i; w^{(q)}) \, \mathcal{N}(y_i; \beta_k^{(q)T} x_i, \sigma_k^{2(q)})}{\sum_{\ell=1}^{K} \pi_\ell(t_i; w^{(q)}) \, \mathcal{N}(y_i; \beta_\ell^{(q)T} x_i, \sigma_\ell^{2(q)})} \cdot    (2.6)
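Using the same hypothetical helpers as above, the E-step (2.6) amounts to normalizing the weighted component densities row by row; a possible sketch:

    def e_step(y, X, V, w, beta, sigma2):
        """E-step: posterior membership probabilities tau_ik of Eq. (2.6)."""
        _, pi, comp = rhlp_density(y, X, V, w, beta, sigma2)
        num = pi * comp                               # pi_k(t_i; w) * N(y_i; beta_k^T x_i, sigma_k^2)
        return num / num.sum(axis=1, keepdims=True)   # (n, K) posterior probabilities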

The M-step computes the parameter vector update \theta^{(q+1)} by maximizing the expected complete-data log-likelihood, that is, \theta^{(q+1)} = \arg\max_{\theta \in \Theta} Q(\theta, \theta^{(q)}), where \Theta is the parameter space. The maximization of the Q-function w.r.t. the regression coefficient vector \beta_k for each component k consists in analytically solving a weighted least-squares problem, and the one w.r.t. \sigma_k^2 is a weighted variant of the problem of estimating the variance of a univariate Gaussian density. In both cases the weights are the posterior membership probabilities \tau_{ik}^{(q)}, and the updates of these two parameters are given by

    \beta_k^{(q+1)} = \Big[\sum_{i=1}^{n} \tau_{ik}^{(q)} x_i x_i^T\Big]^{-1} \sum_{i=1}^{n} \tau_{ik}^{(q)} y_i x_i,    (2.7)

    \sigma_k^{2(q+1)} = \frac{1}{\sum_{i=1}^{n} \tau_{ik}^{(q)}} \sum_{i=1}^{n} \tau_{ik}^{(q)} \big(y_i - \beta_k^{(q+1)T} x_i\big)^2.    (2.8)

The maximization of the Q-function with respect to w is a multinomial logistic regression problem weighted by the \tau_{ik}^{(q)}'s, and it cannot be solved in closed form. It is solved with a multi-class Iteratively Reweighted Least Squares (IRLS) algorithm (Green, 1984; Chen et al., 1999a), see for example [C-14], which is equivalent to the Newton-Raphson algorithm, and in which the parameter w is updated as follows:

    w^{(l+1)} = w^{(l)} - \Big[\frac{\partial^2 Q_w(w, \theta^{(q)})}{\partial w \, \partial w^T}\Big]^{-1}_{w = w^{(l)}} \frac{\partial Q_w(w, \theta^{(q)})}{\partial w}\Big|_{w = w^{(l)}}    (2.9)


where \partial^2 Q_w(w, \theta^{(q)}) / \partial w \partial w^T and \partial Q_w(w, \theta^{(q)}) / \partial w are respectively the Hessian matrix and the gradient of Q_w(w, \theta^{(q)}). The parameter update w^{(q+1)} is taken at convergence of the IRLS algorithm (2.9). Note that one can limit the number of iterations of the IRLS procedure within the EM algorithm. This version consists in only increasing the criterion Q_w(w, w^{(q)}) at each EM iteration rather than maximizing it. One can, for example, limit the number of IRLS iterations to a single iteration. This scheme yields a Generalized EM (GEM) algorithm (Dempster et al., 1977; McLachlan and Krishnan, 2008), which has the same convergence properties as the EM algorithm. However, in practice, the IRLS is very fast and takes only up to forty iterations for the first three or four EM iterations and then only up to three or four iterations. The time complexity of the E-step of this EM algorithm is O(Kn). The calculation of the regression coefficients in the M-step requires the computation and the inversion of the square matrix X^T X, which is of dimension p + 1, and a multiplication by the observation sequence of length n, which has a time complexity of O(p^3 n). In addition, each IRLS loop requires an inversion of the Hessian matrix, which is of dimension (u + 1) × (K − 1). Therefore, for a small u (here we used u = 1), the complexity of the IRLS loop is approximately O(I_IRLS K^2), where I_IRLS is the average number of iterations required by the internal IRLS algorithm. The proposed algorithm is therefore performed with a time complexity of O(I_EM I_IRLS K^3 p^3 n), where I_EM is the number of iterations of the EM algorithm.
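A possible sketch of the corresponding M-step, building on the helpers above: the weighted least-squares updates (2.7)-(2.8) are computed in closed form, while for the logistic parameters w (updated by the Newton-type IRLS recursion (2.9) in the manuscript) only a plain gradient-ascent step on Q_w is shown here for brevity; this is a simplification of my own, not the author's exact procedure.

    def m_step_regression(y, X, tau):
        """M-step updates (2.7)-(2.8): weighted least squares for beta_k and
        weighted variance for sigma_k^2, one component at a time."""
        n, K = tau.shape
        beta = np.zeros((K, X.shape[1]))
        sigma2 = np.zeros(K)
        for k in range(K):
            Wk = tau[:, k]
            XtWX = X.T @ (Wk[:, None] * X)            # sum_i tau_ik x_i x_i^T
            XtWy = X.T @ (Wk * y)                     # sum_i tau_ik y_i x_i
            beta[k] = np.linalg.solve(XtWX, XtWy)
            resid = y - X @ beta[k]
            sigma2[k] = np.sum(Wk * resid**2) / Wk.sum()
        return beta, sigma2

    def m_step_logistic_gradient_step(V, tau, w, lr=0.1):
        """One gradient-ascent step on Q_w; a stand-in for the IRLS update (2.9)."""
        pi = logistic_proportions(V, w)
        grad = (tau - pi).T @ V                       # dQ_w/dw_k = sum_i (tau_ik - pi_ik) v_i
        w_new = w + lr * grad
        w_new[-1] = 0.0                               # keep w_K fixed to the null vector
        return w_new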

Time series approximation and segmentation

Given the MLE \hat{\theta}, the time series approximation under the RHLP model is given by its mean

    \mathbb{E}[y_i \mid t_i; \hat{\theta}] = \int_{\mathbb{R}} y_i \, p(y_i \mid t_i; \hat{\theta}) \, dy_i = \sum_{k=1}^{K} \pi_k(t_i; \hat{w}) \, \hat{\beta}_k^T x_i \quad (i = 1, \dots, n)    (2.10)

which results in a smooth and flexible approximation thanks to the flexibility of the logistic weights, as illustrated namely in [J-1]. Thus, as proved in [J-3], if we consider the classical nonlinear regression model y_i = f(t_i; \theta) + \epsilon_i (Antoniadis et al., 1992) and, in order to cover a broad range of non-linear regression functions that are easily parameterizable, regression functions which can be written in the form of a finite sum of weighted polynomials with logistic weights, f(t_i; \theta) = \sum_{k=1}^{K} \pi_k(t_i; w) \beta_k^T x_i, then, thanks to the desirable asymptotic properties of the MLE of \theta, the regression function (2.10) estimated under the RHLP model is asymptotically identical to the one obtained from the classical nonlinear regression model. This strengthens the claim that the proposed model can be a good, well-principled alternative to solve the nonlinear regression problem, provided a suitable algorithm for parameter estimation is available, which is the case with the EM algorithm here. On the other hand, a time series segmentation can be obtained by computing the estimated label \hat{z}_i of the polynomial regime generating y_i according to the following rule:

    \hat{z}_i = \arg\max_{1 \le k \le K} \pi_k(t_i; \hat{w}), \quad (i = 1, \dots, n).    (2.11)

We showed that applying this rule guarantees that the curves are segmented into contiguous segments if the probabilities \pi_k are computed with dimension u = 1 for w_k (k = 1, ..., K); the separation between the polynomial regimes is then linear in t.
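Given fitted parameters, the approximation (2.10) and the segmentation (2.11) are direct to compute; a sketch with the same hypothetical helpers as above:

    def rhlp_fitted_curve_and_segmentation(X, V, w_hat, beta_hat):
        """Approximation (2.10) and segmentation (2.11) from the fitted parameters."""
        pi = logistic_proportions(V, w_hat)            # (n, K)
        y_hat = np.sum(pi * (X @ beta_hat.T), axis=1)  # smooth mean curve, Eq. (2.10)
        z_hat = np.argmax(pi, axis=1)                  # hard segmentation, Eq. (2.11)
        return y_hat, z_hat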

Model selection

In a general application of the proposed model, the optimal values of (K, p, u) can be computed by using the Bayesian Information Criterion (BIC) (Schwarz, 1978), which is in our case defined by

    \mathrm{BIC}(K, p, u) = \log L(\hat{\theta}) - \frac{\nu_\theta \log(n)}{2}

where \nu_\theta = K(p + u + 3) - (u + 1) is the number of free parameters of the model and \log L(\hat{\theta}) is the observed-data log-likelihood obtained at convergence of the EM algorithm. Note that in the case of contiguous segmentation (u = 1), which we adopt here, the dimension of the parameter space reduces to \nu_\theta = K(p + 4) - 2.
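A one-line sketch of this criterion (same assumptions as the previous snippets; the log-likelihood value would come from the EM run, e.g. from the rhlp_loglik helper above):

    def rhlp_bic(loglik, K, p, u, n):
        """BIC(K, p, u) = log L(theta_hat) - nu_theta * log(n) / 2; larger is better here."""
        nu_theta = K * (p + u + 3) - (u + 1)           # reduces to K*(p+4) - 2 when u = 1
        return loglik - nu_theta * np.log(n) / 2.0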

In this particular regression model, the variable z_i controls the switching from one regression model to another, among K regression models, at each time t_i. Therefore, unlike basic polynomial regression models, which can be seen as stationary models as they assume uniform regression parameters over time, the proposed dynamical regression model allows the polynomial coefficients to vary over time by switching from one regression model to another. This modeling is therefore beneficial for capturing the non-stationary behavior induced by regime changes within a curve.


2.2.3 Experiments

In [J-1] [C-14] [C-16], I evaluated the RHLP approach on simulated and real data and compared it to alternatives, including piecewise regression and univariate HMM regression. Two evaluation criteria were used: the mean squared error between the true simulated mean curve (before noise is added) and the estimated mean curve, used to assess the models with regard to approximation of the true curve; and the segmentation error rate between the true simulated partition and the estimated one, used to assess the models with regard to time series segmentation. Three experiments were performed: the first aims to observe the effect of the smoothness level of transitions on estimation quality, by varying this level from abrupt to very smooth changes, for different data situations; the second aims to observe the effect of the sample size n on estimation quality (to observe the convergence of the MLE of the model); and the third aims to observe the effect of the noise level σ. In terms of modeling and segmentation, while the results are closely similar when the transitions are abrupt, the proposed approach clearly outperforms the alternatives for smooth transitions in all situations. The proposed approach performs the time series segmentation and approximation better than the piecewise regression and HMMR approaches. We also observed that the approximation error and the segmentation error rate decrease when the sample size n increases for the proposed model, which provides more accurate results than the piecewise and HMMR approaches. When the noise level increases, the RHLP provides more stable results than the two other alternatives. In terms of computing time, as shown for example in [C-14], the proposed EM algorithm for the RHLP is clearly faster than using dynamic programming for piecewise regression, slightly faster than the EM for HMM regression, and faster than both when derived into a CEM version as in [C-1]. The proposed RHLP was also applied to the approximation of real time series arising from a complex system diagnosis application [C-18] [C-17] [C-14] [J-1]. The data are signals of the power consumed during high-speed railway switch operations; each operation signal is composed of five successive movements, each of them associated with a regime in the time series model. The provided results are very satisfactory in terms of both segmentation and approximation; the model retrieves the actual phases precisely (see an illustrative example in Figure 2.1). This result has also proven relevant for signal representation in a classification perspective. Indeed, for example in [J-1], we considered the RHLP model in a classification context to diagnose the railway switch by predicting the class label of a new measured signal based on a probabilistic discrimination model trained on a labeled training set of curves.


Figure 2.1: Results obtained with the proposed RHLP on a real switch operation time series: The signal and the polynomial regimes (left), the corresponding estimated logistic proportions (middle) and the obtained mean curve (right).

The RHLP was used first to extract features from the data, each feature vector being the MLE of the RHLP model parameters for a given signal. Then, the estimated features were used to train a mixture discriminant analysis (MDA) (Hastie and Tibshirani, 1996) classifier, that is, by using a Gaussian mixture as class-conditional density. The Bayes rule is finally applied for class prediction (covering three classes: without defect, with minor defect, and with critical defect). The results in terms of correct classification rates confirm

1Please note that, in this experiment as well as throughout the whole manuscript, I try to summarize the (numerous) experiments as best as possible and only provide some graphical illustrations, at places where I think this may help understanding. The complete results, for all the considered data sets, are available in the cited references.

that the RHLP can be a good representation of such time series to be used as input for external classifiers (the gain is about 8% compared to PWR and 3% compared to HMM regression). In [J-3], I considered the RHLP as an alternative for the estimation of the classical nonlinear regression model, by covering a broad range of nonlinear regression functions that are easily parameterizable and can be written as a finite sum of polynomials weighted by logistic weights. The provided results, compared to other alternatives, confirmed this claim.
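As an illustration of this discrimination scheme, the sketch below fits one Gaussian mixture per class on curve-level feature vectors and applies the Bayes rule. It assumes the RHLP features have already been extracted (the arrays X_train, y_train, X_test are hypothetical) and uses scikit-learn's GaussianMixture as a stand-in for the MDA class-conditional densities; it is only indicative of the approach of [J-1], not its implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_mda(X_train, y_train, n_components=2, seed=0):
    """Fit one Gaussian mixture per class (mixture discriminant analysis style)."""
    classes = np.unique(y_train)
    models, priors = {}, {}
    for c in classes:
        Xc = X_train[y_train == c]
        models[c] = GaussianMixture(n_components=n_components,
                                    covariance_type="full",
                                    random_state=seed).fit(Xc)
        priors[c] = Xc.shape[0] / X_train.shape[0]
    return classes, models, priors

def predict_mda(X_test, classes, models, priors):
    """Bayes rule: pick the class maximizing log prior + log mixture density."""
    scores = np.column_stack([np.log(priors[c]) + models[c].score_samples(X_test)
                              for c in classes])
    return classes[np.argmax(scores, axis=1)]
```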

2.2.4 Conclusion
In conclusion, the RHLP, thanks to its generative formulation, is naturally tailored to the problem of modeling regime-changing time series, and is particularly useful for situations with smooth regime transitions. In addition to accurate time series approximation and segmentation, the RHLP can be used as a signal representation method in a context of time series classification with classical classification approaches such as MDA, as experimented in [J-1].

2.2.5 Multiple hidden process regression for joint segmentation of multivariate time series
It is natural to extend the RHLP model, as well as its alternatives, in particular the Markovian model and the piecewise regression model, to the segmentation of multivariate time series. Indeed, it is natural to think that the univariate components of the multivariate time series are simultaneously governed by a common hidden process, so that the segmentation problem becomes the one of recovering this hidden process. This is what we did in [J-6] and [J-7], respectively, where we focused on the extension of the two generative models, that is, the RHLP model and the hidden Markov model regression model, which are described in the two following sections. In this multivariate case, let $Y = (y_1, \ldots, y_n)$ be a time series of $n$ multidimensional observations $y_i = (y_i^{(1)}, \ldots, y_i^{(d)})^T \in \mathbb{R}^d$ observed at the time points $t = (t_1, \ldots, t_n)$. Then, the multiple regression with hidden process model is formulated as a set of several polynomial regression models (RHLP or HMMR) for univariate time series and is stated as follows:

\[
\begin{aligned}
y_i^{(1)} &= \beta_{z_i}^{(1)T} x_i + \sigma_{z_i}^{(1)} \epsilon_i\\
&\;\;\vdots\\
y_i^{(d)} &= \beta_{z_i}^{(d)T} x_i + \sigma_{z_i}^{(d)} \epsilon_i
\end{aligned}
\tag{2.12}
\]
which can be written in a matrix form as

\[
y_i = B_{z_i}^T x_i + e_i; \qquad e_i \sim \mathcal{N}(0, \Sigma_{z_i}), \quad (i = 1, \ldots, n)
\tag{2.13}
\]

where $B_k = [\beta_k^{(1)}, \ldots, \beta_k^{(d)}]$ is the $(p+1) \times d$ matrix of the multiple regression model parameters associated with the regime (class) $Z_i = k$, and $\Sigma_{z_i}$ is its corresponding $d \times d$ covariance matrix, with $x_i = (1, t_i, t_i^2, \ldots, t_i^p)^T$, where $p$ is the degree of the polynomial model associated with the class $z_i = k$. The latent process $z$ that simultaneously governs all the univariate components of the time series controls the switching from one regime to another over time and therefore allows for a joint segmentation of the time series. We investigated the case where this process is logistic, as presented in Section 2.3, as well as the alternative case in which the hidden process is assumed to be a Markov chain, as presented in Section 2.4.
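As a simple illustration of the generative equations (2.12)-(2.13), the following sketch simulates a multivariate time series from a switching multiple polynomial regression, given a (here fixed) label sequence z; all names are hypothetical and the snippet is not taken from [J-6] or [J-7], where z is latent and must be inferred.

```python
import numpy as np

def simulate_switching_multiple_regression(t, z, B_list, Sigma_list, p=2):
    """Draw y_i = B_{z_i}^T x_i + e_i, e_i ~ N(0, Sigma_{z_i}), as in (2.12)-(2.13).

    t          : (n,) time points
    z          : (n,) regime labels in {0, ..., K-1}, fixed here only for illustration
    B_list     : list of K arrays of shape (p+1, d), the regression matrices B_k
    Sigma_list : list of K (d, d) covariance matrices
    """
    rng = np.random.default_rng(0)
    t = np.asarray(t, dtype=float)
    n, d = t.shape[0], B_list[0].shape[1]
    X = np.vander(t, N=p + 1, increasing=True)     # rows x_i = (1, t_i, ..., t_i^p)
    Y = np.empty((n, d))
    for i in range(n):
        k = z[i]
        Y[i] = B_list[k].T @ X[i] + rng.multivariate_normal(np.zeros(d), Sigma_list[k])
    return X, Y
```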

2.3 Multiple hidden logistic process regression

The Multiple hidden logistic process regression (MRHLP) model proposed in [J-6] assumes that the observed multivariate time series is governed by hidden states following a logistic process and that, conditionally on each state, the observed data follow a Gaussian multiple regression model. More specifically, the proposed approach is based on a specific multiple regression model incorporating a hidden discrete logistic process

which governs the switching from one regime to another over time. The model is learned in an unsupervised context by maximizing the observed-data log-likelihood via a dedicated expectation-maximization (EM) algorithm. The proposed approach extends the regression model with a hidden logistic process (RHLP) [J-1], which is concerned with univariate time series, to the multivariate case. Note that the general model formulation may include the possibility to train the polynomial dynamics with different orders rather than assuming a common order for all the polynomials. In this way, the model offers more flexibility, allowing the capture of nonlinearities generated by the different regimes.

2.3.1 The model

Under the MRHLP model, the probability distribution of the process $z = (z_1, \ldots, z_n)$, which allows for the switching from one regression model to another, is assumed to be logistic, and is hence well adapted for capturing both abrupt and smooth regime changes thanks to the flexibility of the logistic distribution, as illustrated in [J-1]. The observation $y_i$ at each time point $t_i$ is therefore distributed according to the following conditional normal mixture density:

\[
f(y_i|t_i; \theta) = \sum_{k=1}^{K} \pi_k(t_i; w)\, \mathcal{N}\big(y_i; B_k^T x_i, \Sigma_k\big), \tag{2.14}
\]

where $\theta = (w^T, \mathrm{vec}(B_1)^T, \ldots, \mathrm{vec}(B_K)^T, \mathrm{vech}(\Sigma_1)^T, \ldots, \mathrm{vech}(\Sigma_K)^T)^T$ is the unknown parameter vector to be estimated, where vec is the vectorization operator and vech is the half-vectorization operator, which produces the lower triangular portion of the symmetric matrix it operates on.
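The density (2.14) can be sketched as follows, assuming a first-order logistic process (one intercept and one slope per component), which is one common parameterization; the names W, B_list and Sigma_list are hypothetical.

```python
import numpy as np
from scipy.stats import multivariate_normal

def logistic_proportions(t, W):
    """pi_k(t; w): multinomial-logistic weights; W has shape (K, 2) with one
    (intercept, slope) pair per component (the last row may be fixed to zero)."""
    scores = W[:, 0][None, :] + np.outer(t, W[:, 1])      # (n, K)
    scores -= scores.max(axis=1, keepdims=True)           # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def mrhlp_density(Y, t, W, B_list, Sigma_list, p=2):
    """Mixture density f(y_i | t_i; theta) of (2.14) at every time point.
    Each B in B_list has shape (p+1, d); each Sigma is (d, d)."""
    X = np.vander(np.asarray(t, float), N=p + 1, increasing=True)
    Pi = logistic_proportions(t, W)                        # (n, K)
    d = Y.shape[1]
    dens = np.zeros(Y.shape[0])
    for k, (B, Sig) in enumerate(zip(B_list, Sigma_list)):
        residuals = Y - X @ B                              # y_i - B_k^T x_i, row-wise
        dens += Pi[:, k] * multivariate_normal.pdf(residuals, mean=np.zeros(d), cov=Sig)
    return dens
```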

2.3.2 Maximum likelihood estimation via a dedicated EM
The parameter vector $\theta$ of the MRHLP model is estimated by maximizing the observed-data log-likelihood of $\theta$:
\[
\log L(\theta) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k(t_i; w)\, \mathcal{N}\big(y_i; B_k^T x_i, \Sigma_k\big) \tag{2.15}
\]
via the EM algorithm developed in [J-6], which alternates between the two following steps until convergence:

The E-Step computes the expected complete-data log-likelihood which is given by:

\[
Q(\theta, \theta^{(q)}) = \sum_{i=1}^{n}\sum_{k=1}^{K} \tau_{ik}^{(q)} \log\Big[\pi_k(t_i; w)\, \mathcal{N}\big(y_i; B_k^T x_i, \Sigma_k\big)\Big], \tag{2.16}
\]
and consists in only computing the posterior component membership probabilities given by:

\[
\tau_{ik}^{(q)} = \mathbb{P}(Z_i = k \,|\, y_i, t_i; \theta^{(q)}) = \frac{\pi_k(t_i; w^{(q)})\, \mathcal{N}\big(y_i; B_k^{(q)T} x_i, \Sigma_k^{(q)}\big)}{\sum_{\ell=1}^{K} \pi_\ell(t_i; w^{(q)})\, \mathcal{N}\big(y_i; B_\ell^{(q)T} x_i, \Sigma_\ell^{(q)}\big)}\cdot \tag{2.17}
\]

The M-Step computes the parameter vector update $\theta^{(q+1)}$ by maximizing the Q-function w.r.t. $\theta$: $\theta^{(q+1)} = \arg\max_{\theta} Q(\theta, \theta^{(q)})$. In this case, the regression parameter updates correspond to analytic solutions of weighted multiple regression problems, where the weights are the posterior probabilities $\tau_{ik}^{(q)}$, and are given by:

\[
B_k^{(q+1)} = \Big[\sum_{i=1}^{n} \tau_{ik}^{(q)} x_i x_i^T\Big]^{-1} \sum_{i=1}^{n} \tau_{ik}^{(q)} x_i y_i^T \tag{2.18}
\]
\[
\Sigma_k^{(q+1)} = \frac{1}{\sum_{i=1}^{n} \tau_{ik}^{(q)}} \sum_{i=1}^{n} \tau_{ik}^{(q)} \big(y_i - B_k^{(q+1)T} x_i\big)\big(y_i - B_k^{(q+1)T} x_i\big)^T\cdot \tag{2.19}
\]

The maximization of $Q_w(w, \theta^{(q)})$ with respect to $w$ is a multinomial logistic regression problem weighted by the $\tau_{ik}^{(q)}$, which we solve with a multi-class IRLS algorithm.
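A minimal sketch of the analytic updates (2.18)-(2.19) for a single component, given the posterior probabilities computed in the E-step, could look as follows (hypothetical variable names; no claim to match the implementation of [J-6]):

```python
import numpy as np

def m_step_regression(X, Y, tau):
    """Weighted multiple-regression updates (2.18)-(2.19) for one component k.

    X   : (n, p+1) design matrix with rows x_i
    Y   : (n, d) observations y_i
    tau : (n,) posterior probabilities tau_ik for this component
    """
    W = np.diag(tau)
    B = np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)   # (p+1, d), eq. (2.18)
    R = Y - X @ B                                    # residuals y_i - B^T x_i
    Sigma = (R.T * tau) @ R / tau.sum()              # (d, d), eq. (2.19)
    return B, Sigma
```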


The complexity of the E-step of this EM algorithm is of $O(Kn)$. The calculation of the regression coefficients in the M-step requires the computation and the inversion of a square matrix of dimension $d \times (p+1)$, and a multiplication by the observation sequence of length $n$, which has a time complexity of $O(d^3 p^3 n)$. In addition, each IRLS loop requires an inversion of the Hessian matrix, which is of dimension $2(K-1)$. The complexity of the IRLS loop is then approximately $O(I_{\mathrm{IRLS}} K^2)$, where $I_{\mathrm{IRLS}}$ is the average number of iterations required by the internal IRLS algorithm. The proposed algorithm therefore has a time complexity of $O(I_{\mathrm{EM}} I_{\mathrm{IRLS}} K^3 d^3 p^3 n)$, where $I_{\mathrm{EM}}$ is the number of iterations of the EM algorithm. This is attractive for a reasonable number of regimes and dimensions and a large number of observations. Once the model parameters are estimated by the EM algorithm, the time series segmentation can be obtained by computing the estimated label $\hat{z}_i$ of the polynomial regime generating each measurement $y_i$ according to the rule (2.11). Model selection, mainly related to choosing the optimal value of $K$, can be performed from the data by using for example the BIC, defined as $\mathrm{BIC}(K, p) = \log L(\hat{\theta}) - \frac{\nu_\theta \log(n)}{2}$, where $\nu_\theta = K\big(d(p+1) + d(d+1)/2 + 2\big) - 2$ is the number of free parameters of the model and $\log L(\hat{\theta})$ is the observed-data log-likelihood obtained at convergence of the EM algorithm.

2.3.3 Application on human activity time series
In [J-6](Trabelsi, 2013)[C-10], the MRHLP model was applied to the real-world problem of segmenting human activity time series for activity recognition in a context of assistive robotics. The experiments were conducted at the LISSI Lab/University of Paris-Est Créteil (UPEC) with six healthy subjects of different ages and consisted in collecting raw acceleration data over time from three body-worn sensor units, each unit being a tri-axial accelerometer. The 9-dimensional acceleration time series recorded over time for each activity present regime changes, each regime being associated with an activity. The MRHLP was used to jointly segment the time series in order to recover the activities in an unsupervised way. Twelve activities, including dynamical and transitory activities, have been studied. The model performance has been compared to that obtained with alternative unsupervised and supervised activity recognition approaches. The evaluation criterion is the segmentation error between the obtained segmentation and the ground truth (the data were labeled by an independent operator). The estimated probabilities of the logistic process that governs the switching from one activity to another over time correspond to a quite accurate segmentation of the acceleration time series (see an example in Figure 2.2). Moreover, the flexibility of the logistic process allows smooth probabilities to be obtained, in particular for the transitions. On the other hand, for the standard HMM, which represents a standard temporal segmentation approach, we clearly observed segmentation errors in the transitory phases and even when the person maintains the same activity. Furthermore, comparison with other classical supervised techniques, including MLP and SVM, showed that the MRHLP performs better thanks to its time-varying parameter formulation. While in some situations the KNN can provide slightly better results, it may require an important computational load for each classification (about 5 seconds for a single time series), while for the MRHLP the Bayes assignment rule is instantaneous once the model is trained. While the HMM is also a dynamic model for time series modeling, it was observed that it does not outperform the proposed MRHLP approach. Its misclassification errors can be attributed to the fact that the transitions here are similar to the problem of class overlap in multidimensional data classification. However, it was seen that, unlike for the proposed model, for the HMM approach confusions can occur even within a homogeneous part of a class that is not necessarily near a transition.

2.3.4 Conclusion
In summary, the proposed MRHLP model provides a well-established statistical latent data model with very encouraging performance for automatic segmentation of human activity time series. The model formulation makes explicit the switching from one regime to another over time through a flexible logistic process, which is particularly well adapted for abrupt or smooth transitions. Furthermore, the expectation-maximization algorithm offers a stable and efficient optimization tool to learn the model. The proposed MRHLP approach was applied to a real-world activity recognition problem based on multidimensional acceleration time series measured using body-worn accelerometers. The approach has shown very encouraging results compared to alternative models for activity recognition. Future work will also concern the use of other non-linear models to describe each activity signal rather than polynomial bases.



Figure 2.2: MRHLP segmentation results for the scenario: Standing A2 (k=1) - Sitting down A3 (k=2)- Sitting A4 (k=3) - From sitting to sitting on the ground A5 (k=4) - Sitting on the ground A6 (k=5) - Lying down A7 (k=6) - Lying A8 (k=7) with (top) the true labels, (middle) the time series and the actual segments in bold line and the estimated segments in dotted line, and (bottom) the logistic probabilities.

This may improve in particular the representation of each activity. Then, another extension may consist in integrating the model into a Bayesian non-parametric framework, which would be useful for any kind of complex activities and in which the number of activities might be selected as the learning proceeds.

2.4 Multiple hidden Markov model regression

The Multiple hidden Markov model regression (MHMMR) model is an HMM in a multiple regression setting, which allows better handling of the dynamical structure of the time series through the underlying Markov chain, as well as of the structure within each state thanks to the multiple regression model. More specifically, it is an extension of the standard hidden Markov model (HMM) (Rabiner and Juang, 1986; Rabiner, 1989) to regression analysis, as in univariate hidden Markov model regression (Fridman, 1993), particularly for regression on multivariate data. Each regime is described by a regression model rather than a simple constant mean over time, while preserving the Markov process modeling for the sequence of unknown (hidden) regimes. Indeed, standard HMM-based approaches generally use a standard Gaussian density as emission density. In the HMM regression context, however, each observation is assumed to be a noisy polynomial function, which better models very structured observations within each state. The approach we propose further extends the HMM regression model to a multiple regression setting.


2.4.1 The model
The proposed multiple HMM regression (MHMMR) model is formulated by (2.12), or in matrix form by (2.13), and assumes that the hidden sequence $z = (z_1, \ldots, z_n)$ is a homogeneous first-order Markov chain parameterized by the initial state distribution $\pi$ and the transition matrix $A$. Each regime is represented by a multiple regression model, and the switching from one regime to another is thus governed by the hidden Markov chain. The model is therefore fully parameterized by the parameter vector $\theta = (\pi^T, \mathrm{vec}(A)^T, \mathrm{vec}(B_1)^T, \ldots, \mathrm{vec}(B_K)^T, \mathrm{vech}(\Sigma_1)^T, \ldots, \mathrm{vech}(\Sigma_K)^T)^T$.

2.4.2 Maximum likelihood estimation via a dedicated EM
The parameter vector $\theta$ of the MHMMR model is estimated by maximizing the observed-data log-likelihood, which is in this case given by:

\[
\log L(\theta) = \log \sum_{z} p(z_1; \pi) \prod_{i=2}^{n} p(z_i|z_{i-1}; A) \prod_{i=1}^{n} \mathcal{N}\big(y_i; B_{z_i}^T x_i, \Sigma_{z_i}\big)\cdot \tag{2.20}
\]
Since this log-likelihood cannot be maximized directly in this incomplete-data framework (the segment labels are missing), this can be performed by using the EM algorithm (Dempster et al., 1977; McLachlan and Krishnan, 2008), known as the Baum-Welch algorithm in the HMM context (Baum et al., 1970; Rabiner and Juang, 1986; Rabiner, 1989). The EM algorithm for the MHMMR model alternates between the two following steps:

The E-step computes the conditional expectation of the complete-data log-likelihood given the observed data Y, the time points t and a current parameter estimation denoted by $\theta^{(q)}$:

\[
Q(\theta, \theta^{(q)}) = \mathbb{E}\big[\log p(Y, z|t; \theta) \,\big|\, Y, t; \theta^{(q)}\big]\cdot \tag{2.21}
\]

It can be easily shown that this step only requires the calculation of the posterior probabilities $\tau_{ik}^{(q)} = \mathbb{P}(Z_i = k|Y, t; \theta^{(q)})$ $(i = 1, \ldots, n;\ k = 1, \ldots, K)$ that $y_i$ originates from the $k$th polynomial regression component given the whole observation sequence and the current parameter estimate $\theta^{(q)}$, as well as the joint posterior probabilities of the state $k$ at time $i$ and the state $\ell$ at time $i-1$ given the whole observation sequence and $\theta^{(q)}$, that is, $\xi_{i\ell k}^{(q)} = \mathbb{P}(Z_i = k, Z_{i-1} = \ell|Y, t; \theta^{(q)})$ $(i = 2, \ldots, n;\ k, \ell = 1, \ldots, K)$. These posterior probabilities are computed by the forward-backward procedures in the same way as for a standard HMM (Rabiner and Juang, 1986; Rabiner, 1989).

The M-step updates the value of the parameter vector θ by computing θ(q+1) that maximizes the conditional expectation (2.21) with respect to θ. It can be shown that this maximization leads to the following updating rules. The updates of the parameters governing the hidden Markov chain z are the ones of a standard HMM and are given by:

\[
\pi_k^{(q+1)} = \tau_{1k}^{(q)}; \qquad A_{\ell k}^{(q+1)} = \frac{\sum_{i=2}^{n} \xi_{i\ell k}^{(q)}}{\sum_{i=2}^{n} \tau_{(i-1)\ell}^{(q)}}\cdot
\]
The updates of the regression parameters consist in analytically solving $K$ weighted multiple polynomial regressions, and the analytic update of each covariance matrix corresponds to a weighted variant of the estimation of the covariance of a multivariate Gaussian with polynomial mean. They are respectively given by (2.18) and (2.19), as in the case of the previously presented MRHLP model. The most likely sequence of activities is then estimated using the Viterbi algorithm (Viterbi, 1967), which is performed with a complexity of $O(nK^2)$.
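For completeness, a generic log-domain Viterbi recursion of this kind can be sketched as follows; the emission term is assumed to be the Gaussian regression density N(y_i; B_k^T x_i, Σ_k) evaluated beforehand, and the snippet is a standard textbook recursion rather than the code used in [J-7].

```python
import numpy as np

def viterbi(log_pi, log_A, log_emission):
    """Most likely state sequence for an HMM, computed in the log-domain.

    log_pi       : (K,) log initial-state probabilities
    log_A        : (K, K) log transitions, log_A[l, k] = log p(z_i = k | z_{i-1} = l)
    log_emission : (n, K) log N(y_i; B_k^T x_i, Sigma_k) for the MHMMR emissions
    """
    n, K = log_emission.shape
    delta = np.empty((n, K))
    psi = np.zeros((n, K), dtype=int)
    delta[0] = log_pi + log_emission[0]
    for i in range(1, n):
        scores = delta[i - 1][:, None] + log_A          # (K, K): previous state l, current k
        psi[i] = np.argmax(scores, axis=0)
        delta[i] = scores[psi[i], np.arange(K)] + log_emission[i]
    z = np.empty(n, dtype=int)
    z[-1] = np.argmax(delta[-1])
    for i in range(n - 2, -1, -1):                      # backtrack
        z[i] = psi[i + 1, z[i + 1]]
    return z
```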

2.4.3 Application on human activity time series
Similarly to the previously developed MRHLP model, the MHMMR model was applied to an activity recognition problem based on multivariate acceleration time series; see for example [J-7](Trabelsi, 2013). For the particular problem of human activity recognition, the standard HMM is used for example

in (Lin and Kulić, 2011) and was thus also applied for comparison, as well as GMMs and other supervised techniques such as MLP or SVM. Note that with the generative models, including the proposed MHMMR approach, the raw acceleration data are used directly, without a prior feature extraction step as done for example in (Altun et al., 2010), Ravi et al. (2005) and Yang and Hsu (2010) in such an application context. The MHMMR gives 91.4% as a mean correct classification rate averaged over all observations. This highlights the potential benefit of the proposed approach in terms of automatic segmentation and classification of human activity. Both the transitions and the stationary activities are well identified. It significantly outperforms the alternative standard unsupervised classification approaches (k-means, the GMM and the standard HMM, which only provide 60%, 72% and 84% correct classification, respectively). Notice that the GMM and k-means approaches are not well suited to this kind of longitudinal data. The model also outperforms the majority of the considered supervised classification approaches (random forests, SVM and MLP provide correct classification rates of 93.5%, 88.1% and 83.1%, respectively). While the k-NN might provide slightly better results, it might need substantial data storage, especially for large sequences, contrary to the classification with the MHMMR, which is direct. In addition, the k-NN is supervised and cannot be applied in an unsupervised context. In summary, compared to standard supervised classification techniques (using class labels), these results are very encouraging since the proposed approach performs in an unsupervised way. See an example in Figure 2.3.


Figure 2.3: MHMMR segmentation for the sequence (Standing A2 - Sitting down A3 - Sitting A4 - From sitting to sitting on the ground A5 - Sitting on the ground A6 - Lying down A7 - Lying A8) for the seven classes k = (1, . . . , 7).

2.4.4 Conclusion
In this section, we presented a statistical approach based on hidden Markov models in a multiple regression context for the joint segmentation of multivariate time series. The main advantage of the proposed approach comes from the fact that the statistical model explains the regime changes over time through the hidden Markov chain, each state being interpreted as a regime (a segment) whose internal structure is accommodated by a multiple regression model. The parameter estimates are computed by maximizing the log-likelihood using a dedicated EM algorithm. Application to real time series of human activities, based on raw accelerometer data acquired from body-mounted inertial sensors in a health-monitoring context, and the comparison with well-known unsupervised and supervised classification approaches, demonstrated the effectiveness of the model. This work can be extended in several directions, namely by integrating the model into a Bayesian context to better control the model complexity via suitable prior distributions on the model parameters. Then, and perhaps more interestingly, another step to explore is to build a fully Bayesian non-parametric model, which would be useful for any kind of complex activities and in which the number of activities might be inferred from the data.

Chapter 3

Latent data models for functional data analysis

Contents
3.1 Introduction ...... 21
  3.1.1 Personal contribution ...... 22
  3.1.2 Mixture modeling framework for functional data ...... 22
3.2 Mixture of piecewise regressions ...... 23
  3.2.1 The model ...... 23
  3.2.2 Maximum likelihood estimation via a dedicated EM ...... 24
  3.2.3 Maximum classification likelihood estimation via a dedicated CEM ...... 25
  3.2.4 Experiments ...... 26
  3.2.5 Conclusion ...... 28
3.3 Mixture of hidden Markov model regressions ...... 30
  3.3.1 The model ...... 30
  3.3.2 Maximum likelihood estimation via a dedicated EM ...... 31
  3.3.3 Experiments ...... 32
  3.3.4 Conclusion ...... 33
3.4 Mixture of hidden logistic process regressions ...... 34
  3.4.1 The model ...... 34
  3.4.2 Maximum likelihood estimation via a dedicated EM algorithm ...... 35
  3.4.3 Experiments ...... 36
  3.4.4 Conclusion ...... 37
3.5 Functional discriminant analysis ...... 37
  3.5.1 Functional linear discriminant analysis ...... 38
  3.5.2 Functional mixture discriminant analysis ...... 38
  3.5.3 Experiments ...... 39
  3.5.4 Conclusion ...... 40

Related journal papers:
[1] F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin. A hidden process regression model for functional data description. Application to curve discrimination. Neurocomputing, 73(7-9):1210–1221, 2010. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_neucomp_2010.pdf
[2] A. Samé, F. Chamroukhi, Gérard Govaert, and P. Aknin. Model-based clustering and segmentation of time series with changes in regime. Advances in Data Analysis and Classification, 5:301–321, 2011. URL http://chamroukhi.univ-tln.fr/papers/adac-2011.pdf
[3] F. Chamroukhi, H. Glotin, and A. Samé. Model-based functional mixture discriminant analysis with hidden process regression for curve classification. Neurocomputing, 112:153–163, 2013a. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_et_al_neucomp2013a.pdf
[4] F. Chamroukhi. Piecewise regression mixture for simultaneous curve clustering and optimal segmentation. Journal of Classification - Springer, 2015d. URL http://arxiv.org/pdf/1312.6974v2.pdf. Accepted
[5] F. Chamroukhi. Mixture of hidden Markov model regressions for functional data clustering and segmentation. Neural Networks, 2015b. In preparation
Some related conference papers:

[1] F. Chamroukhi and H. Glotin. Mixture model-based functional discriminant analysis for curve classification. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 1–8, Brisbane, Australia, June 2012a
[2] F. Chamroukhi, H. Glotin, and C. Rabouy. Functional Mixture Discriminant Analysis with hidden process regression for curve classification. In Proceedings of XXth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pages 281–286, Bruges, Belgium, April 2012a
[3] F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin. Classification automatique de données temporelles en classes ordonnées. In Actes des 44èmes Journées de Statistique, Bruxelles, Belgique, Mai 2012c

[4] F. Chamroukhi, A. Samé, P. Aknin, and G. Govaert. Model-based clustering with Hidden Markov Model regression for time series with regime changes. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 2814–2821, Jul-Aug 2011a

This research direction, which I initiated at the end of my PhD thesis in 2010 with the development of functional classification models for temporal data and have pursued, under additional modeling considerations, until now, focuses on functional data analysis (FDA), where the individuals are entire functions. This theme is structured into two sub-axes. In the first, I seek to provide solutions to the problem of unsupervised classification (clustering, segmentation) of functional data, particularly curves with regime changes. This has led mainly to three contributions: [J-4], [J-9], and [C-11][J-16]. In the second sub-axis, which concerns the supervised case, that is, discrimination of functional data, I am interested in building discriminant analyses that handle the classification of functional data that might be organized in homogeneous or heterogeneous groups and further exhibit a non-stationary behavior due to regime changes. This has led mainly to the two following contributions: [J-2][J-5].

3.1 Introduction

Most statistical analyses involve vectorial data, where the observations are finite-dimensional vectors. However, in many application domains, such as the diagnosis of complex systems [J-2][J-4], electrical engineering (Hébrail et al., 2010), speech recognition (e.g. the phoneme data studied in (Delaigle et al., 2012)), medicine, for example with the study of human growth curves (Liu and Yang, 2009), bioinformatics (Gui and Li, 2003), spectroscopy (Ferraty and Vieu, 2002), etc., the individuals are described by entire functions (i.e., curves) rather than finite-dimensional vectors. The most frequent case is that in which the studied individuals have a temporal variability. Functional representations are also present in other cases which are not necessarily temporal, for example spectroscopy, in which samples are characterized by their spectrum, a function that associates with each wavelength a measurement of interest such as the absorbance (e.g. near infrared (NIR) absorbance spectra1, see Ferraty and Vieu (2002)). This “functional” aspect of the data adds difficulties to the analysis compared to a classical multivariate (non-functional) analysis, which ignores the structure of the individuals. There is therefore a need to formulate “functional” models that explicitly integrate the functional form of the data, rather than directly and simply considering them as vectors in order to apply classical multivariate analysis methods, which is of course possible, but is subject to a strong loss of information. The paradigm of analyzing such data is known as functional data analysis (FDA) (Ramsay and Silverman, 2005, 2002; Ferraty and Vieu, 2006). The key tenet of FDA is to treat the data not just as multivariate observations but as (discretized) values of possibly smooth functions. FDA is indeed the paradigm of data analysis in which the individuals are functions (e.g., curves or surfaces) rather than vectors of reduced dimension, and the statistical approaches for FDA allow the structure of the data to be fully exploited. The goals of FDA, as in multivariate data analysis, may be exploratory, for example clustering or segmentation when the curves arise from sub-populations, or when each individual function itself presents a heterogeneity aspect, say a non-stationary temporal curve; or decisional, for example to make predictions on future data, that is, regression when predicting continuous responses or classification when predicting categorical ones. Additional background on FDA, examples and analysis techniques can be found for example in Ramsay and Silverman (2005). In this FDA field, I considered the problems of functional data clustering, segmentation and classification. The methods on which I focus here rely on generative functional regression models which are based on the finite mixture formulation with tailored component densities. Finite mixture models (McLachlan and Peel., 2000; Frühwirth-Schnatter, 2006; Titterington et al., 1985), known in multivariate analysis for their well-established theoretical background, flexibility, easy interpretation and associated efficient estimation tools in many problems, particularly in cluster and discriminant analyses, say via the EM algorithm (Dempster et al., 1977; McLachlan and Krishnan, 2008), are receiving growing attention regarding their adaptation to the framework of FDA.
See for example (Devijver, 2014; Jacques and Preda, 2014; Bouveyron and Jacques, 2011; Chamroukhi, 2010a; Liu and Yang, 2009; Gaffney and Smyth, 2004; Gaffney, 2004; James and Sugar, 2003; James and Hastie, 2001). Indeed, when the data are curves, which are in general very structured, relying on standard multivariate mixture analysis may lead to unsatisfactory results in terms of modeling and classification accuracy. The classic mixture methods for data analysis, in particular the multivariate (Gaussian) mixture model, might be used but ignore the functional structure of the data since they simply treat the individuals as vectors in Rm, which leads to rough approximations and a strong loss of information. Addressing the problem from a functional data analysis perspective, that is, formulating functional mixture models,

1For example the Tecator data available at http://lib.stat.cmu.edu/datasets/tecator.

allows the issue to be better handled, as shown for example in [J-1][J-2][J-4] (Gaffney and Smyth, 1999; Gaffney, 2004; Gaffney and Smyth, 2004; Liu and Yang, 2009). In this case of model-based curve clustering, one can distinguish the regression mixture approaches (Gaffney and Smyth, 1999; Gaffney, 2004), including polynomial regression and spline regression, or random-effects polynomial regression as in Gaffney and Smyth (2004), or (B-)spline regression as in Liu and Yang (2009). When clustering sparsely sampled curves, one may use the mixture approach based on splines in (James and Sugar, 2003). In (Devijver, 2014) and Giacofci et al. (2013), the clustering is performed by expanding the data on wavelet bases instead of (B-)spline ones. Another alternative, which concerns mixture-model based clustering of multivariate functional data, is the one in which the clustering is performed in the space of reduced functional principal components (Jacques and Preda, 2014). One can also mention the K-means based clustering for functional data using B-spline bases (Abraham et al., 2003) or wavelets as in Antoniadis et al. (2013). ARMA mixtures have also been introduced in Xiong and Yeung (2004) for time series clustering. Beyond these (semi-)parametric approaches, one can also cite non-parametric statistical methods (Ferraty and Vieu, 2003) using kernel density estimators (Delaigle et al., 2012), or those using mixtures of Gaussian process regressions (Shi et al., 2005; Shi and Wang, 2008; Shi and Choi, 2011) or hierarchical Gaussian process mixtures for regression (Shi and Choi, 2011; Shi et al., 2005). In functional data discrimination, the generative approaches related to this work are essentially based on functional linear discriminant analysis using splines, including B-splines as in James and Hastie (2001), or on mixture discriminant analysis (Hastie and Tibshirani, 1996) in the context of functional data relying on B-spline bases as in Gui and Li (2003). Delaigle et al. (2012) have also addressed the functional data discrimination problem from a non-parametric perspective using a kernel-based method.

3.1.1 Personal contribution
My personal contribution to FDA consists in studying latent data models, particularly finite mixture modeling, in the framework of functional data, and in proposing models to deal with the problems of i) functional data clustering [J-4][J-5][J-9][J-16][C-11] and ii) functional data discrimination [J-2][J-5], particularly when the data present a complex structure due to regime changes. More specifically, I proposed mixture-model based cluster and discriminant analyses based on latent processes. The heterogeneity of a population of functions arising in several sub-populations is naturally accommodated by a mixture distribution, and the dynamic behavior within each sub-population, generated by a non-stationary process typically governed by a regime change, is captured via a dedicated latent process. Here the latent process is modeled either by a Markov chain or a logistic process, or as a deterministic piecewise segmental process. I first investigated the use of a mixture model with piecewise regression components (PWRM) for simultaneous clustering and segmentation of univariate regime-changing functions [J-9]. Then, I formulated the problem from a fully generative perspective by proposing the mixture of hidden Markov model regressions (MixHMMR) [C-11][J-16] and the mixture of regressions with hidden logistic processes (MixRHLP) [J-4][J-5], which offers additional attractive features including the possibility to deal with possibly smooth dynamics within the curves. I also investigated discriminant analyses for homogeneous curves [J-2] as well as for heterogeneous curves [J-5]. For each model, I also studied and evaluated the algorithmic tools, which mainly consist of EM algorithms, on real data from different application domains.

The remainder of this chapter is organized as follows. After giving the general modeling framework in subsection 3.1.2, I derive the proposed finite mixtures for simultaneous functional data clustering and segmentation. The PWRM model is presented in Section 3.2. Then, Section 3.3 presents the MixHMMR model and Section 3.4 is dedicated to the MixRHLP. Finally, in Section 3.5, I derive the proposed functional discriminant analyses, in particular the functional mixture discriminant analysis with hidden process regression (FMDA).

3.1.2 Mixture modeling framework for functional data The considered modeling framework for the analysis is the one of the finite mixture model. The finite mixture model decomposes the density of the observed data as a convex sum of a finite number of component densities. In multivariate analysis, the most frequently used model is the finite Gaussian

mixture model (GMM), in which each mixture component is a multivariate Gaussian. In what follows, I will specify the global mixture distribution whose components are tailored to functional data modeling. Let $\mathcal{D} = ((x_1, y_1), \ldots, (x_n, y_n))$ be a set of $n$ independent individuals describing functions (e.g. curves), where each individual $(x_i, y_i)$ consists of $m_i$ observations $y_i = (y_{i1}, \ldots, y_{im_i})$ regularly observed at the independent covariates (for example the time in time series) $(x_{i1}, \ldots, x_{im_i})$ with $x_{i1} < \ldots < x_{im_i}$. The mixture model for such functional data, which will be referred to hereafter as the “functional mixture model”, assumes that the observed pairs $(x, y)$ are generated from $K$ tailored functional components, more particularly tailored regressors explaining the response $y$ by the covariate $x$, and are governed by a hidden categorical random variable $Z$ indicating from which component each function is generated. Thus, the functional mixture model can be defined as:

\[
f(y|x; \Psi) = \sum_{k=1}^{K} \alpha_k f_k(y|x; \Psi_k) \tag{3.1}
\]
where the $\alpha_k$'s, defined by $\alpha_k = \mathbb{P}(Z = k)$, are the mixing proportions such that $\alpha_k > 0$ for each $k$ and $\sum_{k=1}^{K} \alpha_k = 1$, $\Psi_k$ $(k = 1, \ldots, K)$ is the parameter vector of the $k$th component density and $\Psi = (\alpha_1, \ldots, \alpha_{K-1}, \Psi_1^T, \ldots, \Psi_K^T)^T$ is the parameter vector of the functional mixture model. In mixture modeling for FDA, the component densities $f_k(y|x)$ may be those of polynomial or (B-)spline regression, regression on wavelet bases, etc., or Gaussian process regressions. As we focus on functions arising as curves with possibly smooth regime changes, these regression mixture models, as mentioned before, do not address the problem of regime changes. As for the (hierarchical) mixture of Gaussian processes for functional regression (Shi and Choi, 2011), which might be used in this context as a non-parametric alternative, it is more tailored to approximating smooth functions and would not be suited to deal with possibly abrupt regime changes. For curves with regime changes, it only provides a non-linear smooth approximation, without accounting for the segmentation. In the models I present, the mixture component density $f_k(y|x)$ is itself assumed to exhibit a complex structure arising in sub-components, each one associated with a regime. In what follows, we investigate mainly three choices for this component-specific density: first a piecewise regression (PWR) density, then a hidden Markov model regression (HMMR) density, and finally a regression with hidden logistic process (RHLP) density.
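In generic form, the functional mixture density (3.1) and the corresponding posterior component memberships can be sketched as follows, where the component log-densities are passed as callables (a deliberate abstraction: they may be PWR, HMMR or RHLP densities); the function names are hypothetical.

```python
import numpy as np
from scipy.special import logsumexp

def functional_mixture_logdensity(y, x, alphas, component_logdensities):
    """log f(y|x; Psi) = log sum_k alpha_k f_k(y|x; Psi_k), as in (3.1).

    alphas                 : (K,) mixing proportions summing to one
    component_logdensities : list of K callables, each returning log f_k(y|x)
    """
    log_terms = np.array([np.log(a) + f(y, x)
                          for a, f in zip(alphas, component_logdensities)])
    return logsumexp(log_terms)

def posterior_memberships(y, x, alphas, component_logdensities):
    """P(Z = k | y, x; Psi) for a single curve, computed in the log-domain."""
    log_terms = np.array([np.log(a) + f(y, x)
                          for a, f in zip(alphas, component_logdensities)])
    return np.exp(log_terms - logsumexp(log_terms))
```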

3.2 Mixture of piecewise regressions

The idea described here and proposed in [J-9] is in the same spirit as the one proposed by Hugueney et al. (2009); Hébrail et al. (2010) for curve clustering and optimal segmentation based on a piecewise regression model that allows fitting several constant (or polynomial) models to each cluster of functional data with regime changes. However, unlike the distance-based approach, which uses a K-means-like algorithm (Hugueney et al., 2009; Hébrail et al., 2010), the proposed model provides a general probabilistic framework to address the problem. Indeed, in the proposed approach, the piecewise regression model is included in a mixture framework to generalize the deterministic K-means-like approach. As a result, both fuzzy clustering and hard clustering are possible. I also provide two algorithms for learning the model parameters. The first one is a dedicated EM algorithm that finds a fuzzy partition of the data and an optimal segmentation by maximizing the observed-data log-likelihood; the EM version is the natural way to obtain the maximum likelihood estimate of a mixture model, including the proposed piecewise regression mixture. The second algorithm consists in maximizing a specific classification likelihood criterion by using a dedicated CEM algorithm, in which the curves are partitioned in a hard way and optimally segmented simultaneously as the learning proceeds. The K-means-like algorithm of Hébrail et al. (2010) is shown to be a particular case of the proposed CEM algorithm when some constraints are imposed on the piecewise regression mixture.

3.2.1 The model

The piecewise regression mixture model (PWRM) assumes that each curve (xi, yi)(i = 1, . . . , n) is generated by a piecewise regression model among K models, with a prior probability αk, that is, each

component density in (3.1) is the one of a piecewise regression model, defined by:

\[
f_k(y_i|Z_i = k, x_i; \Psi_k) = \prod_{r=1}^{R_k} \prod_{j \in I_{kr}} \mathcal{N}\big(y_{ij}; \beta_{kr}^T x_{ij}, \sigma_{kr}^2\big) \tag{3.2}
\]

where $I_{kr} = (\xi_{kr}, \xi_{k,r+1}]$ represents the element indices of segment (regime) $r$ $(r = 1, \ldots, R_k)$ of component (cluster) $k$, $R_k$ being the corresponding number of segments, $\beta_{kr}$ is the vector of its polynomial coefficients and $\sigma_{kr}^2$ the associated Gaussian noise variance. Thus, the PWRM density is defined by:

\[
f(y_i|x_i; \Psi) = \sum_{k=1}^{K} \alpha_k \prod_{r=1}^{R_k} \prod_{j \in I_{kr}} \mathcal{N}\big(y_{ij}; \beta_{kr}^T x_{ij}, \sigma_{kr}^2\big), \tag{3.3}
\]

where the parameter vector is $\Psi = (\alpha_1, \ldots, \alpha_{K-1}, \theta_1^T, \ldots, \theta_K^T, \xi_1^T, \ldots, \xi_K^T)^T$, with $\theta_k = (\beta_{k1}^T, \ldots, \beta_{kR_k}^T, \sigma_{k1}^2, \ldots, \sigma_{kR_k}^2)^T$ and $\xi_k = (\xi_{k1}, \ldots, \xi_{k,R_k+1})^T$ being, respectively, the vector of all the polynomial coefficients and noise variances, and the vector of transition points which corresponds to the segmentation of cluster $k$. The proposed mixture model is therefore suitable for clustering and optimal segmentation of complex-shaped curves. More specifically, by integrating piecewise polynomial regression into the mixture framework, the resulting model is able to approximate curves issued from different groups. Furthermore, the problem of regime changes within each cluster of curves is addressed as well, thanks to the optimal segmentation provided by dynamic programming for each piecewise regression component. These two simultaneous outputs are clearly not provided by the standard generative curve clustering approaches, namely the regression mixtures. On the other hand, the PWRM is a probabilistic model and, as will be shown in the following, generalizes the deterministic K-means-like algorithm. I derived two approaches for learning the model parameters. The former is an estimation approach and consists in maximizing the observed-data likelihood via a dedicated EM algorithm. A fuzzy partition of the curves into K clusters is then obtained at convergence by maximizing the posterior component probabilities. The latter, however, focuses on the classification and optimizes a specific classification likelihood criterion through a dedicated CEM algorithm. The optimal curve segmentation is performed by dynamic programming. In the classification approach, both the curve clustering and the optimal segmentation are performed simultaneously as the CEM learning proceeds. I showed that the classification approach using the PWRM model with the CEM algorithm is the probabilistic version that generalizes the deterministic K-means-like algorithm proposed in Hébrail et al. (2010).
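As an illustration, the log of the component density (3.2) for a single curve can be computed as below, assuming the segment boundaries are given as integer indices (a simplification of the transition points ξ_kr, up to the indexing convention); all names are hypothetical.

```python
import numpy as np

def pwr_component_logdensity(y, X, boundaries, betas, sigma2s):
    """log f_k(y_i | Z_i = k, x_i; Psi_k) of (3.2) for one curve.

    y          : (m,) observations of the curve
    X          : (m, p+1) design matrix with rows x_ij
    boundaries : integer boundaries (b_0=0, ..., b_R=m); segment r covers
                 indices boundaries[r] <= j < boundaries[r+1]
    betas      : list of R polynomial coefficient vectors beta_kr, each (p+1,)
    sigma2s    : list of R noise variances sigma_kr^2
    """
    logdens = 0.0
    for r in range(len(betas)):
        idx = slice(boundaries[r], boundaries[r + 1])
        resid = y[idx] - X[idx] @ betas[r]
        logdens += np.sum(-0.5 * np.log(2 * np.pi * sigma2s[r])
                          - 0.5 * resid ** 2 / sigma2s[r])
    return logdens
```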

3.2.2 Maximum likelihood estimation via a dedicated EM
In this estimation (maximum likelihood) approach, the parameter estimation is performed by monotonically maximizing the observed-data log-likelihood

\[
\log L(\Psi) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \alpha_k \prod_{r=1}^{R_k} \prod_{j \in I_{kr}} \mathcal{N}\big(y_{ij}; \beta_{kr}^T x_{ij}, \sigma_{kr}^2\big) \tag{3.4}
\]

iteratively via the EM algorithm [J-9]. In this EM framework, the complete-data log-likelihood, which will be denoted by $\log L_c(\Psi, z)$ and which represents the log-likelihood of the parameter vector given the observed data completed by the unknown variables representing the component labels $Z = (Z_1, \ldots, Z_n)$, with $Z_i \in \{1, \ldots, K\}$ the label of the $i$th individual, is given by:

\[
\log L_c(\Psi, z) = \sum_{i=1}^{n}\sum_{k=1}^{K} Z_{ik} \log \alpha_k + \sum_{i=1}^{n}\sum_{k=1}^{K}\sum_{r=1}^{R_k}\sum_{j \in I_{kr}} Z_{ik} \log \mathcal{N}\big(y_{ij}; \beta_{kr}^T x_{ij}, \sigma_{kr}^2\big) \tag{3.5}
\]

where $Z_{ik}$ is an indicator binary-valued variable such that $Z_{ik} = 1$ iff $Z_i = k$ (i.e., if and only if the $i$th curve is generated from component $k$). The EM algorithm for the PWRM model (EM-PWRM) alternates between the two following steps until convergence (e.g., when there is no longer a significant change in the relative variation of the log-likelihood):


The E-step computes the expected complete-data log-likelihood given the observed curves $\mathcal{D} = ((x_1, y_1), \ldots, (x_n, y_n))$ and the current value of the parameter vector $\Psi^{(q)}$:

\[
Q(\Psi, \Psi^{(q)}) = \mathbb{E}\big[\log L_c(\Psi; \mathcal{D}, z)\,|\,\mathcal{D}; \Psi^{(q)}\big] = \sum_{i=1}^{n}\sum_{k=1}^{K} \tau_{ik}^{(q)} \log \alpha_k + \sum_{i=1}^{n}\sum_{k=1}^{K}\sum_{r=1}^{R_k}\sum_{j \in I_{kr}} \tau_{ik}^{(q)} \log \mathcal{N}\big(y_{ij}; \beta_{kr}^T x_{ij}, \sigma_{kr}^2\big) \tag{3.6}
\]
where
\[
\tau_{ik}^{(q)} = \mathbb{P}(Z_i = k|y_i, x_i; \Psi^{(q)}) = \frac{\alpha_k^{(q)} f_k\big(y_i|x_i; \Psi_k^{(q)}\big)}{\sum_{k'=1}^{K} \alpha_{k'}^{(q)} f_{k'}\big(y_i|x_i; \Psi_{k'}^{(q)}\big)} \tag{3.7}
\]
is the posterior probability that the curve $(x_i, y_i)$ belongs to cluster $k$. This step therefore only requires the computation of the posterior component membership probabilities $\tau_{ik}^{(q)}$ $(i = 1, \ldots, n)$ for each of the $K$ components.

The M-step computes the parameter vector update $\Psi^{(q+1)}$ by maximizing the Q-function with respect to $\Psi$, that is: $\Psi^{(q+1)} = \arg\max_{\Psi} Q(\Psi, \Psi^{(q)})$. The mixing proportions are updated as in standard mixtures and their updates are given by:
\[
\alpha_k^{(q+1)} = \frac{\sum_{i=1}^{n} \tau_{ik}^{(q)}}{n}, \quad (k = 1, \ldots, K). \tag{3.8}
\]

The maximization of the Q-function w.r.t. $\Psi_k$, that is, w.r.t. the piecewise segmentation $\{I_{kr}\}$ of component (cluster) $k$ and the corresponding piecewise regression representation through $\{\beta_{kr}, \sigma_{kr}^2\}$ $(r = 1, \ldots, R_k)$, corresponds to a weighted version of the piecewise regression problem for a set of homogeneous curves as described in [J-9], with the weights being the posterior component membership probabilities $\tau_{ik}^{(q)}$. The maximization simply consists in solving a weighted piecewise regression problem, where the optimal segmentation of each cluster $k$, represented by the parameters $\{\xi_{kr}\}$, is obtained by running a dynamic programming procedure (sketched after the assignment rule below). Finally, the regression parameters are updated as:

\[
\beta_{kr}^{(q+1)} = \Big[\sum_{i=1}^{n} \tau_{ik}^{(q)} X_{ir}^T X_{ir}\Big]^{-1} \sum_{i=1}^{n} \tau_{ik}^{(q)} X_{ir}^T y_{ir} \tag{3.9}
\]
\[
\sigma_{kr}^{2(q+1)} = \frac{1}{\sum_{i=1}^{n} \sum_{j \in I_{kr}} \tau_{ik}^{(q)}} \sum_{i=1}^{n} \tau_{ik}^{(q)} \big\|y_{ir} - X_{ir}\beta_{kr}^{(q+1)}\big\|^2 \tag{3.10}
\]
where $y_{ir}$ is the segment (regime) $r$ of the $i$th curve, that is the observations $\{y_{ij}|j \in I_{kr}\}$, and $X_{ir}$ its associated design matrix with rows $\{x_{ij}|j \in I_{kr}\}$. Thus, the proposed EM algorithm for the PWRM model provides a fuzzy partition of the curves into $K$ clusters through the posterior cluster probabilities $\tau_{ik}$, each fuzzy cluster being optimally segmented into regimes with indices $\{I_{kr}\}$. At convergence of the EM algorithm, a hard partition of the curves can then be deduced by assigning each curve to the cluster that maximizes the posterior probability (3.7), that is:

\[
\hat{z}_i = \arg\max_{1 \le k \le K} \tau_{ik}(\hat{\Psi}), \quad (i = 1, \ldots, n) \tag{3.11}
\]
where $\hat{z}_i$ denotes the estimated cluster label for the $i$th curve.
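The dynamic programming step mentioned above can be sketched, under simplifying assumptions (a naive O(R_k m^2) recursion, each curve weighted as a whole by its posterior membership), as follows; this is only illustrative of the principle and not the implementation of [J-9].

```python
import numpy as np

def weighted_segment_cost(Y, X, tau, j0, j1):
    """Weighted least-squares cost of fitting one polynomial regime to points
    j0..j1-1 of all curves; Y is (n_curves, m), X is (m, p+1), tau is (n_curves,)."""
    W = np.repeat(tau, j1 - j0)                     # one weight per stacked point
    Xs = np.tile(X[j0:j1], (Y.shape[0], 1))         # stacked segment design matrix
    ys = Y[:, j0:j1].ravel()                        # stacked segment observations
    beta = np.linalg.lstsq(Xs * np.sqrt(W)[:, None], ys * np.sqrt(W), rcond=None)[0]
    resid = ys - Xs @ beta
    return np.sum(W * resid ** 2)

def optimal_segmentation(Y, X, tau, R):
    """Dynamic programming over boundaries: minimal weighted cost of splitting
    the m time points into R contiguous segments; returns boundaries and cost."""
    m = X.shape[0]
    cost = np.full((R + 1, m + 1), np.inf)
    cut = np.zeros((R + 1, m + 1), dtype=int)
    cost[0, 0] = 0.0
    for r in range(1, R + 1):
        for j in range(r, m + 1):
            for i in range(r - 1, j):
                c = cost[r - 1, i] + weighted_segment_cost(Y, X, tau, i, j)
                if c < cost[r, j]:
                    cost[r, j], cut[r, j] = c, i
    bounds = [m]                                    # backtrack the boundaries
    for r in range(R, 0, -1):
        bounds.append(cut[r, bounds[-1]])
    return bounds[::-1], cost[R, m]
```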

3.2.3 Maximum classification likelihood estimation via a dedicated CEM
As mentioned before, in addition to the MLE approach via EM, here I present another scheme to achieve both the model estimation (including the segmentation) and the clustering. It consists in a maximum classification likelihood approach which uses the Classification EM (CEM) algorithm. The CEM algorithm (see for example (Celeux and Govaert, 1992)) is the same as the so-called classification maximum likelihood approach as described earlier in McLachlan (1982) and dates back to the work of Scott and Symons (1971). The CEM algorithm was initially proposed for model-based clustering of multivariate data. We adopt it here in order to perform model-based curve clustering with the proposed PWRM model. The resulting CEM simultaneously estimates both the PWRM parameters and the class labels by maximizing the complete-data log-likelihood (3.5) w.r.t. both the model parameters Ψ and the partition represented by the vector of cluster labels z, in an iterative manner, by alternating between the two following steps:


Step 1 updates the cluster labels for the current model defined by $\Psi^{(q)}$ by maximizing the complete-data log-likelihood (3.5) w.r.t. the cluster labels $z$, that is: $z^{(q+1)} = \arg\max_{z} \log L_c(\Psi^{(q)}, z)$. Step 2, given the estimated partition defined by $z^{(q+1)}$, updates the model parameters by maximizing (3.5) w.r.t. the PWRM parameters $\Psi$: $\Psi^{(q+1)} = \arg\max_{\Psi} \log L_c(\Psi, z^{(q+1)})$. Equivalently, the CEM algorithm consists in integrating a classification step (C-step) between the E- and M-steps of the EM algorithm presented previously. The C-step computes a hard partition of the $n$ curves into $K$ clusters by applying Bayes' optimal allocation rule (3.11). The only difference between this CEM algorithm and the previously derived EM one is that the posterior probabilities $\tau_{ik}$ in the EM-PWRM algorithm are replaced by the cluster label indicators $Z_{ik}$ in the CEM-PWRM algorithm, the curves being assigned in a hard way rather than in a soft way. By doing so, the CEM monotonically maximizes the complete-data log-likelihood (3.5). Another attractive feature of the proposed PWRM model is that, when it is estimated by the CEM algorithm, as shown in [J-9], it is equivalent to a probabilistic generalization of the K-means-like algorithm of Hébrail et al. (2010). Indeed, maximizing the complete-data log-likelihood (3.5) optimized by the proposed CEM algorithm for the PWRM model is equivalent to minimizing the following distortion criterion w.r.t. the cluster labels $z$, the segment indices $I_{kr}$ and the segment constant means $\mu_{kr}$, which is exactly the criterion optimized by the K-means-like algorithm:

\[
J\big(z, \{\mu_{kr}, I_{kr}\}\big) = \sum_{k=1}^{K}\sum_{r=1}^{R_k}\sum_{i|Z_i=k}\sum_{j \in I_{kr}} \big(y_{ij} - \mu_{kr}\big)^2
\]

if the following constraints are imposed:
• $\alpha_k = \frac{1}{K}$ for all $k$ (identical mixing proportions);
• $\sigma_{kr}^2 = \sigma^2$ for all $r = 1, \ldots, R_k$ and all $k = 1, \ldots, K$ (isotropic and homoskedastic model);
• piecewise constant approximation of each segment rather than a polynomial approximation.
The proposed CEM algorithm for the piecewise polynomial regression mixture is therefore the probabilistic version, for hard curve clustering and optimal segmentation, of the K-means-like algorithm.

Model selection
The problem of model selection here is equivalent to that of choosing the optimal number of mixture components $K$, the number of regimes $R$ and the polynomial degree $p$. The optimal value of the triplet $(K, R, p)$ can be computed by using model selection criteria such as the BIC (Schwarz, 1978), the ICL (Biernacki et al., 2000), etc. Let us recall that BIC is a penalized log-likelihood criterion which can be defined as a function to be maximized as $\mathrm{BIC}(K, R, p) = \log L(\hat{\Psi}) - \frac{\nu_\Psi \log(n)}{2}$, while ICL consists in a penalized complete-data log-likelihood and can be expressed as $\mathrm{ICL}(K, R, p) = \log L_c(\hat{\Psi}) - \frac{\nu_\Psi \log(n)}{2}$, where $\log L(\hat{\Psi})$ and $\log L_c(\hat{\Psi})$ are respectively the incomplete (observed) data log-likelihood and the complete-data log-likelihood obtained at convergence of the (C)EM algorithm, $\nu_\Psi = \sum_{k=1}^{K} R_k(p + 3) - 1$ is the number of free parameters of the model and $n$ is the sample size. The number of free model parameters includes $K - 1$ mixing proportions, $\sum_{k=1}^{K} R_k(p+1)$ polynomial coefficients, $\sum_{k=1}^{K} R_k$ noise variances and $\sum_{k=1}^{K} (R_k - 1)$ transition points.
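Counting the free parameters and computing the two criteria is straightforward; the following sketch (hypothetical names, the log-likelihoods being those returned at convergence of the (C)EM algorithm) illustrates the selection rule.

```python
import numpy as np

def pwrm_free_parameters(R_list, p):
    """nu_Psi = sum_k R_k (p + 3) - 1: mixing proportions, polynomial
    coefficients, noise variances and transition points of the PWRM."""
    return sum(R * (p + 3) for R in R_list) - 1

def bic(loglik, R_list, p, n):
    """BIC(K, R, p) for a fitted PWRM, to be maximized over candidate models."""
    return loglik - pwrm_free_parameters(R_list, p) * np.log(n) / 2.0

def icl(complete_loglik, R_list, p, n):
    """ICL(K, R, p): same penalty applied to the complete-data log-likelihood."""
    return complete_loglik - pwrm_free_parameters(R_list, p) * np.log(n) / 2.0

# Hypothetical usage: among fitted candidates, keep the configuration with the
# largest criterion value, e.g. best = max(candidates, key=lambda c: bic(*c)).
```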

3.2.4 Experiments
The performance of the PWRM with both the EM and CEM algorithms is studied in [J-9] by comparing it to the polynomial regression mixture model (PRM) (Gaffney, 2004), the standard polynomial spline regression mixture model (PSRM) (Gaffney, 2004; Gui and Li, 2003; Liu and Yang, 2009) and the piecewise regression model used with the K-means-like algorithm (Hébrail et al., 2010). I also included comparisons with standard model-based clustering of multivariate data, including the GMM. The algorithms have been evaluated in terms of curve classification and approximation accuracy. The evaluation criteria used are the classification error rate between the true partition (when it is available) and the estimated partition, and the intra-cluster inertia. In the simulation studies, in summary, for situations where all the algorithms retrieve the actual partition, which is possible for obvious partitions, in terms of curve approximation we clearly saw that, on the one hand, standard model-based clustering using the GMM is not adapted, as it does not take into account the functional structure of the curves and therefore does not account for the smoothness;

it rather computes an overall mean curve. On the other hand, the proposed probabilistic model, when trained with the EM algorithm (EM-PWRM) or with the CEM algorithm (CEM-PWRM), as well as the K-means-like one of Hébrail et al. (2010), as expected, provide quasi-identical results in terms of clustering and segmentation. This is attributed to the fact that the K-means PWRM approach is a particular case of the proposed probabilistic approach. The best curve approximations are, however, those provided by the PWRM models. The GMM mean curves are simply overall means, and the PRM and PSRM models, as they are based on continuous curve prototypes, do not account for the segmentation, unlike the PWRM models, which are well adapted to perform simultaneous curve clustering and segmentation. When we varied the noise level, for a small noise level variation the results are very similar. However, as the noise level increases, the misclassification error rate increases faster for the other models than for the proposed PWRM model. The EM and CEM algorithms for the proposed approach provide very similar results, with a slight advantage for the CEM version, which can be attributed to the fact that CEM is by construction tailored to the classification end. When the proposed PWRM approach is used, the misclassification error can be improved by 4% compared to the K-means-like approach, by about 7% compared to both the PRM and the PSRM, and by more than 15% compared to the standard multivariate GMM. In addition, when the data have non-uniform mixing proportions, the K-means based approach can fail, namely in terms of segmentation. This is attributed to the fact that the K-means-like approach for PWRM is constrained, as it assumes the same proportion for each cluster, and does not sufficiently take into account the heteroskedasticity within each cluster compared to the proposed general probabilistic PWRM model. For model selection, the ICL was used on simulated data. We remarked that, when using the proposed EM-PWRM and CEM-PWRM approaches, the actual model may be selected in up to more than 10% additional cases compared to using the K-means-like algorithm for piecewise regression. The number of regimes was underestimated in only around 10% of cases for the proposed EM and CEM algorithms, while the number of clusters was correctly estimated. However, the K-means-like approach overestimates the number of clusters in 12% of cases. These results highlight an advantage of the fully probabilistic approach over the one based on the K-means-like approach.

Application to real curves
In [J-9] the model was also applied to real curves issued from three different data sets: other railway switch curves (different from those used in the previous chapter), the Tecator curves, and the Topex/Poseidon satellite data as studied in Hébrail et al. (2010). The actual partitions for these data are unknown, so we used the intra-class inertia as well as a qualitative assessment of the results. The first studied curves are railway switch curves issued from a railway diagnosis application. Roughly, the railway switch is the component that enables (high-speed) trains to be guided from one track to another at a railway junction, and it is controlled by an electrical motor. The considered curves are the signals of the power consumed during the switch operations. These curves present several changes in regime due to the successive mechanical motions involved in each switch operation. A preliminary data preprocessing task is to automatically identify homogeneous groups (typically, curves without defect and curves with possible defect; we assumed K = 2). The used database is composed of n = 146 real curves of m = 511 observations each. The number of regression components was set to R = 6, in accordance with the number of electromechanical phases of these switch operations, and the degree of the polynomial regression p was set to 3, which is appropriate for the different regimes in the curves. The obtained results show that, for the CEM-PWRM approach, the curves of the two obtained clusters do not have the same characteristics, with quite clearly different shapes, and may correspond to two different states of the switch mechanism. According to the experts, this can be attributed to a default in the measurement process, rather than a default of the switch itself; the device used for measuring the power would have been used slightly differently for this cluster of curves. The intra-class inertia results are also significantly better compared to the standard alternatives. This confirms that the piecewise regression mixture model has an advantage in giving homogeneous and well-approximated clusters from curves with regime changes. The second data set is the Tecator data1, which consists of near infrared (NIR) absorbance spectra of 240 meat samples. The NIR spectra are recorded on a Tecator Infratec food and feed Analyzer working in the wavelength range 850−1050 nm. The full Tecator data set contains n = 240 spectra with m = 100 observations per spectrum. This data set has been considered in Hébrail et al. (2010), and in our experiment we consider the same setting, that is, the data set is summarized with six clusters (K = 6), each cluster being composed of five linear regimes (segments) (R = 5, p = 1). The retrieved clusters are informative (see

1Tecator data are available at http://lib.stat.cmu.edu/datasets/tecator.


The shapes of the clusters are clearly different, and the piecewise approximation is in concordance with the shape of each cluster. On the other hand, the obtained result is very close to the one obtained by Hébrail et al. (2010) with the K-means-like approach. This is not surprising and confirms that the proposed CEM-PWRM algorithm is a probabilistic alternative to the K-means-like approach.


Figure 3.1: Clusters and the corresponding piecewise prototypes for each cluster obtained with the CEM-PWRM algorithm for the Tecator data set.

The third data set is the Topex/Poseidon radar satellite data1, which were registered by the Topex/Poseidon satellite over an area of 25 kilometers on the Amazon River and contain n = 472 waveforms of the measured echoes, sampled at m = 70 points (number of echoes). We considered the same number of clusters (twenty) and a piecewise linear approximation with four segments per cluster as used in Hébrail et al. (2010). We note that, in our approach, we directly apply the proposed CEM-PWRM algorithm to the raw satellite data without a preprocessing step. In Hébrail et al. (2010), by contrast, the authors used a two-stage scheme: they first perform a topographic clustering step using the Self-Organizing Map (SOM), and then apply their K-means-like approach to the results of the SOM. The proposed CEM-PWRM algorithm provides for the satellite data a clearly informative clustering and segmentation which reflect the general behavior of the hidden structure of this data set (see Fig. 3.2). The structure is indeed clearer with the mean curves of the clusters (prototypes) than with the raw curves. The piecewise approximation thus helps to better understand the structure of each cluster of curves from the obtained partition, and to more easily infer the general behavior of the data set. On the other hand, the result is similar to the one found in Hébrail et al. (2010); most of the profiles are present in both results. There is a slight difference that can be attributed to the fact that the result in Hébrail et al. (2010) is obtained from a two-stage scheme which includes an additional pre-clustering step using the SOM, instead of directly applying the piecewise regression model to the raw data.

3.2.5 Conclusion Here I introduced a new probabilistic approach based on a piecewise polynomial regression mixture (PWRM) for simultaneous clustering and optimal segmentation of curves with regime changes. I provided two algorithms to learn the model. The first (EM-PWRM) uses the EM algorithm to maximize the observed-data log-likelihood, and the second (CEM-PWRM) is a CEM algorithm that maximizes

1Satellite data are available at http://www.lsp.ups-tlse.fr/staph/npfda/npfda-datasets.html.


Figure 3.2: Clusters and the corresponding piecewise prototypes for each cluster obtained with the CEM-PWRM algorithm for the satellite data set.

the complete-data log-likelihood. I showed that the CEM-PWRM algorithm is a general probabilistic version of the corresponding K-means-like algorithm. However, it is worth mentioning that if the aim is density estimation, the EM version should be preferred, since CEM provides biased estimators, although it is well tailored to the segmentation/clustering end. The obtained results demonstrated the benefit of the proposed approach in terms of both curve clustering and piecewise approximation and segmentation of the regimes of each cluster. In particular, the comparisons with the K-means-like algorithm confirm that the proposed CEM-PWRM is an interesting probabilistic alternative. We note that in some practical situations involving continuous functions, the proposed piecewise regression mixture, in its current formulation, may lead to discontinuities between segments in the piecewise approximation. This may be avoided by slightly modifying the algorithm, by adding an interpolation step as performed in Hébrail et al. (2010). We also note that in this work we are interested in piecewise regimes which do not overlap; only the clusters can overlap. However, one way to address regime overlap is to use more segments, so that a regime that overlaps (for example, one that occurs in two different time ranges) can be treated as two sub-regimes. These two reconstructed non-overlapping regimes would have very close characteristics, so as to correspond to a single overlapping regime. In terms of computing time, I mention that in some situations, especially for large sample sizes and a large number of segments, the algorithms may involve a significant computational load. However, for quite reasonable dimensions, the algorithms remain usable without significant difficulty.


3.3 Mixture of hidden Markov model regressions

The mixture of piecewise regressions presented previously can be seen as not being completely generative, since the transition points, while assumed unknown and determined automatically from the data, are not governed by a probability distribution. It nevertheless achieves the clustering and segmentation tasks, and was useful at least to show that K-means-based alternatives may be seen as particular cases of such a model, which already has a probabilistic dimension thanks to its mixture formulation. The aim now is to build a fully generative model. It is natural to think, as previously for the univariate case, that for each group the regimes governing the observed curves follow a discrete hidden process, typically a hidden Markov chain. By doing so, it is assumed that, within each cluster k, the observed curve is governed by a hidden process which enables switching from one state to another among Rk states following a homogeneous Markov chain, which leads to the mixture of hidden Markov models introduced by Smyth (1996). Two different approaches can be adopted for estimating this mixture of HMMs. The first one is the K-means-like approach for hard clustering used in Smyth (1996), in which the optimized function is the complete-data log-likelihood. The resulting clustering scheme consists of assigning sequences to clusters at each iteration and using only the sequences assigned to a cluster to re-estimate the HMM parameters related to that cluster. The second one is the soft clustering approach described in Alon et al. (2003), where the model parameters are estimated in a maximum likelihood framework by the EM algorithm. The model I propose here can be seen as an extension of the model-based clustering approach using a mixture of standard HMMs introduced by Smyth (1996), where each HMM state has a conditional Gaussian density with a simple scalar mean, by considering polynomial regressors and by performing MLE using EM rather than K-means. In addition, the use of polynomial regime modeling rather than simple constant means should indeed be more suitable for fitting the non-linear regimes governing the time series, and the EM fitting should better capture the uncertainty regarding the curve assignments thanks to the fuzzy posterior component memberships. This results in the mixture of hidden Markov model regressions (MixHMMR) [C-11][J-16].

3.3.1 The model The proposed mixture of HMM regressions (MixHMMR) assumes that each curve arises from a K-component mixture where, conditionally on each component k (k = 1,...,K), the curve is distributed according to an R_k-state hidden Markov model regression. That is, given the label Z_i = k of the component generating the ith curve, and given the state H_{ij} = r (r = 1,...,R_k), the jth observation y_{ij} (e.g., the one observed at time t_j in the case of temporal data) is generated according to a Gaussian polynomial regression model with regression coefficient vector \beta_{kr} and noise variance \sigma^2_{kr}:

y_{ij} = \beta_{kr}^T x_j + \sigma_{kr}\,\epsilon_{ij}, \qquad \epsilon_{ij} \sim \mathcal{N}(0, 1) \qquad (3.12)

where x_j is a covariate vector, the \epsilon_{ij} are independent random variables distributed according to a standard zero-mean unit-variance Gaussian distribution, and the hidden state sequence H_i = (H_{i1}, \ldots, H_{im}) for each mixture component k is assumed to be a Markov chain with initial state distribution \pi_k, with components \pi_{kr} = \mathbb{P}(H_{i1} = r | Z_i = k) (r = 1, \ldots, R_k), and transition matrix A_k whose general term is A_{k\ell r} = \mathbb{P}(H_{ij} = r | H_{i,j-1} = \ell, Z_i = k). Thus, the change from one regime to another is governed by the hidden Markov chain. Note that if the time series we aim to model consist of successive contiguous regimes, one may use a left-right model (Rabiner and Juang, 1986; Rabiner, 1989) by imposing order constraints on the hidden states via constraints on the transition probabilities. From (3.12), it follows that the response y_i for the predictor x_i, conditionally on each mixture component Z_i = k, is distributed according to an HMM regression distribution, defined by:

f_k(y_i | Z_i = k, x_i; \Psi_k) = \sum_{H_i} \mathbb{P}(H_{i1}; \pi_k) \prod_{j=2}^{m_i} \mathbb{P}(H_{ij} | H_{i,j-1}; A_k) \times \prod_{j=1}^{m_i} \mathcal{N}(y_{ij}; \beta_{k h_{ij}}^T x_j, \sigma^2_{k h_{ij}}) \qquad (3.13)

with parameter vector \Psi_k = (\pi_k^T, \mathrm{vec}(A_k)^T, \beta_{k1}^T, \ldots, \beta_{kR_k}^T, \sigma^2_{k1}, \ldots, \sigma^2_{kR_k})^T. Finally, the distribution of a curve (x_i, y_i) is defined by the following MixHMMR density:

f(y_i | x_i; \Psi) = \sum_{k=1}^{K} \alpha_k f_k(y_i | Z_i = k, x_i; \Psi_k) \qquad (3.14)


described by the parameter vector \Psi = (\alpha_1, \ldots, \alpha_{K-1}, \Psi_1^T, \ldots, \Psi_K^T)^T.
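To make the generative reading of (3.12)-(3.14) concrete, the following minimal Python sketch draws a single curve from a MixHMMR model; the parameter containers (alpha, pi, A, beta, sigma) and their shapes are illustrative assumptions of this sketch, not the notation of any reference implementation.

import numpy as np

def sample_mixhmmr_curve(alpha, pi, A, beta, sigma, X):
    """Draw one curve from a MixHMMR model (illustrative sketch only).
    alpha: (K,) mixing proportions; pi: (K, R) initial state distributions;
    A: (K, R, R) transition matrices; beta: (K, R, p+1) regression coefficients;
    sigma: (K, R) noise standard deviations; X: (m, p+1) polynomial design
    matrix (one row x_j per sampling point)."""
    m = X.shape[0]
    k = np.random.choice(len(alpha), p=alpha)                   # cluster label Z_i
    h = np.zeros(m, dtype=int)
    h[0] = np.random.choice(len(pi[k]), p=pi[k])                # initial regime H_i1
    for j in range(1, m):
        h[j] = np.random.choice(A.shape[2], p=A[k, h[j - 1]])   # Markov transition
    # Gaussian polynomial regression within the active regime, Eq. (3.12)
    y = np.array([X[j] @ beta[k, h[j]] + sigma[k, h[j]] * np.random.randn()
                  for j in range(m)])
    return y, k, h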

3.3.2 Maximum likelihood estimation via a dedicated EM The MixHMMR parameter vector \Psi is estimated by monotonically maximizing the observed-data log-likelihood

\log L(\Psi) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \alpha_k \sum_{H_i} \mathbb{P}(H_{i1}; \pi_k) \prod_{j=2}^{m_i} \mathbb{P}(H_{ij} | H_{i,j-1}; A_k) \times \prod_{j=1}^{m_i} \mathcal{N}(y_{ij}; \beta_{k h_{ij}}^T x_j, \sigma^2_{k h_{ij}}) \qquad (3.15)

by using a dedicated EM algorithm as developed in [C-11][J-16]. By introducing the two following binary indicator variables for the cluster memberships and the regime memberships within a given cluster, that is, Z_{ik} = 1 if Z_i = k (i.e., y_i belongs to cluster k) and Z_{ik} = 0 otherwise, and H_{ijr} = 1 if H_{ij} = r (i.e., the jth observation y_{ij} of the ith time series, at time t_j, belongs to regime r) and H_{ijr} = 0 otherwise, the complete-data log-likelihood of \Psi can be written as:

\log L_c(\Psi) = \sum_{k=1}^{K} \Big[ \sum_{i} Z_{ik} \log \alpha_k + \sum_{i,r} Z_{ik} H_{i1r} \log \pi_{kr} + \sum_{i} \sum_{j=2}^{m_i} \sum_{r,\ell} Z_{ik} H_{ijr} H_{i(j-1)\ell} \log A_{k\ell r} + \sum_{i,j,r} Z_{ik} H_{ijr} \log \mathcal{N}(y_{ij}; \beta_{kr}^T x_j, \sigma^2_{kr}) \Big]. \qquad (3.16)

The EM algorithm for the MixHMMR model starts from an initial parameter \Psi^{(0)} and alternates between the two following steps until convergence:

The E-Step computes the expected complete-data log-likelihood given the responses y, the covariates x and the current value of the parameter vector \Psi^{(q)}: Q(\Psi, \Psi^{(q)}) = \mathbb{E}\big[\log L_c(\Psi) \,|\, \{y, x\}; \Psi^{(q)}\big], which is given by:

Q(\Psi, \Psi^{(q)}) = \sum_{k,i} \tau_{ik}^{(q)} \log \alpha_k + \sum_{k} \Big[ \sum_{r,i} \tau_{ik}^{(q)} \Big( \gamma_{i1r}^{(q)} \log \pi_{kr} + \sum_{j=2}^{m_i} \sum_{\ell} \xi_{ij\ell r}^{(q)} \log A_{k\ell r} \Big) + \sum_{r,i,j} \tau_{ik}^{(q)} \gamma_{ijr}^{(q)} \log \mathcal{N}(y_{ij}; \beta_{kr}^T x_j, \sigma^2_{kr}) \Big] \qquad (3.17)

and therefore only requires the computation of the posterior probabilities \tau_{ik}^{(q)}, \gamma_{ijr}^{(q)} and \xi_{ij\ell r}^{(q)} defined as:

• \tau_{ik}^{(q)} = \mathbb{P}(Z_i = k | y_i, x_i; \Psi^{(q)}) is the posterior probability that the ith curve belongs to the kth mixture component;

• \gamma_{ijr}^{(q)} = \mathbb{P}(H_{ij} = r | y_i, x_i; \Psi_k^{(q)}) is the posterior probability of the rth polynomial regime in the mixture component (cluster) k;

• \xi_{ij\ell r}^{(q)} = \mathbb{P}(H_{ij} = r, H_{i(j-1)} = \ell | y_i, x_i; \Psi_k^{(q)}) is the joint posterior probability of having regime r at time t_j and regime \ell at time t_{j-1} in cluster k.

The E-step probabilities \gamma_{ijr}^{(q)} and \xi_{ij\ell r}^{(q)} for each time series y_i (i = 1, \ldots, n) are computed recursively by using the forward-backward algorithm (see [C-11]; Rabiner and Juang (1986); Rabiner (1989)). The posterior cluster probabilities \tau_{ik}^{(q)} are given by:

\tau_{ik}^{(q)} = \frac{\alpha_k^{(q)} f_k(y_i | x_i; \Psi_k^{(q)})}{\sum_{k'=1}^{K} \alpha_{k'}^{(q)} f_{k'}(y_i | x_i; \Psi_{k'}^{(q)})}, \qquad (3.18)

where the conditional distribution f_k(y_i | x_i; \Psi_k^{(q)}) is the HMM regression likelihood given by (3.13), obtained from the forward procedure as in a standard HMM.
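As a small illustration of (3.18), the following sketch computes the posterior cluster probabilities in the log domain; the matrix hmm_reg_loglik of per-cluster HMM regression log-likelihoods is assumed to have been produced by a forward recursion (a hypothetical helper, not shown here).

import numpy as np

def posterior_cluster_probs(log_alpha, hmm_reg_loglik):
    """tau[i, k] = P(Z_i = k | y_i, x_i), Eq. (3.18), computed in the log
    domain for numerical stability. log_alpha: (K,) log mixing proportions;
    hmm_reg_loglik: (n, K), entry (i, k) = log f_k(y_i | x_i; Psi_k)."""
    log_tau = log_alpha[None, :] + hmm_reg_loglik          # unnormalized log posterior
    log_tau -= log_tau.max(axis=1, keepdims=True)          # log-sum-exp trick
    tau = np.exp(log_tau)
    return tau / tau.sum(axis=1, keepdims=True)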

The M-Step computes the parameter vector update \Psi^{(q+1)} by maximizing the expected complete-data log-likelihood, that is, \Psi^{(q+1)} = \arg\max_{\Psi} Q(\Psi, \Psi^{(q)}). The maximization w.r.t. the mixing proportions is that of a standard mixture model and the updates are given by:

\alpha_k^{(q+1)} = \frac{\sum_{i=1}^{n} \tau_{ik}^{(q)}}{n} \qquad (k = 1, \ldots, K).


The maximization w.r.t. the Markov chain parameters (\pi_k, A_k) corresponds to a weighted version of the Markov chain parameter updates of a standard HMM, where the weights are the posterior component membership probabilities \tau_{ik}^{(q)}. The updates are given by:

\pi_{kr}^{(q+1)} = \frac{\sum_{i=1}^{n} \tau_{ik}^{(q)} \gamma_{i1r}^{(q)}}{\sum_{i=1}^{n} \tau_{ik}^{(q)}}, \qquad
A_{k\ell r}^{(q+1)} = \frac{\sum_{i=1}^{n} \sum_{j=2}^{m_i} \tau_{ik}^{(q)} \xi_{ij\ell r}^{(q)}}{\sum_{i=1}^{n} \sum_{j=2}^{m_i} \tau_{ik}^{(q)} \gamma_{ijr}^{(q)}} \cdot
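A possible vectorized form of these weighted Markov chain updates is sketched below; the array shapes are illustrative assumptions, and the transition matrices are obtained by row-normalizing the weighted expected transition counts, which yields valid stochastic matrices.

import numpy as np

def update_markov_chain_params(tau, gamma, xi):
    """Weighted updates of (pi_k, A_k). tau: (n, K) posterior cluster
    probabilities; gamma: (n, K, m, R) posterior regime probabilities;
    xi: (n, K, m, R, R) joint posteriors (defined for j >= 2, i.e. index 1..m-1).
    Shapes are illustrative; a real implementation would avoid storing xi in full."""
    pi_new = np.einsum('ik,ikr->kr', tau, gamma[:, :, 0, :])
    pi_new /= tau.sum(axis=0)[:, None]
    # weighted expected transition counts, then row-normalize so that each
    # row of A_k sums to one
    counts = np.einsum('ik,ikjlr->klr', tau, xi[:, :, 1:, :, :])
    A_new = counts / counts.sum(axis=-1, keepdims=True)
    return pi_new, A_new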

Finally, the maximization w.r.t. the regression parameters \beta_{kr} consists in analytically solving weighted least-squares problems, and the one w.r.t. the noise variances \sigma^2_{kr} consists in a weighted variant of the problem of estimating the variance of a univariate Gaussian density. The weights involve both the posterior cluster probabilities \tau_{ik}^{(q)} and the posterior regime probabilities \gamma_{ijr}^{(q)} for each cluster k. The parameter updates are given by:

\beta_{kr}^{(q+1)} = \Big[ \sum_{i=1}^{n} \tau_{ik}^{(q)} X_i^T W_{ikr}^{(q)} X_i \Big]^{-1} \sum_{i=1}^{n} \tau_{ik}^{(q)} X_i^T W_{ikr}^{(q)} y_i, \qquad (3.19)

\sigma_{kr}^{2\,(q+1)} = \frac{\sum_{i=1}^{n} \tau_{ik}^{(q)} \big\| \sqrt{W_{ikr}^{(q)}} (y_i - X_i \beta_{kr}^{(q+1)}) \big\|^2}{\sum_{i=1}^{n} \tau_{ik}^{(q)} \,\mathrm{trace}(W_{ikr}^{(q)})}, \qquad (3.20)

where W_{ikr}^{(q)} is an m_i \times m_i diagonal matrix whose diagonal elements are the weights \{\gamma_{ijr}^{(q)}; j = 1, \ldots, m_i\}. It can be seen that here the parameters of each regime are updated from the whole curve, weighted by the posterior regime memberships \{\gamma_{ijr}\}, while in the previously presented piecewise regression model they are only updated from the observations assigned to that regime, that is, in a hard way. This may better take into account possible uncertainty regarding whether the regime change is abrupt or not.
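The following sketch illustrates the weighted least-squares updates (3.19)-(3.20) under the simplifying assumption that all curves share the same design matrix X; it is an illustration of the update formulas, not the original implementation.

import numpy as np

def update_regression_params(X, Y, tau, gamma):
    """Weighted least-squares updates of beta_kr and sigma2_kr, Eqs. (3.19)-(3.20),
    assuming all curves share the same design matrix. X: (m, p+1); Y: (n, m);
    tau: (n, K); gamma: (n, K, m, R)."""
    n, K = tau.shape
    m, d = X.shape
    R = gamma.shape[-1]
    beta = np.zeros((K, R, d))
    sigma2 = np.zeros((K, R))
    for k in range(K):
        for r in range(R):
            w = tau[:, k][:, None] * gamma[:, k, :, r]       # (n, m) weights tau_ik * gamma_ijr
            lhs = X.T @ (w.sum(axis=0)[:, None] * X)         # sum_i X^T W_ikr X
            rhs = X.T @ (w * Y).sum(axis=0)                  # sum_i X^T W_ikr y_i
            beta[k, r] = np.linalg.solve(lhs, rhs)
            resid = Y - X @ beta[k, r]
            sigma2[k, r] = (w * resid**2).sum() / w.sum()    # weighted residual variance
    return beta, sigma2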

Complexity of the algorithm The proposed EM algorithm includes forward-backward procedures at the E-step to compute the joint posterior probabilities of the HMM states and the conditional distribution (the HMM likelihood) for each time series. The complexity of the forward-backward procedure is that of a standard R-state HMM applied to n univariate curves of size m; this step therefore costs O(R^2 n m) per iteration. In addition, in this regression context, the calculation of the regression coefficients for each regime and each cluster in the M-step requires the inversion of a (p + 1) \times (p + 1) matrix and n multiplications associated with each curve of length m, which is done with a complexity of O(p^3 n m R K). The proposed EM algorithm therefore has a time complexity of O(I_{EM} K R^2 p^3 n m), where I_{EM} is the number of EM iterations and K the number of clusters.

Curve approximation, segmentation and model selection Once the model parameters are estimated, a mean curve can be estimated for each cluster by relying on the so-called smoothed signal in the context of HMMs, that is, the curve weighted by the posterior regime membership probabilities. An approximate cluster "centroid" can then be computed as a weighted empirical mean of its smoothed curves, as in [C-11]. For the segmentation, the most likely sequence of hidden states can be inferred, given the observations and the estimated model, by using the Viterbi decoder (Viterbi, 1967), which is performed with a complexity of O(m R^2). The model selection task consists in estimating the optimal values of the triplet (K, R, p). This can be performed by maximizing, for example, the BIC (Schwarz, 1978) (which was used here), defined by BIC(K, R, p) = \log L(\hat{\Psi}) - \frac{\nu(K, R, p)}{2} \log(n), where \hat{\Psi} is the maximum likelihood estimate of the parameter vector \Psi provided by the EM algorithm and \nu(K, R, p) = KR(R + p + 2) - 1 is the number of free parameters of the MixHMMR model. For a left-right model, the number of free parameters reduces to \nu(K, R, p) = KR(\frac{R+1}{2} + p + 2) - 1.
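For illustration, the BIC of a fitted MixHMMR model can be computed as follows, using the parameter counts given above; the function name and arguments are hypothetical.

import numpy as np

def bic_mixhmmr(loglik, K, R, p, n, left_right=False):
    """BIC(K, R, p) = log L(Psi_hat) - nu(K, R, p)/2 * log(n), with the
    parameter counts given in the text. Sketch only; the selected triplet is
    the one maximizing this value over a grid of fitted candidate models."""
    if left_right:
        nu = K * R * ((R + 1) / 2.0 + p + 2) - 1
    else:
        nu = K * R * (R + p + 2) - 1
    return loglik - nu / 2.0 * np.log(n)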

3.3.3 Experiments The performance of the developed MixHMMR model was studied in [C-11][J-7] by comparing it to the regression mixture model, the standard mixture of HMMs, as well as two standard multidimensional data clustering algorithms: the EM for Gaussian mixtures and K-means.


Simulation results The evaluation criteria used in the simulations are the misclassification error rate between the true simulated partition and the estimated partition, and the intra-cluster inertia. From the obtained results, it was clearly observed that the proposed approach provides more accurate classification results and smaller intra-class inertias than the considered alternatives. For example, the MixHMMR yields a clustering error 3% lower than the standard mixture of HMMs, the most competitive model, and more than 10% lower than standard multivariate clustering alternatives. Applying the MixHMMR to cluster time series with regime changes also provided accurate results in terms of clustering and approximation of each cluster of time series. This is attributed to the fact that the proposed MixHMMR model, thanks to its flexible generative formulation, better addresses both the heterogeneity of the time series, through the mixture formulation, and the dynamical behavior within each homogeneous set of time series, through the underlying Markov chain. It was also observed that the standard EM for GMM and standard K-means are not well suited to this kind of functional data.

Clustering the real time series of switch operations The model was also applied in [C-11][J-16] to a real problem of clustering time series from a railway diagnosis application. The aim is to discover non-normal time series from a diagnosis perspective. The data set contains 115 curves, each of them resulting from an electromechanical process with R = 6 phases. We used the model with cubic polynomials (which was enough to approximate each regime) and applied it with K = 2 clusters in order to separate the curves that would correspond to a defective operating state from those corresponding to a normal operating state. Since the true class labels are unknown, we only considered the intra-class inertias as well as a graphical inspection of the obtained partitions and of each cluster approximation. The algorithm provided a partition of curves in which the cluster shapes are clearly different (see Figure 3.3) and might correspond to two different states of the switch mechanism. According to the experts, one cluster could correspond to a fault in the measurement process. These results are also in concordance with those obtained by the previously introduced piecewise regression mixture model; the partitions are quasi-identical.


Figure 3.3: Clustering results for switch operation time series obtained with the MixHMMR model.

3.3.4 Conclusion The introduced mixture of polynomial regression models governed by hidden Markov chains is particularly appropriate for clustering curves with various changes in regime and relies on a suitable generative formulation. The experimental results demonstrated the benefit of the proposed approach compared to existing alternative methods, including the regression mixture model and the standard mixture of hidden Markov models. It also represents a fully generative alternative to the previously described mixture of piecewise regressions. Also, while the model in its current version only concerns univariate time series, I think that its extension to the multivariate case could be done without major effort. Similarly, while only the EM version is derived here, its extension to a CEM variant, for example to favor classification rather than density estimation, is straightforward.


Also, while the model is fully generative, one disadvantage is that, since each hidden regime sequence is a Markov chain, the regime residence time is geometrically distributed, which is not well adapted to long-duration regimes, as might be the case for the regimes of the analyzed functional data. However, I notice that this issue is more pronounced for the standard mixture of HMMs. In the proposed MixHMMR model, the fact that the conditional distributions rely on polynomial regressors contributes to stabilizing this effect by providing well-structured regimes, even when they are active over a long time period. For modeling different state length distributions, one might also use a non-homogeneous Markov chain as in (Diebold et al., 1994; Hughes et al., 1999), that is, a Markov chain with time-dependent transition probabilities. The model proposed in the next section addresses the problem by using a logistic process rather than a Markov chain, which, as seen in the previous chapter, provides more flexible modeling.

3.4 Mixture of hidden logistic process regressions

We saw in Section 3.2 that a first natural idea to cluster and segment complex functional data arising as curves with regime changes is to use piecewise regression integrated into a mixture formulation. This model, however, does not define a probability distribution over the change points and in practice may be time-consuming, especially for large time series. A first fully generative alternative is to use mixtures of HMMs or, better adapted to structured regimes in time series, the proposed mixture of HMM regressions seen in the previous section. However, if we look at how the quality of the regime changes is handled, particularly regarding their smoothness, it appears that the piecewise approach handles only abrupt changes, while for the HMM-based approach, although the posterior regime probabilities can be seen as a fuzzy partition of the regimes and hence in some sense accommodate smoothness, there is no explicit formulation regarding the nature of the transition points and the smoothness of the resulting estimated functions. On the other hand, the regime residence time is necessarily geometrically distributed in these HMM-based models, which means that a transition may occur even within structured observations of the same regime. This is what we observed in some of the results in Section 2.4 when applying the HMM models, especially the standard HMM. However, using polynomial regressors for the state conditional density is a fairly effective way to stabilize this behavior. The modeling can nevertheless be further improved by adopting a process that explicitly takes into account the smoothness of transitions in the process governing the regime changes. Here, we attempt to achieve this by using a logistic process rather than a Markov process. The resulting model is a mixture of regressions with hidden logistic processes (MixRHLP) [J-4][J-5].

3.4.1 The model In the proposed mixture of regression models with hidden logistic processes (MixRHLP) [J-4][J-5], each of the functional mixture components (3.1) is an RHLP [J-1][J-2]. That is, as seen in Chapter 2, the conditional distribution of a curve is defined by an RHLP:

f(y_i | Z_i = k, x_i; \Psi_k) = \prod_{j=1}^{m_i} \sum_{r=1}^{R_k} \pi_{kr}(x_j; w_k)\, \mathcal{N}(y_{ij}; \beta_{kr}^T x_j, \sigma^2_{kr}) \qquad (3.21)

whose parameter vector is \Psi_k = (w_k^T, \beta_{k1}^T, \ldots, \beta_{kR_k}^T, \sigma^2_{k1}, \ldots, \sigma^2_{kR_k})^T and where the distribution of the discrete variable H_{ij} governing the hidden regimes is assumed to be logistic, that is, in this segmental case,

\pi_{kr}(x_j; w_k) = \mathbb{P}(H_{ij} = r | Z_i = k, x_j; w_k) = \frac{\exp(w_{kr0} + w_{kr1} x_j)}{\sum_{r'=1}^{R_k} \exp(w_{kr'0} + w_{kr'1} x_j)}, \qquad (3.22)

whose parameter vector is w_k = (w_{k1}^T, \ldots, w_{k,R_k-1}^T)^T, where w_{kr} = (w_{kr0}, w_{kr1})^T is the 2-dimensional coefficient vector of the rth logistic component, w_{kR_k} being the null vector. This choice is due to the flexibility of the logistic function in both determining the regime transition points and accurately modeling abrupt and/or smooth regime changes. Indeed, as shown in [J-1][J-2], the parameters (w_{kr0}, w_{kr1}) of the logistic function (3.22) control the regime transition points and the quality of the regime transitions (smooth or abrupt). Note that here we used a linear logistic function for contiguous regime segmentation.
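For illustration, the time-dependent logistic proportions (3.22) for one cluster can be computed with a standard stable softmax, as in the following sketch (the array shapes are assumptions of this sketch):

import numpy as np

def logistic_process_probs(x, w):
    """pi_kr(x_j; w_k) of Eq. (3.22) for one cluster. x: (m,) inputs (e.g. time);
    w: (R-1, 2) rows (w_r0, w_r1); the last regime has a null coefficient vector."""
    m, R = x.shape[0], w.shape[0] + 1
    logits = np.zeros((m, R))                               # last column stays at 0
    logits[:, :R - 1] = w[:, 0] + np.outer(x, w[:, 1])      # w_r0 + w_r1 * x_j
    logits -= logits.max(axis=1, keepdims=True)             # numerically stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)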


The resulting distribution of a curve is given by the following MixRHLP density:

f(y_i | x_i; \Psi) = \sum_{k=1}^{K} \alpha_k \prod_{j=1}^{m_i} \sum_{r=1}^{R_k} \pi_{kr}(x_j; w_k)\, \mathcal{N}(y_{ij}; \beta_{kr}^T x_j, \sigma^2_{kr}) \qquad (3.23)

with parameter vector \Psi = (\alpha_1, \ldots, \alpha_{K-1}, \Psi_1^T, \ldots, \Psi_K^T)^T. Notice that the key difference between the proposed MixRHLP and the standard regression mixture model is that the proposed model uses a generative hidden process regression model (RHLP) for each component rather than polynomial or spline components; the RHLP is itself based on a dynamic mixture formulation. Thus, the proposed approach is better adapted to accommodating the regime changes within curves over time.

3.4.2 Maximum likelihood estimation via a dedicated EM algorithm The unknown parameter vector Ψ is estimated from an independent sample of unlabeled curves D = ((x1, y1),..., (xn, yn)) by monotonically maximizing the following observed-data log-likelihood

\log L(\Psi) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \alpha_k \prod_{j=1}^{m_i} \sum_{r=1}^{R_k} \pi_{kr}(x_j; w_k)\, \mathcal{N}(y_{ij}; \beta_{kr}^T x_j, \sigma^2_{kr})

via a dedicated EM algorithm. The EM scheme requires the definition of the complete-data log-likelihood. The complete-data log-likelihood for the proposed MixRHLP model, given the observed data D, the hidden component labels Z, and the hidden process {H_k} of each of the K components, is given by:

\log L_c(\Psi) = \sum_{i=1}^{n} \sum_{k=1}^{K} Z_{ik} \log \alpha_k + \sum_{i=1}^{n} \sum_{j=1}^{m_i} \sum_{k=1}^{K} \sum_{r=1}^{R_k} Z_{ik} H_{ijr} \log \big[ \pi_{kr}(x_j; w_k)\, \mathcal{N}(y_{ij}; \beta_{kr}^T x_j, \sigma^2_{kr}) \big]. \qquad (3.24)

The EM algorithm for the MixRHLP model (EM-MixRHLP) starts with an initial parameter Ψ (0) and alternates between the two following steps until convergence:

The E-step computes the expected complete-data log-likelihood, given the observations D and the current parameter estimate \Psi^{(q)}, which is given by:

Q(\Psi, \Psi^{(q)}) = \mathbb{E}\big[ \log L_c(\Psi) \,|\, D; \Psi^{(q)} \big]

= \sum_{i=1}^{n} \sum_{k=1}^{K} \tau_{ik}^{(q)} \log \alpha_k + \sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{j=1}^{m_i} \sum_{r=1}^{R_k} \tau_{ik}^{(q)} \gamma_{ijr}^{(q)} \log \big[ \pi_{kr}(x_j; w_k)\, \mathcal{N}(y_{ij}; \beta_{kr}^T x_j, \sigma^2_{kr}) \big] \cdot \qquad (3.25)

As shown in the expression of Q(\Psi, \Psi^{(q)}), this step simply requires the calculation of the posterior component probabilities, that is, the probability that the ith observed curve originates from component k, which is given by applying Bayes' theorem:

\tau_{ik}^{(q)} = \mathbb{P}(Z_i = k | y_i, x_i; \Psi^{(q)}) = \frac{\alpha_k^{(q)} f(y_i | Z_i = k, x_i; \Psi_k^{(q)})}{\sum_{k'=1}^{K} \alpha_{k'}^{(q)} f(y_i | Z_i = k', x_i; \Psi_{k'}^{(q)})}, \qquad (3.26)

where the conditional densities are given by (3.21), and of the posterior regime probabilities given a mixture component, that is, the probability that the observation y_{ij}, for example at time x_j in a temporal context, originates from the rth regime of component k, which is given by applying Bayes' theorem:

\gamma_{ijr}^{(q)} = \mathbb{P}(H_{ij} = r | Z_i = k, y_{ij}, x_j; \Psi_k^{(q)}) = \frac{\pi_{kr}(x_j; w_k^{(q)})\, \mathcal{N}(y_{ij}; \beta_{kr}^{(q)T} x_j, \sigma_{kr}^{2(q)})}{\sum_{r'=1}^{R_k} \pi_{kr'}(x_j; w_k^{(q)})\, \mathcal{N}(y_{ij}; \beta_{kr'}^{(q)T} x_j, \sigma_{kr'}^{2(q)})} \cdot \qquad (3.27)

It can be seen that here the posterior regime probabilities are computed directly, without the need of a forward-backward recursion as in the Markovian model.
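The E-step of the MixRHLP can thus be written compactly; the sketch below computes (3.26) and (3.27) under the simplifying assumptions, made only for this sketch, of a common design matrix X whose second column carries the raw inputs used by the logistic process, and of densities that do not underflow.

import numpy as np
from scipy.stats import norm

def estep_mixrhlp(Y, X, alpha, w, beta, sigma2):
    """Posterior cluster probabilities tau, Eq. (3.26), and regime probabilities
    gamma, Eq. (3.27). Y: (n, m) curves; X: (m, p+1) common design matrix whose
    second column holds the raw inputs x_j; alpha: (K,); w: (K, R-1, 2);
    beta: (K, R, p+1); sigma2: (K, R)."""
    n, m = Y.shape
    K, R, _ = beta.shape
    log_tau = np.zeros((n, K))
    gamma = np.zeros((n, K, m, R))
    for k in range(K):
        logits = np.zeros((m, R))
        logits[:, :R - 1] = w[k][:, 0] + np.outer(X[:, 1], w[k][:, 1])
        pik = np.exp(logits - logits.max(axis=1, keepdims=True))
        pik /= pik.sum(axis=1, keepdims=True)               # pi_kr(x_j; w_k), (m, R)
        mu = X @ beta[k].T                                  # regime means, (m, R)
        dens = norm.pdf(Y[:, :, None], mu[None], np.sqrt(sigma2[k]))  # (n, m, R)
        weighted = pik[None] * dens
        gamma[:, k] = weighted / weighted.sum(axis=2, keepdims=True)  # Eq. (3.27)
        log_tau[:, k] = np.log(alpha[k]) + np.log(weighted.sum(axis=2)).sum(axis=1)
    log_tau -= log_tau.max(axis=1, keepdims=True)
    tau = np.exp(log_tau)
    return tau / tau.sum(axis=1, keepdims=True), gamma      # Eq. (3.26)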


The M-step updates the value of the parameter vector \Psi by maximizing the Q-function (3.25) w.r.t. \Psi, that is, \Psi^{(q+1)} = \arg\max_{\Psi} Q(\Psi, \Psi^{(q)}). The mixing proportion updates are given, as in the case of standard mixtures, by:

\alpha_k^{(q+1)} = \frac{1}{n} \sum_{i=1}^{n} \tau_{ik}^{(q)}, \qquad (k = 1, \ldots, K). \qquad (3.28)

The maximization w.r.t. the regression parameters consists in separate analytic solutions of weighted least-squares problems, where the weights are the product of the posterior probability \tau_{ik}^{(q)} of component k and the posterior probability \gamma_{ijr}^{(q)} of regime r. Thus, the updating formulas for the regression coefficients and the variances are respectively given by (3.19) and (3.20). These updates are indeed the same as those of the MixHMMR model; the only difference is that the posterior cluster and regime memberships are calculated differently, because of the different modeling of the hidden categorical variable H representing the regime: it follows a Markov chain for the MixHMMR model and a logistic process for the MixRHLP model. Finally, the maximization w.r.t. the logistic process parameters \{w_k\} consists in solving multinomial logistic regression problems weighted by the posterior probabilities \tau_{ik}^{(q)} \gamma_{ijr}^{(q)}, which we solve with a multi-class IRLS algorithm (see for example [C-14] for more details on IRLS). The parameter update w_k^{(q+1)} is then taken at convergence of the IRLS algorithm.

Algorithmic complexity The algorithmic complexity of the proposed EM algorithm depends on the computational costs of the E- and M-steps. The complexity of the E-step is O(KRnmp), which mainly comprises the calculation of the logistic probabilities \pi_{kr} and the normal densities \mathcal{N}(y_{ij}; \beta_{kr}^T x_j, \sigma^2_{kr}) for all k, r, i, j. For each k and r, the regression coefficient update requires the computation and inversion of a (p + 1) \times (p + 1) matrix, which can be done in O(nmp^3), and the variance update is computed in O(nmp). Each iteration of the IRLS algorithm requires, in the case of contiguous segmentation, a 2(R - 1) \times 2(R - 1) Hessian matrix to be computed and inverted, which is done in O(R^3 nm). From the computation costs of the regression coefficients, the variances and the logistic function coefficients, it can be deduced that the M-step has complexity O(KRnmp^3). Consequently, the computational complexity of the proposed EM algorithm is O(I_{EM} I_{IRLS} K R^3 n m p^3), where I_{EM} is the number of iterations of the EM algorithm and I_{IRLS} is the maximum number of iterations of the inner IRLS loops. Compared to other clustering and segmentation algorithms, such as the K-means type algorithm based on piecewise polynomial regression (Hébrail et al., 2010), whose complexity is O(I_{KM} K R n m^2 p^3) where I_{KM} is the number of iterations of the algorithm, our EM algorithm is computationally attractive for large values of m and small values of R.

Curve approximation, segmentation and model selection Concerning the curve approximation, each cluster k is summarized by a single "mean" curve, which we denote by \hat{y}_k. Each point \hat{y}_{kj} of this mean curve is defined by the conditional expectation \hat{y}_{kj} = \mathbb{E}[y_{ij} | Z_i = k, x_j; \hat{\Psi}_k], given by \hat{y}_{kj} = \sum_{r=1}^{R_k} \pi_{kr}(x_j; \hat{w}_k)\, \hat{\beta}_{kr}^T x_j, which is a sum of polynomials weighted by the logistic probabilities \pi_{kr} that model the regime variability over time, and which constitutes a smooth and flexible approximation. The number of mixture components K, the number of regimes R_k and the polynomial degree p can be estimated by maximizing some information criterion such as the BIC. The number of free parameters of the MixRHLP model is \nu_\Psi = K - 1 + \sum_{k=1}^{K} \nu_{\Psi_k}, where \nu_{\Psi_k} = (p + 4)R_k - 2 is the number of free parameters of each RHLP component.
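As an illustration of this prototype computation, the following sketch evaluates the smooth cluster mean curve as the logistic-weighted sum of the R polynomial regressors (same illustrative conventions and shape assumptions as the previous sketches):

import numpy as np

def cluster_mean_curve(X, x, w_k, beta_k):
    """Smooth prototype y_hat_kj = sum_r pi_kr(x_j; w_k) * beta_kr^T x_j.
    X: (m, p+1) polynomial design matrix; x: (m,) inputs of the logistic
    process; w_k: (R-1, 2); beta_k: (R, p+1)."""
    m, R = X.shape[0], beta_k.shape[0]
    logits = np.zeros((m, R))
    logits[:, :R - 1] = w_k[:, 0] + np.outer(x, w_k[:, 1])
    pik = np.exp(logits - logits.max(axis=1, keepdims=True))
    pik /= pik.sum(axis=1, keepdims=True)
    return (pik * (X @ beta_k.T)).sum(axis=1)               # logistic-weighted polynomials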

3.4.3 Experiments

In [J-4], the clustering accuracy of the proposed algorithm was evaluated using experiments carried out on simulated time series and on real-world time series from a railway application. The obtained results are compared with those provided by the standard mixture of regressions and by the K-means-like clustering approach based on piecewise regression (Hébrail et al., 2010). To measure the clustering accuracy, two criteria were used: the misclassification percentage between the true partition and the estimated partition, and the intra-cluster inertia.


Simulation results The results in terms of misclassification error and intra-cluster inertia show that the proposed EM-MixRHLP algorithm outperforms the EM algorithm for regression mixtures. Although the misclassification percentages of the two approaches are close in some situations, particularly for a small noise variance, the intra-cluster inertias differ by about 10^4. The misclassification error of the regression mixture EM algorithm also increases more rapidly with the noise variance level than that of the proposed EM-MixRHLP approach. When the noise variance increases, the intra-cluster inertia obtained by the two approaches naturally increases, but the increase is less pronounced for the proposed approach than for the regression mixture alternative. In addition, the obtained results showed that, as expected and contrary to the proposed model, the regression mixture model cannot accurately model time series which are subject to changes in regime. For model selection using the BIC, the overall performance of the proposed algorithm is better than that of the regression mixture EM algorithm and of the K-means-like approach.

Experiments using real railway time series We used 140 time series from a railway diagnosis application. The specificity of the time series to be analyzed in this context, as mentioned before, is that they are subject to various changes in regime as a result of the mechanical movements involved in a switching operation. We accomplished this clustering task using our EM-MixRHLP algorithm, designed for estimating the parameters of a mixture of hidden process regression models. We compared the proposed EM algorithm to the regression mixture EM algorithm and to the K-means-like algorithm for piecewise regression. The obtained results show that the proposed approach provides the smallest intra-cluster inertia and misclassification rate.

3.4.4 Conclusion In this section I presented a new mixture model-based approach for clustering and segmentation of univariate functional data with changes in regime. This approach involves modeling each cluster using a particular regression model whose polynomial coefficients vary across the range of the inputs, typically over time, according to a discrete hidden process. The transition between regimes is smoothly controlled by logistic functions. The model parameters are estimated by the maximum likelihood method via a dedicated EM algorithm. The proposed approach can also be regarded as a clustering approach which operates by finding groups of time series having common changes in regime. The BIC is used to determine the numbers of clusters and segments, as well as the regression order. We note that computing the BIC to select the triplet (K, R, p) can be more expensive than in classical model selection, for example for standard mixtures where only the number of clusters has to be selected. However, for small values of the dimensions to be selected, the computational cost is of the order of a few minutes and is not prohibitive, compared to approaches involving dynamic programming, for example piecewise regression, especially for large samples. The experimental results, both on simulated time series and on a real-world application, show that the proposed approach is an efficient means of clustering univariate time series with changes in regime. A CEM derivation of the current version is direct and obvious: it consists in assigning the curves in a hard way during the EM iterations, rather than in a soft way via the posterior cluster memberships as is done now. One can further extend this to the regimes, by also assigning the observations to the regimes in a hard way, especially in the case where there are only abrupt change points, in order to promote the segmentation. Then, in the framework of model selection, for such an extension as well as for the current version of the model, it would be interesting to derive an ICL-type criterion (Biernacki et al., 2000), which is known to be suited to the clustering and segmentation objectives.

3.5 Functional discriminant analysis

The previous sections were dedicated to cluster analysis of functional data, where the aim was to explore a functional data set to automatically determine groupings of individuals described only by functional observations, that is, where the group labels indicating from which group each individual arises are unknown. Here, I investigate the problem of prediction for functional data, specifically that of predicting the group label C_i of a new observed unlabeled individual (x_i, y_i) describing a function, based on a training set of triplets of labeled individuals D = ((x_1, y_1, c_1), \ldots, (x_n, y_n, c_n)), where c_i \in \{1, \ldots, G\}

is the class label of the ith individual. I focused on probabilistic discriminant analysis, in which the discrimination task consists in estimating the class conditional density f(y_i | C_i, x_i; \Psi_g) and the prior class probabilities \mathbb{P}(C_i) from the training set, and in predicting the class label C_i of a new observation (x_i, y_i) by using Bayes' optimal allocation rule:

\hat{c}_i = \arg\max_{1 \le g \le G} \frac{w_g f(y_i | C_i = g, x_i; \Psi_g)}{\sum_{g'=1}^{G} w_{g'} f(y_i | C_i = g', x_i; \Psi_{g'})}, \qquad (3.29)

where w_g = \mathbb{P}(C_i = g) is the proportion of group g in the training set and \Psi_g is the parameter vector of its conditional density. As in discriminant analysis for multivariate data, in this functional data discrimination context one can rely on discriminant analyses based on dedicated conditional densities accounting for the functional aspect of the data. Two different ways are possible to accomplish the discrimination task, depending on how this conditional density is modeled.
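For illustration, once the class-conditional log-densities are available, the allocation rule (3.29) reduces to an argmax over unnormalized log-posteriors, as in the following sketch (function and argument names are hypothetical):

import numpy as np

def bayes_allocation(log_class_cond, w):
    """Assign each curve to the class maximizing the posterior of Eq. (3.29).
    log_class_cond: (n, G), entry (i, g) = log f(y_i | C_i = g, x_i; Psi_g);
    w: (G,) class proportions estimated from the training set."""
    log_post = np.log(w)[None, :] + log_class_cond          # unnormalized log posteriors
    return log_post.argmax(axis=1)                          # the denominator is common to all g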

3.5.1 Functional linear discriminant analysis The first one consists in functional linear discriminant analysis (FLDA), first proposed in James and Hastie (2001) for irregularly sampled curves, and arises when each class conditional density is modeled with a single-component model, for example a polynomial, spline or B-spline regression model, that is, in (3.29), f(y_i | C_i = g, x_i; \Psi_g) = \mathcal{N}(y_i; X_i \beta_g, \sigma^2_g I_m), where X_i is the design matrix of the chosen regression type and \Psi_g = (\beta_g^T, \sigma^2_g)^T is the parameter vector of class g. However, for curves with regime changes, these models are not adapted. In [J-2], I proposed an FLDA with hidden process regression, in which each class is modeled with the regression model with a hidden logistic process (RHLP) (as presented in Section 2.2.1), and which accounts for regime changes through the tailored class-specific density given by:

f(y_i | C_i = g, x_i; \Psi_g) = \prod_{j=1}^{m_i} \sum_{r=1}^{R_g} \pi_{gr}(x_j; w_g)\, \mathcal{N}(y_{ij}; \beta_{gr}^T x_j, \sigma^2_{gr}) \qquad (3.30)

where \Psi_g = (w_g^T, \beta_{g1}^T, \ldots, \beta_{gR_g}^T, \sigma^2_{g1}, \ldots, \sigma^2_{gR_g})^T is the parameter vector of class g. In this FLDA context, the estimation of each class itself involves an unsupervised task regarding the hidden regimes, which is performed by the EM algorithm as described in [J-2]. However, the FLDA approaches are adapted to homogeneous classes of curves and are not suited to dealing with dispersed classes, that is, when each class is itself composed of several sub-populations of curves.

3.5.2 Functional mixture discriminant analysis A more flexible way, in such a context of heterogeneous classes of functions, is to rely on the idea of mixture discriminant analysis (MDA) for dispersed groups, introduced by Hastie and Tibshirani (1996) for multivariate data discrimination. Indeed, while the global discrimination task is supervised, in some situations it may include an unsupervised task, which in general consists in clustering possibly dispersed classes into homogeneous sub-classes. In many application areas of classification, a class may itself be composed of several unknown (unobserved) sub-classes. For example, in handwritten digit recognition, there are several characteristic ways to write a digit, and therefore several sub-classes within the class of a digit itself, which may be modeled using a mixture density as in Hastie and Tibshirani (1996). In complex system diagnosis applications, for example when we have to decide between two classes, say with or without defect, one has only the class labels indicating whether or not there is a defect, but no labels describing how a defect occurs, namely the type of defect, the degree of defect, etc. Another example is that of gene function classification based on time-course gene expression data. As stated in Gui and Li (2003), when considering the complexity of gene functions, one functional class may include genes involving one or more biological profiles. Describing each class as a combination of sub-classes is therefore necessary to provide a realistic class representation, rather than a rough representation through a simple class conditional density. Here I consider the classification of functional data, particularly curves with regime changes, into classes arising as sub-populations. It is therefore assumed that each class g (g = 1, \ldots, G) has a complex shape arising from K_g homogeneous sub-classes. Furthermore, each sub-class k (k = 1, \ldots, K_g) of class g is itself governed by R_{gk} unknown regimes (r = 1, \ldots, R_{gk}). In such a context, the global discrimination task includes a two-level unsupervised task.


The first attempts to automatically cluster possibly dispersed classes into several homogeneous clusters (i.e., sub-classes), and the second aims at automatically determining the regime locations within each sub-class, which is a segmentation task. A first idea on functional mixture discriminant analysis, motivated by the complexity of time-course gene expression functional data, was proposed by Gui and Li (2003) and is based on B-spline regression mixtures. However, using polynomial or spline regressions for class representation, as studied for example in [J-2], is more adapted to smooth and stationary curves. In the case of curves exhibiting a dynamical behavior through abrupt changes, one may relax the spline regularity constraints, which leads to the previously developed piecewise regression mixture (PWRM) model (see Section 3.2). Thus, in such a context the generative functional mixture models presented previously can also be used as class conditional densities, that is, the MixHMMR and the MixRHLP presented respectively in Sections 3.3 and 3.4. Here I only focus on the use of the mixture of RHLP, since it is also dedicated to clustering, is flexible, and explicitly integrates smooth and/or abrupt regime changes via the logistic process. This leads to functional mixture discriminant analysis (FMDA) with hidden logistic process regression [J-5][C-8][C-10], in which the class conditional density of a function is given by a MixRHLP (3.23):

f(y_i | C_i = g, x_i; \Psi_g) = \sum_{k=1}^{K_g} \mathbb{P}(Z_i = k | C_i = g)\, f(y_i | C_i = g, Z_i = k, x_i; \Psi_{gk})

= \sum_{k=1}^{K_g} \alpha_{gk} \prod_{j=1}^{m} \sum_{r=1}^{R_{gk}} \pi_{gkr}(x_j; w_{gk})\, \mathcal{N}(y_{ij}; \beta_{gkr}^T x_j, \sigma^2_{gkr}) \qquad (3.31)

where \Psi_g = (\alpha_{g1}, \ldots, \alpha_{gK_g}, \Psi_{g1}^T, \ldots, \Psi_{gK_g}^T)^T is the parameter vector of class g, \alpha_{gk} = \mathbb{P}(Z_i = k | C_i = g) is the proportion of component k of the mixture for group g, and \Psi_{gk} is the parameter vector of its RHLP component density. Then, once an estimate \hat{\Psi}_g of the parameter vector of the MixRHLP functional mixture density is available for each class (provided by the EM algorithm described in the previous section), a new curve (x_i, y_i) is assigned to the class maximizing the posterior probability, that is, by Bayes' optimal allocation rule, using Equation (3.29).
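As an illustration, the class-conditional MixRHLP log-density (3.31) can be evaluated for a set of curves as sketched below (same illustrative conventions and shape assumptions as in the previous sketches); its output can be passed directly to the allocation rule shown earlier.

import numpy as np
from scipy.stats import norm

def log_density_mixrhlp_class(Y, X, x, alpha_g, w_g, beta_g, sigma2_g):
    """log f(y_i | C_i = g, x_i; Psi_g) under the MixRHLP class-conditional
    density of Eq. (3.31). Y: (n, m); X: (m, p+1); x: (m,) logistic inputs;
    alpha_g: (Kg,); w_g: (Kg, R-1, 2); beta_g: (Kg, R, p+1); sigma2_g: (Kg, R)."""
    n, m = Y.shape
    Kg, R, _ = beta_g.shape
    log_comp = np.zeros((n, Kg))
    for k in range(Kg):
        logits = np.zeros((m, R))
        logits[:, :R - 1] = w_g[k][:, 0] + np.outer(x, w_g[k][:, 1])
        pik = np.exp(logits - logits.max(axis=1, keepdims=True))
        pik /= pik.sum(axis=1, keepdims=True)
        mu = X @ beta_g[k].T                                # regime means, (m, R)
        dens = norm.pdf(Y[:, :, None], mu[None], np.sqrt(sigma2_g[k]))
        log_comp[:, k] = np.log(alpha_g[k]) + np.log((pik[None] * dens).sum(axis=2)).sum(axis=1)
    mx = log_comp.max(axis=1, keepdims=True)                # log-sum-exp over the Kg sub-classes
    return (mx + np.log(np.exp(log_comp - mx).sum(axis=1, keepdims=True))).ravel()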

3.5.3 Experiments The proposed FMDA approach was evaluated in [J-5] on simulated data and on real-world data from a railway diagnosis application. We performed comparisons with alternative functional discriminant analysis approaches using a polynomial regression (FLDA-PR) or a spline regression (FLDA-SR) model (James and Hastie, 2001), and with the FLDA approach that uses a single RHLP model per class (FLDA-RHLP), as in [J-2]. I also performed comparisons with alternative FMDA approaches that use polynomial regression mixtures (FMDA-PRM) and spline regression mixtures (FMDA-SRM), as in Gui and Li (2003). Two evaluation criteria were used: the misclassification error rate computed by a 5-fold cross-validation procedure, which evaluates the discrimination performance, and the mean squared error between the observed curves and the estimated mean curves, which is equivalent to the intra-class inertia and evaluates the performance of the approaches with respect to curve modeling and approximation.

Simulation results The obtained results show that the proposed FMDA-MixRHLP approach accurately decomposes complex-shaped classes into homogeneous sub-classes of curves and accounts for the underlying hidden regimes of each sub-class. Furthermore, the flexibility of the logistic process used to model the hidden regimes allows both abrupt and smooth regime changes within each sub-class to be accurately approximated. We also notice that the FLDA approaches with spline or polynomial regression provide rough approximations in the case of non-smooth regime changes, compared to the alternatives. The FLDA with RHLP accounts better for the regime changes; however, not surprisingly, for complex classes having sub-classes, it provides unsatisfactory results. This is confirmed by the obtained intra-class inertia results. Indeed, the smallest intra-class inertia is obtained for the proposed FMDA-MixRHLP approach, which outperforms the alternative FMDA approaches based on polynomial regression mixtures (FMDA-PRM) and spline regression mixtures (FMDA-SRM). This performance is attributed to the flexibility of the MixRHLP model, thanks to the logistic process, which is well adapted to modeling the regime changes. Also, in terms of curve classification, the FMDA approaches provide better results than the FLDA approaches. This is due to the fact that using a single model for complex-shaped classes (i.e., when using


FLDA approaches) is not adapted, as it does not take into account the class dispersion when modeling the class conditional density. On the other hand, the proposed FMDA-MixRHLP approach provides a better modeling, which results in a more accurate class prediction. The percentage of selecting the best model is also satisfactorily high (around 91%).

Experiments on real data The data used here come from a railway diagnosis application as studied in [J-1][J-2][J-4]. They are the curves of the instantaneous electrical power consumed during the switch actuation period. The database is composed of 120 labeled real switch operation curves, each consisting of 564 points. Two classes were considered: the first is composed of the curves with no defect and those with a minor defect, and the second contains the curves with a critical defect. The goal is therefore to provide an accurate automatic modeling, especially for the first class, which is thus dispersed into two sub-classes. The logistic process probabilities are close to 1 when the corresponding polynomial regime is the best fit for the curves, and vary over time according to the smoothness of the regime transitions. Figure 3.4 shows the modeling results provided by the proposed FMDA-MixRHLP for each of the two classes. We see that the proposed method ensures both an accurate decomposition of the complex-shaped class into sub-classes and, at the same time, a good approximation of the underlying regimes within each homogeneous set of curves. This also illustrates the clustering and segmentation using the MixRHLP presented in the previous section. While the obtained classification results were similar for the FMDA approaches, the difference in terms of curve modeling (approximation) is significant, and the proposed FMDA-MixRHLP approach clearly outperforms the alternatives. This is attributed to the fact that the use of polynomial regression mixtures (FMDA-PRM) or spline regression mixtures (FMDA-SRM) does not fit the regime changes as well as the proposed model. The proposed approach provides the best results, but also has more parameters to estimate than the alternatives. Note that, for these real data, in terms of the computational effort required to train each of the compared methods, the FLDA approaches are faster than the FMDA ones. In FLDA, both the polynomial regression and the spline regression approaches are analytic and do not require a numerical optimization scheme. The FLDA-RHLP is based on an EM algorithm, which therefore operates iteratively, but the learning scheme is quite fast: the computing time averages around one minute for the described real data, which is less demanding than the alternative piecewise regression using dynamic programming. On the other hand, the alternative FMDA approaches, that is, the polynomial regression mixture and spline regression mixture-based approaches, are even faster, their EM algorithm requiring only a few seconds to converge. However, these approaches are clearly not adapted to the regime change problem; to address it, one needs to build a piecewise regression-based model, which requires dynamic programming, may therefore need a significant computational time especially for large curves, and is mainly adapted to abrupt regime changes. The training procedure for the proposed FMDA-MixRHLP approach is not excessively time-consuming: training on the data of class 1 (the more complex class) requires a mean computational time of up to a few minutes in Matlab on a laptop with a 2 GHz CPU and 8 GB of memory.

3.5.4 Conclusion The presented mixture model-based approach for functional data discrimination includes unsupervised tasks, namely clustering dispersed classes and determining the possible underlying unknown regimes within each sub-class. It is therefore suggested for the classification of curves organized in sub-groups and presenting a non-stationary behaviour arising from regime changes. Furthermore, the proposed functional discriminant analysis approach, as it uses a hidden logistic process regression model for each class, is particularly adapted to modeling abrupt and smooth regime changes. Each class is trained in an unsupervised way by a dedicated EM algorithm, and model selection using the BIC can be suggested as it provides satisfactory results. Another possible direction is to train the MixRHLP of each class by maximizing the classification likelihood criterion, in which case one is mainly interested in classification, rather than maximizing a likelihood criterion as in the present approach, where the focus is on model estimation. This is direct and would rely on the CEM algorithm (Celeux and Govaert, 1992). In such a context, model selection relying on the ICL (Biernacki et al., 2000) could also be used.



Figure 3.4: Results obtained with the proposed FMDA-MixRHLP for the real switch operation curves. The estimated clusters (sub-classes) for class 1 and the corresponding mean curves (a); then, we show separately each sub-class of class 1 with the estimated mean curve presented as a bold line (c,d), the polynomial regressors (degree p = 3) (f,g) and the corresponding logistic proportions that govern the hidden processes (i,j). Similarly, for class 2, we show the estimated mean curve as a bold line (b), the polynomial regressors (e) and the corresponding logistic proportions (h).


Chapter 4

Bayesian regularization of mixtures for functional data

Contents
4.1 Introduction ...... 45
4.1.1 Personal contribution ...... 46
4.1.2 Regression mixtures ...... 46
4.2 Regularized regression mixtures for functional data ...... 47
4.2.1 Introduction ...... 47
4.2.2 Regularized maximum likelihood estimation via a robust EM-like algorithm ...... 49
4.2.3 Experiments ...... 51
4.2.4 Conclusion ...... 53
4.3 Bayesian mixtures of spatial spline regressions ...... 54
4.3.1 Bayesian inference by Markov Chain Monte Carlo (MCMC) sampling ...... 54
4.3.2 Mixtures of spatial spline regressions with mixed-effects ...... 55
4.3.3 Bayesian spatial spline regression with mixed-effects ...... 57
4.3.4 Bayesian mixture of spatial spline regressions with mixed-effects ...... 59
4.3.5 Experiments ...... 61
4.3.6 Conclusion ...... 62

Related journal papers:
[1] F. Chamroukhi. Unsupervised learning of regression mixture models with unknown number of components. Journal of Statistical Computation and Simulation, 2015e. doi: 10.1080/00949655.2015.1109096. URL http://chamroukhi.univ-tln.fr/papers/Chamroukhi-JSCS-2015.pdf. Published online: 05 Nov 2015.
[2] F. Chamroukhi. Bayesian mixtures of spatial spline regressions. arXiv:1508.00635, 2015a. URL http://arxiv.org/pdf/1508.00635.pdf

Related conference papers:
[1] F. Chamroukhi. Robust EM algorithm for model-based curve clustering. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 1–8, Dallas, Texas, August 2013

[2] F. Chamroukhi. Model-based cluster and discriminant analysis for functional data. ERCIM 2014 : The 7th International Conference of the European Research Consortium for Informatics and Mathematics on Computational and Methodological Statistics, December 2014a. Pisa, Italy

This research direction, which I initiated in 2013, is two-fold. First, I attempt to learn regression mixture models from univariate functional data in a fully unsupervised way, so as to provide an alternative to standard EM fitting, which relies on external information criteria for model selection. I am therefore interested in constructing what can be seen as a non-parametric approach that simultaneously learns the model structure, characterized by the number of mixture components, the model density, and the data partition. This is performed by regularizing the standard MLE of the regression mixtures and has led to the contribution [J-8][C-5]. On the other hand, I investigate regression mixtures with mixed effects; the angle of approach and the type of data are different, though, compared to the previously studied mixtures with ML fitting, since here I work fully within the Bayesian inference framework by using Markov Chain Monte Carlo sampling. In addition, I consider mixture models dedicated to spatial functional data. The pre-publication [J-11] results from this work.

4.1 Introduction

In the previous chapter, I investigated the problem of functional data analysis and proposed latent data models, particularly functional mixture models, for such analyses, which involve the construction of maximum likelihood estimators. Among the previously discussed models are the regression mixtures and their use in model-based cluster and discriminant analyses of functional data. The maximum likelihood estimation of the mixture density is mainly performed by using the EM algorithm, thanks to its desirable properties of stability and reliable convergence. In this chapter, I focus on regression mixtures and their use in model-based functional data clustering, particularly for univariate smooth functions and for spatial functional data (2D surfaces). First, I revisit these mixture models and their estimation from another perspective, by considering regularized MLE rather than standard MLE. This particularly attempts to address, from the regularization point of view, the issue of ML fitting with EM, which requires careful initialization, and the issue of model selection. Indeed, it is well known that the initialization is crucial for EM. The EM algorithm also requires the number of mixture components to be given a priori. The problem of selecting the number of mixture components can in this case be addressed by using, in a subsequent step, model selection criteria (e.g. AIC, BIC, and ICL, as seen before) to choose one model from a set of pre-estimated candidates. Here I propose a penalized MLE approach carried out via a robust EM-like algorithm, which simultaneously infers the model parameters, the model structure and the partition [J-8][C-5], and whose initialization is simple.

On the other hand, the regression models seen until now have been constructed by relying on deterministic parameters which account for fixed effects that model the mean behavior of a population of homogeneous curves. However, in some situations, it is necessary to take into account possible random effects governing the inter-individual behavior. This is in general achieved by random effects regression or mixed effects regression (?), that is, a regression model accounting for fixed effects, to which a random effects part is added. In a model-based clustering context, this is achieved by deriving mixtures of these models, for example the mixture of linear mixed models (Celeux et al., 2005). Despite the growing body of work on adapting multivariate mixtures to the framework of FDA, for example as in (Devijver, 2014; Jacques and Preda, 2014; Bouveyron and Jacques, 2011; Chamroukhi, 2010a; Liu and Yang, 2009; Gaffney and Smyth, 2004; Gaffney, 2004; James and Sugar, 2003; James and Hastie, 2001), the most investigated types of data are, however, univariate or multivariate functions. The problem of learning from spatial functional data, that is, surfaces, is still less well studied. One can cite, for example, the following fairly recent approaches on the subject (Malfait and Ramsay, 2003; Ramsay et al., 2011; Sangalli et al., 2013; Nguyen et al., 2014). In particular, the very recent approach proposed by Nguyen et al. (2014) for clustering and classification of surfaces is based on spatial spline regression as in Sangalli et al. (2013), in a mixture of linear mixed-effects models framework as in Celeux et al. (2005). The model estimation tool is the usual maximum likelihood estimation (MLE) using the EM algorithm (Dempster et al., 1977; McLachlan and Krishnan, 2008).
While MLE via the EM algorithm is the standard way to fit finite mixture-based models, a common alternative is Bayesian inference, that is, maximum a posteriori (MAP) estimation. It is promoted to avoid singularities and degeneracies of the MLE, as highlighted notably in Stephens (1997); Snoussi and Mohammad-Djafari (2001, 2005); Fraley and Raftery (2005) and Fraley and Raftery (2007), by regularizing the estimation through a prior distribution over the model parameters.


The MAP estimator is in general constructed by using Markov Chain Monte Carlo (MCMC) sampling, such as the Gibbs sampler (e.g., see Neal (1993); Raftery and Lewis (1992); Bensmail et al. (1997); Marin et al. (2005); Robert and Casella (2011)). For the Bayesian analysis of regression data, Lenk and DeSarbo (2000) introduced Bayesian inference for finite mixtures of generalized linear models with random effects. In their mixture model, each component is a regression model with a random-effects part, and the model is dedicated to multivariate regression data. In the second axis of this part of my research, I first introduce the Bayesian spatial spline regression with mixed-effects (BSSR) for fitting a population of homogeneous surfaces. Then, I introduce the Bayesian mixture of SSR (BMSSR) for fitting populations of heterogeneous surfaces organized in groups. The BSSR model is applied to surface approximation and the BMSSR model is applied to model-based surface clustering by considering real-world handwritten digits from the MNIST data set (LeCun et al., 1998).

4.1.1 Personal contribution

My personal contribution in this research theme is two-fold. First, I proposed in [J-8][C-5][C-1] a new fully unsupervised learning algorithm to fit regression mixture models with an unknown number of components. The developed approach consists in a penalized maximum likelihood estimation carried out by a robust EM-like algorithm, which I derived for polynomial, spline, and B-spline regression mixtures. i) It simultaneously infers the model parameters and the optimal number of regression mixture components from the data as the learning proceeds, rather than in a two-stage scheme as in standard model-based clustering relying on subsequent model selection criteria, and ii) its initialization is simple, unlike the standard EM for regression mixtures which requires careful initialization. I validated the proposed algorithm on simulations, and the results obtained on real-world data covering three different application areas, namely phoneme recognition, clustering gene expression time-course data for bio-informatics, and clustering radar waveform data, confirm its benefit for practical applications. Second, in [J-11], I investigated the problem of regression models with mixed effects and their use in FDA, particularly in model-based clustering of spatial functional data. I first introduced a Bayesian spatial spline regression model with mixed-effects (BSSR) for modeling spatial functional data. The BSSR model is based on Nodal basis functions for spatial regression and accommodates both a common mean behavior for the data, through a fixed-effects part, and inter-individual variability, thanks to a random-effects part. Then, in order to model populations of spatial functional data arising from heterogeneous groups, I introduced a Bayesian mixture of spatial spline regressions with mixed-effects (BMSSR), used for density estimation and model-based surface clustering. The models, through their Bayesian formulation, make it possible to integrate possible prior knowledge on the data structure and constitute a good alternative to the recent mixture of spatial spline regressions model estimated in a maximum likelihood framework via the EM algorithm. I derived MCMC sampling techniques to infer each of the two models and applied them to simulated surfaces and to a real problem of handwritten digit recognition using the MNIST data set. The obtained results highlight the potential benefit of the proposed Bayesian approaches for modeling surfaces, possibly organized in clusters.

The remainder of the chapter is organized as follows. After giving a brief background on regression mixtures and their use in model-based curve clustering in Section 4.1.2, I present in Section 4.2 the proposed regularization of the mixture model and the fully unsupervised EM-like algorithm for fitting the resulting model. An experimental study is performed on numerous simulations and real-world data sets to apply and assess the proposed approach. Then, Section 4.3.2 describes recent related work on mixtures of spatial spline regressions, together with the formulation necessary to derive the proposed BSSR model, which I present in Section 4.3.3 along with its inference technique using Gibbs sampling. Then, in Section 4.3.4, I present the Bayesian mixture model for spatial functional data, that is, the BMSSR model, and show how to apply it in model-based clustering of surfaces. A Gibbs sampler is derived to estimate the BMSSR model parameters.

4.1.2 Regression mixtures

Modeling with regression mixtures is an important topic in the general family of mixture models. The finite regression mixture model (Quandt, 1972; Quandt and Ramsey, 1978; Veaux, 1989; Jones and McLachlan, 1992; Gaffney and Smyth, 1999; Viele and Tong, 2002; Faria and Soromenho, 2010; Chamroukhi, 2010a; Young and Hunter, 2010; Hunter and Young, 2012) provides a way to model data arising from a number of unknown classes of conditionally dependent observed data. Let us denote by $\mathcal{D} = ((x_1, y_1), \ldots, (x_n, y_n))$ an observed independently and identically distributed (i.i.d) sample where each individual is a couple of a response $y_i$ and its corresponding covariate $x_i$. For example, in the case of temporal curves, the response consists of $m_i$ observations $y_i = (y_{i1}, \ldots, y_{im_i})$ (regularly) observed at the inputs $x_i = (x_{i1}, \ldots, x_{im_i})$ for all $i = 1, \ldots, n$ (e.g., $x$ may represent the sampling time in a temporal context). The finite regression mixture model assumes that each individual $(x_i, y_i)$ is drawn from a mixture density of $K$ (possibly unknown) components, whose mixing proportions are $(\pi_1, \ldots, \pi_K)$, where $\pi_k = \mathbb{P}(Z_i = k)$ is the prior probability of component $k$, $Z_i \in \{1, \ldots, K\}$ being the hidden variable representing the class label of the $i$th individual. A common way to model the conditional dependence in the observed data is to use regression functions. The regression mixture model assumes that each mixture component $k$ is a conditional component density $f_k(y_i|x_i; \theta_k)$ of a regression model with parameters $\theta_k$. This includes polynomial, spline, and B-spline regression mixtures, see for example DeSarbo and Cron (1988); Jones and McLachlan (1992); Gaffney (2004). These three models are considered here and the global Gaussian regression mixture is defined by the following conditional mixture density:

$$f(y_i|x_i; \theta) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y_i; X_i \beta_k, \sigma_k^2 I_{m_i}) \qquad (4.1)$$

where $X_i$ is the regression matrix constructed according to the chosen basis for the model (polynomial, spline, etc.), $\beta_k$ is the vector of regression coefficients for component $k$, and $\sigma_k^2$ is the noise variance, $I_{m_i}$ denoting the $m_i \times m_i$ identity matrix. The construction of the regression matrix depends on the chosen type of regression: it may be a Vandermonde matrix for polynomial regression, a spline regression matrix for splines (Deboor, 1978; Ruppert and Carroll, 2003), which are widely used for function approximation based on constrained piecewise polynomials, or a B-spline regression matrix, which allows for more efficient computations compared to splines (Ruppert and Carroll, 2003). The regression mixture model parameter vector is given by $\theta = (\pi_1, \ldots, \pi_{K-1}, \theta_1^T, \ldots, \theta_K^T)^T$ where $\theta_k = (\beta_k^T, \sigma_k^2)^T$ represents the parameter vector of component $k$, composed of the regression coefficients vector and the noise variance. The use of regression mixtures for density estimation as well as for cluster and discriminant analyses requires the estimation of the mixture parameters. The problem of fitting regression mixture models is widely studied in statistics, machine learning and data analysis, particularly for cluster analysis. It is usually performed by maximizing the observed-data log-likelihood

$$\log L(\theta) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y_i; X_i \beta_k, \sigma_k^2 I_{m_i}) \qquad (4.2)$$

by using the EM algorithm (Jones and McLachlan, 1992; Dempster et al., 1977; Gaffney and Smyth, 1999; Gaffney, 2004; McLachlan and Krishnan, 2008).
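To make the estimation target concrete, the following minimal Python sketch evaluates the observed-data log-likelihood (4.2) for a polynomial regression mixture; the function names, the common sampling grid shared by all curves, and the toy data are illustrative assumptions rather than part of the original formulation.

```python
import numpy as np
from scipy.stats import norm

def poly_design(x, degree):
    """Polynomial (Vandermonde) regression matrix X_i for one curve."""
    return np.vander(x, degree + 1, increasing=True)

def regression_mixture_loglik(X_list, y_list, pi, betas, sigma2):
    """Observed-data log-likelihood of Eq. (4.2) for a Gaussian regression mixture.

    pi: (K,) mixing proportions; betas: (K, p) regression coefficients;
    sigma2: (K,) noise variances. Uses the log-sum-exp trick for stability.
    """
    loglik = 0.0
    for X, y in zip(X_list, y_list):
        # log pi_k + log N(y; X beta_k, sigma_k^2 I) for each component k
        comp = np.array([
            np.log(pi[k]) + norm.logpdf(y, loc=X @ betas[k],
                                        scale=np.sqrt(sigma2[k])).sum()
            for k in range(len(pi))
        ])
        c = comp.max()
        loglik += c + np.log(np.exp(comp - c).sum())
    return loglik

# toy usage with two noisy linear clusters (hypothetical data)
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
X = poly_design(x, 1)
y_list = [X @ np.array([0.0, 1.0]) + 0.1 * rng.standard_normal(50) for _ in range(5)] + \
         [X @ np.array([1.0, -1.0]) + 0.1 * rng.standard_normal(50) for _ in range(5)]
X_list = [X] * 10
print(regression_mixture_loglik(X_list, y_list, np.array([0.5, 0.5]),
                                np.array([[0.0, 1.0], [1.0, -1.0]]),
                                np.array([0.01, 0.01])))
```

The log-sum-exp trick avoids numerical underflow when the per-curve component densities become very small, which happens quickly for long curves.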

4.2 Regularized regression mixtures for functional data

4.2.1 Introduction

It is however well-known that the initialization is crucial for EM. If the initialization is not appropriately performed, the EM algorithm may lead to unsatisfactory results, see for example Biernacki et al. (2003); Reddy et al. (2008); Yang et al. (2012). Thus, regression mixture models trained with the standard EM algorithm are sensitive to initialization, which might yield poor estimates if the regression mixture parameters are not initialized properly. The EM initialization can in general be performed from a randomly chosen partition of the data, from a partition computed by another clustering algorithm such as K-means, Classification EM (Celeux and Diebolt, 1985), Stochastic EM (Celeux and Govaert, 1992), etc., or by initializing EM with a few iterations of EM itself. Several works have been proposed in the literature to overcome this drawback and make the EM algorithm for Gaussian mixtures robust with regard to initialization, see for example Biernacki et al. (2003); Reddy et al. (2008); Yang et al. (2012). Further details about choosing starting values for the EM algorithm for Gaussian mixtures can be found for example in Biernacki et al. (2003). In addition to the sensitivity to initialization, the EM algorithm requires the number of mixture components (clusters in a clustering context) to be known.

While the number of components can be chosen by model selection criteria such as the BIC (Schwarz, 1978), the AIC (Akaike, 1974) or the ICL (Biernacki et al., 2000), or by resampling methods such as the bootstrap (McLachlan, 1978), this requires performing a subsequent model selection procedure to choose one model from a set of pre-estimated candidates. Some authors have considered the issue of estimating the unknown number of mixture components in Gaussian mixture models, for example by an adapted EM as in Figueiredo and Jain (2000) and Yang et al. (2012), or from a Bayesian perspective by reversible jump MCMC (Richardson and Green, 1997). In general, however, these two issues have been considered separately. Among the approaches that address, within the same algorithm, both the robustness with regard to initial values and the estimation of the number of mixture components, one can cite the EM algorithm proposed by Figueiredo and Jain (2000). This EM algorithm is capable of selecting the number of components and attempts to be insensitive to initial values by optimizing a minimum message length (MML) criterion, which is a penalized log-likelihood, rather than the observed-data log-likelihood. It starts by fitting a mixture model with a large number of clusters and discards invalid clusters as the learning proceeds. The degree of validity of each cluster is measured through the penalization term, which involves the mixing proportions, so that clusters that become too small are discarded, thereby reducing the number of clusters. More recently, Yang et al. (2012) developed a robust EM-like algorithm for model-based clustering of multivariate data using Gaussian mixture models that simultaneously addresses the problem of initialization and that of estimating the number of mixture components. This algorithm overcomes some initialization drawbacks of the EM algorithm proposed in Figueiredo and Jain (2000). As shown in Yang et al. (2012), this initialization problem can become more serious especially for data sets with a large number of clusters. However, these model-based clustering approaches, including those of Yang et al. (2012) and Figueiredo and Jain (2000), are concerned with vectorial data where the observations are assumed to be vectors of reduced dimension; they are not adapted when the data are curves or functions. Indeed, functional data, in which the individuals are curves or surfaces, are in general very structured, and approaches relying on standard multivariate mixture analysis may therefore lead to unsatisfactory results in terms of modeling and classification accuracy, since the structure of the individuals is ignored [J-1][J-2][J-4][J-5]. Addressing the problem from a functional data analysis perspective, that is, formulating "functional" mixture models, makes it possible to overcome these limitations, e.g., as in [J-1][J-2][J-4] and Gaffney (2004). So here we attempt to overcome the limitations of the EM algorithm in the case of regression mixtures and their use in model-based curve clustering by regularizing the model and proposing an EM-like algorithm for the inference, which is robust with regard to initialization and automatically estimates the optimal number of clusters as the learning proceeds.

The presented approach, as developed in [J-8][C-5], is in the same spirit as the EM-like algorithm presented in Yang et al. (2012), but extends the idea to the case of functional data (curve) clustering, rather than multivariate data clustering. This leads to fitting regression mixture models (including splines or B-splines) of the form (4.1). For estimating the regression mixture model (4.1), rather than maximizing the standard observed-data log-likelihood (4.2), we attempt to maximize a penalized version of it. The penalized log-likelihood function we propose to maximize is constructed by penalizing the observed-data log-likelihood by a regularization term related, as we will see, to the model complexity, and is defined by:

$$\mathcal{J}(\lambda, \theta) = \log L(\theta) - \lambda H(Z), \quad \lambda \geq 0 \qquad (4.3)$$

where $\log L(\theta)$ is the observed-data log-likelihood maximized by the standard EM algorithm for regression mixtures (see Eq. (4.2)) and $\lambda \geq 0$ is a parameter that controls the complexity of the fitted model. This penalized log-likelihood function makes it possible to control the complexity of the model fit through the roughness penalty $H(Z)$, which accounts for the model complexity. As the model complexity is related in particular to the number of mixture components, and therefore to the structure of the hidden variables $Z_i$ (recall that $Z_i$ represents the class label of the $i$th curve), we chose to use the entropy of the hidden variable $Z_i$ as penalty. The penalized log-likelihood criterion is therefore derived as follows. The entropy of $Z_i$ is defined by:

$$H(Z_i) = -\sum_{k=1}^{K} \mathbb{P}(Z_i = k) \log \mathbb{P}(Z_i = k) = -\sum_{k=1}^{K} \pi_k \log \pi_k. \qquad (4.4)$$

By assuming that the variables $Z = (Z_1, \ldots, Z_n)$ are i.i.d, which is in general the assumption in clustering using mixtures, where the cluster labels are assumed to be distributed according to a Multinomial distribution, the whole entropy for $Z$ is therefore additive and we have


$$H(Z) = -\sum_{i=1}^{n} \sum_{k=1}^{K} \pi_k \log \pi_k = -n \sum_{k=1}^{K} \pi_k \log \pi_k, \qquad (4.5)$$

which leads to the following penalized log-likelihood criterion:

$$\mathcal{J}(\lambda, \theta) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y_i; X_i \beta_k, \sigma_k^2 I_{m_i}) + \lambda n \sum_{k=1}^{K} \pi_k \log \pi_k. \qquad (4.6)$$

This penalized log-likelihood function (4.6) makes it possible to control the complexity of the model fit through the roughness penalty $\lambda n \sum_{k=1}^{K} \pi_k \log \pi_k$. The entropy term $-n \sum_{k=1}^{K} \pi_k \log \pi_k$ in the penalty measures the complexity of a fitted model with $K$ clusters. When the entropy is large, the fitted model is rougher, and when it is small, the fitted model is smoother. The non-negative smoothing parameter $\lambda$ establishes a trade-off between closeness of fit to the data and a smooth fit. As $\lambda$ increases, the entropy is penalized more strongly, the fitted model tends to be less complex, and we get a smoother fit; as $\lambda$ decreases, the criterion approaches the unpenalized log-likelihood and the result is a rougher fit. The next section presents the proposed robust EM-like algorithm to maximize the penalized observed-data log-likelihood $\mathcal{J}(\lambda, \theta)$ for regression mixture density estimation and model-based curve clustering.
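Before turning to that algorithm, here is a tiny numerical sketch of the criterion (4.6); the names and values are illustrative, and the log-likelihood is simply passed in as a precomputed number.

```python
import numpy as np

def entropy_penalty(pi):
    """Entropy -sum_k pi_k log pi_k of the mixing proportions, cf. Eqs. (4.4)-(4.5)."""
    pi = np.asarray(pi)
    return -np.sum(pi * np.log(pi))

def penalized_criterion(loglik, pi, lam, n):
    """Penalized criterion J(lambda, theta) of Eq. (4.6): loglik - lambda * n * H(pi)."""
    return loglik - lam * n * entropy_penalty(pi)

# same fit, two sets of proportions: balanced proportions are penalized more
print(penalized_criterion(-1000.0, [0.25, 0.25, 0.25, 0.25], lam=0.5, n=100))
print(penalized_criterion(-1000.0, [0.85, 0.05, 0.05, 0.05], lam=0.5, n=100))
```

For the same log-likelihood value, a model with balanced proportions (high entropy) is penalized more than one dominated by a few components, which is exactly the mechanism used below to discard spurious clusters.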

4.2.2 Regularized maximum likelihood estimation via a robust EM-like algorithm

Given an i.i.d sample of n curves D = ((x1, y1),..., (xn, yn)), the penalized log-likelihood (4.6) is iteratively maximized by using the following robust EM-like algorithm. Before giving the EM steps, we give the penalized complete-data log-likelihood, on which the algorithm formulation is based. The complete-data log-likelihood, in this penalized case, is given by:

$$\mathcal{J}_c(\lambda, \theta) = \sum_{i=1}^{n} \sum_{k=1}^{K} Z_{ik} \log \left[ \pi_k \, \mathcal{N}(y_i; X_i \beta_k, \sigma_k^2 I_{m_i}) \right] + \lambda n \sum_{k=1}^{K} \pi_k \log \pi_k \qquad (4.7)$$

where $Z_{ik}$ is an indicator binary-valued variable such that $Z_{ik} = 1$ if $Z_i = k$ (i.e., if the $i$th curve $(x_i, y_i)$ is generated from the $k$th regression mixture component) and $Z_{ik} = 0$ otherwise. After starting with an initial solution (see Section 4.2.2 for the initialization strategy and stopping rule), the proposed algorithm alternates between the two following steps until convergence.

E-step This step computes the expectation of the penalized complete-data log-likelihood (4.7), given the observed data D and a current parameter vector θ(q):

$$Q(\lambda, \theta; \theta^{(q)}) = \mathbb{E}\left[\mathcal{J}_c(\lambda, \theta) \,|\, \mathcal{D}; \theta^{(q)}\right] = \sum_{i=1}^{n} \sum_{k=1}^{K} \tau_{ik}^{(q)} \log \left[ \pi_k \, \mathcal{N}(y_i; X_i \beta_k, \sigma_k^2 I_{m_i}) \right] + \lambda n \sum_{k=1}^{K} \pi_k \log \pi_k \qquad (4.8)$$

where

$$\tau_{ik}^{(q)} = \mathbb{P}(Z_i = k | y_i, x_i; \theta^{(q)}) = \frac{\pi_k^{(q)} \, \mathcal{N}(y_i; X_i \beta_k^{(q)}, \sigma_k^{2(q)} I_{m_i})}{\sum_{h=1}^{K} \pi_h^{(q)} \, \mathcal{N}(y_i; X_i \beta_h^{(q)}, \sigma_h^{2(q)} I_{m_i})} \qquad (4.9)$$

is the posterior probability that the curve $(x_i, y_i)$ is generated by the $k$th cluster. This step therefore only requires the computation of the posterior component memberships $\tau_{ik}^{(q)}$ $(i = 1, \ldots, n)$ for each of the $K$ components.

M-step This step updates the value of the parameter vector $\theta$ by maximizing the Q-function (4.8) with respect to $\theta$, that is, by computing the parameter vector update $\theta^{(q+1)} = \arg\max_{\theta} Q(\lambda, \theta; \theta^{(q)})$. The mixing proportion updates are given by (see for example Appendix B in [J-8] for more calculation details):

$$\pi_k^{(q+1)} = \frac{1}{n} \sum_{i=1}^{n} \tau_{ik}^{(q)} + \lambda \pi_k^{(q)} \left( \log \pi_k^{(q)} - \sum_{h=1}^{K} \pi_h^{(q)} \log \pi_h^{(q)} \right). \qquad (4.10)$$
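A minimal sketch of this proportion update, with illustrative array names and toy values, is given below.

```python
import numpy as np

def update_mixing_proportions(tau, pi_old, lam):
    """Mixing-proportion update of Eq. (4.10).

    tau: (n, K) posterior memberships tau_ik^(q) from the E-step;
    pi_old: (K,) current proportions pi_k^(q); lam: penalization coefficient.
    """
    entropy_like = pi_old * (np.log(pi_old) - np.sum(pi_old * np.log(pi_old)))
    return tau.mean(axis=0) + lam * entropy_like

# toy check: lam = 0 recovers the standard EM update mean(tau)
tau = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])
print(update_mixing_proportions(tau, np.array([0.5, 0.5]), lam=0.0))
print(update_mixing_proportions(tau, np.array([0.7, 0.3]), lam=0.5))
```

With λ = 0 the update reduces to the standard EM update; with λ > 0 the larger component is enhanced at the expense of the smaller one, which is the competition mechanism described next.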


We can remark here that the update of the mixing proportions (4.10) is close to the standard EM update $\frac{1}{n}\sum_{i=1}^{n} \tau_{ik}^{(q)}$ for very small values of $\lambda$. However, for a large value of $\lambda$, the penalization term plays its role in making the clusters competitive, which allows for discarding invalid clusters and enhancing actual clusters. Indeed, in the updating formula (4.10), we can see that, for cluster $k$, if

$$\log \pi_k^{(q)} - \sum_{h=1}^{K} \pi_h^{(q)} \log \pi_h^{(q)} > 0, \qquad (4.11)$$

that is, if the logarithm of the current proportion $\log \pi_k^{(q)}$ exceeds $\sum_{h=1}^{K} \pi_h^{(q)} \log \pi_h^{(q)}$ (the negative entropy of the current proportions), then the entropy of the hidden variables is decreasing, the model complexity tends to be stable, and cluster $k$ has to be enhanced. This explicitly results in the fact that the update of the $k$th mixing proportion $\pi_k^{(q+1)}$ in (4.10) will increase. On the other hand, if (4.11) is less than 0, the cluster is not informative and its proportion will decrease. Furthermore, the penalization coefficient $\lambda$ can be set in an adaptive way (see [J-8]), so as to be large, in order to enhance competition, when the proportions are not increasing enough from one iteration to another. In that case, the robust algorithm plays its role of estimating the number of clusters (which decreases in such a situation, by discarding small invalid clusters). In practice, a cluster $k$ can be discarded if its proportion is not significant, e.g., less than $\frac{1}{n}$, that is, $\pi_k^{(q)} < \frac{1}{n}$. On the other hand, $\lambda$ has to become small when the proportions are increasing sufficiently as the learning proceeds and the partition can therefore be considered as stable. In this case, the robust EM-like algorithm tends to have the same behavior as the standard EM. The regularization coefficient $\lambda$ is also restricted to $[0, 1]$ to prevent very large values. Then, the regression parameters $(\beta_k, \sigma_k^2)$ are updated by analytically solving weighted least-squares problems where the weights are the posterior probabilities $\tau_{ik}^{(q)}$, and the updates are given by:

$$\beta_k^{(q+1)} = \left[ \sum_{i=1}^{n} \tau_{ik}^{(q)} X_i^T X_i \right]^{-1} \sum_{i=1}^{n} \tau_{ik}^{(q)} X_i^T y_i, \qquad (4.12)$$

$$\sigma_k^{2(q+1)} = \frac{1}{\sum_{i=1}^{n} \tau_{ik}^{(q)} m_i} \sum_{i=1}^{n} \tau_{ik}^{(q)} \left\| y_i - X_i \beta_k^{(q+1)} \right\|^2, \qquad (4.13)$$

where the posterior cluster probabilities $\tau_{ik}^{(q)}$ given by (4.9) are computed using the updated mixing proportions derived in (4.10). Finally, once the model parameters have been estimated, a fuzzy partition of the data into $K$ clusters, represented by the estimated posterior cluster probabilities $\hat{\tau}_{ik}$, is obtained. A hard partition can also be computed according to Bayes' optimal allocation rule, that is, by assigning each curve to the component having the highest posterior probability (4.9).

Initialization strategy and stopping rule The initial number of clusters is $K^{(0)} = n$, $n$ being the total number of curves, and the initial mixing proportions are $\pi_k^{(0)} = 1/K^{(0)}$ $(k = 1, \ldots, K^{(0)})$. Then, to initialize the regression parameters $\beta_k$ and the noise variances $\sigma_k^2$ $(k = 1, \ldots, K^{(0)})$, we fitted a polynomial regression model on each curve $k$ $(k = 1, \ldots, K^{(0)})$; the initial values of the regression parameters are thus given by $\beta_k^{(0)} = (X_k^T X_k)^{-1} X_k^T y_k$ and the noise variance can be deduced as $\sigma_k^{2(0)} = \frac{1}{m_k} \| y_k - X_k \beta_k^{(0)} \|^2$. To avoid singularities at the starting point, we set $\sigma_k^{2(0)}$ to a middle value in the sorted range of $\| y_k - X_k \beta_k^{(0)} \|^2$ for $k = 1, \ldots, n$. The algorithm is stopped when the maximum variation of the estimated regression parameters between two iterations, $\max_{1 \leq k \leq K^{(q)}} \| \beta_k^{(q+1)} - \beta_k^{(q)} \|$, is less than a fixed threshold $\upsilon$ (e.g., $10^{-6}$).
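Putting the E-step (4.9), the penalized proportion update (4.10), the weighted least-squares updates (4.12)–(4.13), the initialization above, and the 1/n discarding rule together, a self-contained Python sketch of the procedure could look as follows. It assumes a common design matrix for all curves and a fixed λ, whereas [J-8] sets λ adaptively; all names are illustrative and this is a simplified sketch, not the reference implementation.

```python
import numpy as np
from scipy.stats import norm

def robust_em_regression_mixture(X, Y, lam=0.1, max_iter=200, tol=1e-6):
    """EM-like fitting of a regression mixture with cluster annihilation.

    X: (m, p) common design matrix; Y: (n, m) curves, one per row.
    Starts with K = n clusters (one per curve) and discards clusters
    whose proportion drops below 1/n, as described in the text.
    """
    n, m = Y.shape
    # initialization: one component per curve, fitted by ordinary least squares
    betas = np.linalg.lstsq(X, Y.T, rcond=None)[0].T                 # (K, p)
    resid = ((Y - betas @ X.T) ** 2).sum(axis=1) / m
    sigma2 = np.full(n, np.sort(resid)[n // 2])                      # middle value, avoids singularities
    pi = np.full(n, 1.0 / n)

    for _ in range(max_iter):
        K = len(pi)
        # E-step: responsibilities, Eq. (4.9)
        logdens = np.empty((n, K))
        for k in range(K):
            logdens[:, k] = np.log(pi[k]) + norm.logpdf(
                Y, loc=X @ betas[k], scale=np.sqrt(sigma2[k])).sum(axis=1)
        logdens -= logdens.max(axis=1, keepdims=True)
        tau = np.exp(logdens)
        tau /= tau.sum(axis=1, keepdims=True)

        # M-step: penalized proportion update, Eq. (4.10)
        pi_new = tau.mean(axis=0) + lam * pi * (np.log(pi) - np.sum(pi * np.log(pi)))
        # discard clusters with proportion below 1/n, then renormalize
        keep = pi_new >= 1.0 / n
        pi = pi_new[keep] / pi_new[keep].sum()
        tau = tau[:, keep]
        tau /= tau.sum(axis=1, keepdims=True)
        betas, sigma2 = betas[keep], sigma2[keep]

        # M-step: weighted least squares, Eqs. (4.12)-(4.13)
        betas_old = betas.copy()
        for k in range(len(pi)):
            w = tau[:, k]
            betas[k] = np.linalg.solve(X.T @ X * w.sum(), X.T @ (w @ Y))
            sigma2[k] = (w @ ((Y - X @ betas[k]) ** 2).sum(axis=1)) / (w.sum() * m)

        if np.abs(betas - betas_old).max() < tol:
            break
    return pi, betas, sigma2, tau
```

The stopping rule follows the maximum parameter variation criterion stated above, and the weighted least-squares step simplifies because all curves share the same design matrix in this sketch.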

Choosing the order of regression and spline knots number and locations For a general use of the proposed algorithm for the polynomial regression mixture, the order of regression can be chosen by cross-validation techniques as in Gaffney (2004). In our experiments, we report the results corresponding to the degree for which the polynomial regression mixture provides the best fit. However, in some situations, the PRM model may be too simple to capture the full structure of the data, in particular for curves with high non-linearity or with regime changes, even if it can be seen as providing a useful first-order approximation of the data structure. The (B-)spline regression models are more adapted in such cases. For these models, one may need to choose the spline order as well as the number of knots and their locations.

For the order of regression in (B-)splines, we notice that, in practice, the most widely used orders are M = 1, 2, and 4 (Hastie et al., 2010). For smooth function approximation, cubic (B-)splines, which correspond to a (B-)spline of order 4 and thus have two continuous derivatives, are sufficient. When the data present irregularities, such as piecewise non-continuous functions, a linear spline (of order 2) is more adapted; this was used in particular for the satellite data set. The order 1 can be chosen for piecewise constant data. Concerning the choice of the number of knots and their locations, a common choice is to place a number of knots uniformly spaced across the range of x. In general, more knots are needed for functions with high non-linearity or regime changes. One can also use automatic techniques for the selection of the number of knots and their locations, as reported in Gaffney (2004). For example, this can be performed by using cross-validation as in Ruppert and Carroll (2003). In Kooperberg and Stone (1991), the knots are placed at selected order statistics of the sample data and the number of knots is determined, among other ways, by minimizing a variant of AIC. The general goal is to use a sufficient number of knots to fit the data while avoiding over-fitting and excessive computation. The current algorithm can easily be extended to handle this type of automatic selection of spline knot placement, but as the unsupervised clustering problem itself requires much attention and is difficult, it is wise to fix the number and location of knots. In this proposal, knot sequences uniformly spaced across the range of x are used, as illustrated in the sketch below. The studied problems are not very sensitive to the number and location of knots; a few equispaced knots (fewer than ten for the data studied here) are sufficient to provide a reasonable fit of the data.
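As an illustration of how such a spline regression matrix can be built with uniformly spaced knots, the sketch below uses a truncated power basis, one standard construction of the constrained piecewise polynomials mentioned above (a B-spline basis would be an equivalent, numerically better-conditioned alternative); names and default values are illustrative.

```python
import numpy as np

def spline_design_matrix(x, order=4, n_knots=7):
    """Truncated power basis regression matrix for spline regression.

    order = degree + 1 (order 4 = cubic splines); interior knots are
    uniformly spaced across the range of x, as used in the text.
    """
    x = np.asarray(x, dtype=float)
    degree = order - 1
    knots = np.linspace(x.min(), x.max(), n_knots + 2)[1:-1]   # interior knots only
    # polynomial part: 1, x, ..., x^degree
    cols = [x ** j for j in range(degree + 1)]
    # truncated power functions (x - kappa)_+^degree, one per interior knot
    cols += [np.clip(x - kappa, 0.0, None) ** degree for kappa in knots]
    return np.column_stack(cols)

x = np.linspace(0, 1, 100)
X = spline_design_matrix(x, order=4, n_knots=7)
print(X.shape)   # (100, 11): (degree + 1) polynomial columns + one per knot
```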

4.2.3 Experiments

The proposed unsupervised algorithm was evaluated in [J-8][C-5] for the three regression mixture models, that is, polynomial, spline, and B-spline regression mixtures, respectively abbreviated as PRM, SRM, and bSRM, by performing numerous experiments carried out on simulations, the Breiman waveform benchmark (Breiman et al., 1984), and three real-world data sets covering three different application areas: phoneme recognition in speech processing, clustering gene expression time-course data for bio-informatics, and clustering radar waveform data. The evaluation is performed in terms of estimating the actual partition, by considering the estimated number of clusters and the clustering accuracy (misclassification error) when the true partition is known.

Simulation results In summary, the number of clusters is correctly estimated by the proposed algorithm for the three models. The spline regression models provide slightly better results in terms of cluster approximation than the polynomial regression mixture. On the other hand, the regression mixture models with the proposed EM-like algorithm outperform the standard K-means and EM-GMM clustering algorithms. Different simulation scenarios were also designed to assess the behavior of the proposed approach with respect to the number of observations, the dimension of each observation, the number of clusters in the data, and the cluster proportions. Simulations S1 were designed to assess the capacity of the proposed approach to retrieve partitions with a small number of clusters, while simulations S2 were designed to retrieve partitions with a large number of clusters. Both include well-separated clusters as well as poorly separated clusters. The true partition is correctly estimated in most cases. The clusters which are not well separated (merged) are also retrieved with success. The model indeed accommodates mixture components with different noise variances (heteroskedastic model), which allows recovering merged clusters of functions that differ only in their noise variances.

Phonemes data The phonemes data set used in Ferraty and Vieu (2003)1 is a part of the original one available at http://www-stat.stanford.edu/ElemStatLearn and was described and used notably in Hastie et al. (1995). The application context related to this data set is a phoneme classification problem. The phonemes data correspond to log-periodograms y constructed from recordings available at different equispaced frequencies x for different phonemes. The data set contains five classes corresponding to the following five phonemes: "sh" as in "she", "dcl" as in "dark", "iy" as in "she", "aa" as in "dark", and "ao" as in "water". For each phoneme we have 400 log-periodograms at a 16-kHz sampling rate. We only retain the first 150 frequencies from each subject, as in Ferraty and Vieu (2003).

1Data from http://www.math.univ-toulouse.fr/staph/npfda/

This data set has been considered in a phoneme discrimination problem, as in Hastie et al. (1995) and Ferraty and Vieu (2003), where the aim is to predict the phoneme class for a new log-periodogram. Here we reformulate the problem into a clustering problem where the aim is to automatically group the phonemes data into classes. We therefore assume that the cluster labels are missing. We also assume that the number of clusters is unknown. Thus, the proposed algorithm is assessed in terms of estimating both the actual partition and the optimal number of clusters from the data. The number of phoneme classes (five) is correctly estimated by the three models. The spline regression mixture (SRM) results are closely similar to those provided by the bSRM model. The spline regression models provide better results in terms of classification error (14.2 %) and cluster approximation than the polynomial regression mixture. In functional data modeling, splines are indeed more adapted than simple polynomial modeling. The number of clusters decreases very rapidly from 1000 to 51 for the polynomial regression mixture model, and to 44 for the spline and B-spline regression mixture models. The great majority of invalid clusters is discarded at the beginning of the learning process. Then, the number of clusters gradually decreases from one iteration to another for the three models, and the algorithm converges toward a partition with the actual number of clusters for the three models after at most 43 iterations. Figure 4.1 shows the 1000 phoneme log-periodograms used (upper-left) and the clustering partition obtained by the proposed unsupervised algorithm with the B-spline regression mixture (bSRM).

[Figure 4.1 panels: "Original data" and the five clusters retrieved by the robust EM-MixReg algorithm at iteration 31 with K = 5; x: frequency, y: log-periodogram.]

Figure 4.1: Phonemes data and clustering results obtained by the proposed robust EM-like algorithm with the bSRM model using a cubic B-spline with seven knots. The five sub-figures correspond to the automatically retrieved clusters, which correspond to the phonemes "ao", "aa", "iy", "dcl", "sh".

Yeast cell cycle data In this experiment, we consider the yeast cell cycle data set of Cho et al. (1998). The original yeast cell cycle data represent the fluctuation of expression levels of approximately 6000 genes over 17 time points corresponding to two cell cycles (Cho et al., 1998). This data set has been used to demonstrate the effectiveness of clustering techniques for time-course gene expression data in bio-informatics, such as model-based clustering as in Yeung et al. (2001). We used the standardized subset constructed by Yeung et al. (2001), available at http://faculty.washington.edu/kayee/model/1. This data set, referred to as the subset of the 5-phase criterion in Yeung et al. (2001), contains n = 384 gene expression levels over m = 17 time points. The usefulness of the cluster analysis in this case is therefore to automatically reconstruct this five-class partition. Both the PRM and the SRM models provide similar partitions with four clusters, in which two of the clusters are merged into one. Note that some model selection criteria in Yeung et al. (2001) also provide four clusters in some situations. However, the bSRM model correctly infers the actual number of clusters.

1The complete data are from http://genome-www.stanford.edu/cellcycle/.

The Rand index (RI)1 for the obtained partition equals 0.7914, which indicates that the partition is quite well defined.

Topex/Poseidon data set The last real data set considered is the Topex/Poseidon radar satellite data set2, used notably in Dabo-Niang et al. (2007) and Hébrail et al. (2010). This data set was registered by the Topex/Poseidon satellite over an area of 25 kilometers around the Amazon River. The data contain n = 472 waveforms of the measured echoes, sampled at m = 70 echoes. The actual number of clusters and the actual partition are unknown for this data set. The solution provided by the polynomial regression mixture (PRM) is rather an overall rough approximation and provides three clusters. The polynomial fitting is not adapted for this type of curves, because the curves present in particular peaks and transitions. The solutions provided by the proposed algorithm with the spline regression mixture (SRM) and the B-spline regression mixture (bSRM) are very close and are more informative about the underlying structure of this data set. We used a linear (B-)spline for this data set in order to allow piecewise linear function approximation and thus to better recover the possible peaks and transitions in the curves. As a result, both the SRM and the bSRM provide a five-class partition. The partitions are quasi-identical and contain clearly informative clusters with different shapes of waves that summarize the general underlying structure governing this data set. In addition, the number of clusters found (five) also equals the one found by Dabo-Niang et al. (2007) using another, hierarchical nonparametric kernel-based, unsupervised classification technique. The mean curves for the five groups provided by the proposed approach, for both the SRM and the bSRM, are similar to those in Dabo-Niang et al. (2007). On the other hand, this result is similar to the one found in Hébrail et al. (2010); most of the profiles are indeed present in the two results. There is a slight difference, which can be attributed to the fact that the results in Hébrail et al. (2010) are obtained from a two-stage scheme which includes an additional pre-clustering step using the Self Organizing Map (SOM), rather than by directly applying the piecewise regression model to the raw data. We also notice that, in the study of Hébrail et al. (2010), the number of clusters was set to twenty and the clustering procedure was two-stage: the authors first performed a topographic clustering step using the SOM, and then applied a K-means-like approach to the results of the SOM. In our approach, however, we directly apply the proposed algorithm to the raw satellite data without a preprocessing step. In addition, the number of clusters is automatically inferred from the data. The five clusters found here summarize the general behavior of the twenty clusters in Hébrail et al. (2010), which can be described as clusters with one narrow shifted peak, a less narrow peak, two large peaks, and finally a cluster containing curves with a sharp increase followed by a slow decrease. For this data set, the algorithm converged after at most 35 iterations. After starting with n = 472 clusters, the number of clusters rapidly decreases to 59 for the PRM and to 95 for both the SRM and the bSRM models. Then it gradually decreases until the number of clusters is stabilized. The variation of the value of the objective function during the iterations of the algorithm also shows that it becomes flat at convergence, which corresponds to the stabilization of the partition.

4.2.4 Conclusion

Here I presented a new robust EM-like algorithm for fitting regression mixtures and for model-based curve clustering. It optimizes a penalized observed-data log-likelihood and overcomes both the problem of sensitivity to initialization and that of determining the optimal number of clusters for the standard EM for regression mixtures. Note that the proposed algorithm, as it proceeds to the estimation of the number of components, does not guarantee the ascent property of the objective function and is thus not a true EM algorithm. Even if this property is not established, in practice the algorithm works very well and does converge towards very satisfactory solutions for the several data sets on which it was applied. Compared to standard EM fitting, this constitutes an interesting fully unsupervised alternative that simultaneously infers the model and its optimal number of components. The experimental results on simulated data and real-world data demonstrate the benefit of the proposed approach for applications in curve clustering. The obtained clustering results are quite precise and the number of clusters was always correctly selected. For the phonemes data and the yeast cell cycle data, the polynomial degree providing the best solution was retained.

1The Rand Index measures the similarity between two data clusterings. It has a value between 0 and 1, with 0 indicating that the two partitions do not agree on any pair of observations and 1 indicating that the data clusters are exactly the same. For more details on the RI, see Rand (1971).
2Available at http://www.lsp.ups-tlse.fr/staph/npfda/npfda-datasets.html.

However, for a more general use in functional data clustering and approximation, the splines are clearly more adapted. In practice, for the spline and B-spline regression mixtures, we used cubic (B-)splines, that is, (B-)splines of order 4, which are sufficient to approximate smooth functions. However, when the data present irregularities, such as piecewise non-continuous functions, which is the case of the Topex/Poseidon satellite data, we use a linear (B-)spline approximation. We also note that the algorithm is fast for the three models. It converged after a few iterations, taking less than 45 seconds for the phonemes data and only a few seconds for the other data sets. This makes it useful for real practical situations. Here, I considered the problem of unsupervised fitting of regression mixtures with an unknown number of components. One interesting future direction is to extend the proposed approach to the problem of fitting hidden process regression models (e.g., those seen in Chapter 2), or mixtures of experts (Jacobs et al., 1991) and hierarchical mixtures of experts (Jordan and Jacobs, 1994), with an unknown number of experts. A further challenging extension might consist in extending this approach to the unsupervised simultaneous clustering and segmentation of functional data, say with the models seen in Chapter 3.

4.3 Bayesian mixtures of spatial spline regressions

The previous section was dedicated to regression mixtures for univariate functional data with a regularization of their maximum likelihood fitting. In this section, I investigate regression mixtures with three additional features: the first is the extension of regression mixtures to include random effects, the second is the formulation of the model for spatial functional data, and the third is the full Bayesian inference of the proposed models. This sub-axis therefore concerns the framework of Bayesian regression mixture modeling for spatial functional data, where the data are surfaces. I present a probabilistic Bayesian formulation to model spatial functional data, extending the approach of Nguyen et al. (2014), and apply the proposal to surface approximation and clustering. The model is also related to the random-effects mixture model of Lenk and DeSarbo (2000), to which I explicitly add mixed-effects and which I derive for spatial functional data by using the Nodal basis functions (NBFs). The NBFs (Malfait and Ramsay, 2003), used in Ramsay et al. (2011), Sangalli et al. (2013), and Nguyen et al. (2014), represent an extension of the univariate B-spline bases to bivariate surfaces. I thus first introduce a Bayesian spatial spline regression model with mixed-effects (BSSR) for fitting a population of homogeneous surfaces. The BSSR model accommodates both a common mean behavior for the data, through a fixed-effects part, and inter-individual variability, thanks to a random-effects part. Then, in order to model populations of spatial functional data arising from heterogeneous groups, I integrate the BSSR model into a mixture framework. The resulting model is a Bayesian mixture of spatial spline regressions with mixed-effects (BMSSR), used for density estimation and model-based surface clustering. The models, through their Bayesian formulation, make it possible to integrate possible prior knowledge on the data structure and constitute a good alternative to the recent mixture of spatial spline regressions model of Nguyen et al. (2014), estimated in a maximum likelihood framework via the expectation-maximization (EM) algorithm. The inference of the proposed Bayesian models is performed by Markov Chain Monte Carlo (MCMC) sampling, and I derive two Gibbs samplers to infer the BSSR and the BMSSR models. The BSSR model is first applied to surface approximation. Then, the BMSSR model is applied to model-based surface clustering by considering the real-world handwritten digits from the MNIST data set (LeCun et al., 1998). The obtained results highlight the potential benefit of the proposed Bayesian approaches for modeling surfaces, possibly organized in clusters.

4.3.1 Bayesian inference by Markov Chain Monte Carlo (MCMC) sampling

In this section, I briefly introduce the principle of Bayesian inference using MCMC sampling and its use for latent data models. I will use $p(.)$ as a generic notation for a density function. In the Bayesian inference framework, the estimation of the parameter vector $\Psi$ of a model is performed by maximizing the posterior distribution $p(\Psi|X)$ for a given prior distribution $p(\Psi)$ and a likelihood function $L(X|\Psi)$. By using Bayes' theorem, the posterior distribution of the model parameters is defined, up to a constant, by

$$p(\Psi|X) \propto p(\Psi) \, L(X|\Psi). \qquad (4.14)$$

Often this posterior distribution is difficult to calculate directly. In such situations, we use techniques to simulate realizations from this distribution. These techniques, known as sampling techniques, are grouped under the generic name of Markov Chain Monte Carlo (MCMC) (see for example Gilks et al. (1996); Robert and Casella (1999); Neal (1993)).


Markov Chain Monte Carlo The general principle of MCMC algorithms is to construct, from a target distribution $p(y)$, an ergodic Markov chain $(Y^{(1)}, \ldots, Y^{(M)})$ with stationary distribution equal to the target distribution, provided we can sample from the conditional distributions $p(y_i|y_{\setminus i})$ (as in Gibbs sampling) or, more generally, when we can calculate the ratio $p(y_i)/p(y_j)$ (as in Metropolis-Hastings). This is particularly useful when we cannot directly sample from the target distribution $p(y)$, as in inference for latent data models, particularly mixture models. Then, $Y^{(M)}$, for a sufficiently large value of $M$, can be considered as an approximate sample from the target distribution $p(y)$ (convergence in law). This principle of MCMC can also be used to approximate the expectation of any function $g(Y)$ by the ergodic mean

$$\mathbb{E}[g(Y)] = \lim_{M \to +\infty} \frac{1}{M} \sum_{t=1}^{M} g(y^{(t)}) \qquad (4.15)$$

and hence in practice the expectation can be approximated by

$$\mathbb{E}[g(Y)] \approx \frac{1}{M - M_0} \sum_{t=M_0+1}^{M} g(y^{(t)}) \qquad (4.16)$$

that is, after removing $M_0$ burn-in samples. One of the most widely used MCMC algorithms is the Gibbs sampler, which will be considered frequently in this manuscript. The first form of the Gibbs sampler goes back to Geman and Geman (1984) and was proposed in a framework of Bayesian image restoration. A very similar scheme was introduced by Tanner and Wong (1987) under the name of "data augmentation" for missing data problems, and is also presented in Gelfand and Smith (1990) and Diebolt and Robert (1994). The Gibbs sampler successively simulates realizations of each component $y_i$ of $y$, conditionally on the other components, that is:

$$y_i^{(t+1)} \sim p(y_i \,|\, y_1^{(t+1)}, \ldots, y_{i-1}^{(t+1)}, y_{i+1}^{(t)}, \ldots, y_n^{(t)})$$

and we then cycle iteratively until we have a sufficiently large number of samples. Of course, one issue is how to determine the sufficient number of samples. In Bayesian inference, the target distribution is the posterior distribution of the model parameters to be estimated $\Psi$, that is, $p(\Psi|X)$ given in (4.14). The sampling hence consists in drawing $(\Psi^{(1)}, \ldots, \Psi^{(M)})$ from the Markov chain to approximate the posterior. However, in latent data models, for example in mixture models, the unknown parameters are augmented by the hidden component labels $z$, and the target distribution in this case corresponds to the joint posterior distribution of the model parameters $\Psi$ and the component indicators $z$. The sampling then consists in alternating between generating the missing labels $z$ given the observations and the current parameter vector, that is, according to $p(z|X, \Psi^{(t)})$, and generating the parameter vector $\Psi$ given the observations and the current component labels, that is, according to $p(\Psi|X, z^{(t)})$, to finally produce a Markov chain on the model parameters and another one on the missing labels. The posterior inference of mixtures with MCMC goes back to the first works of Tanner and Wong (1987) and Gelfand and Smith (1990). Other key initial papers on Bayesian inference of mixtures using MCMC include Diebolt and Robert (1994); Escobar and West (1994), as well as more recent papers in the broad literature such as Richardson and Green (1997); Bensmail et al. (1997); Stephens (2000a); Celeux et al. (2000). In the next sections, I introduce the two Bayesian models and their Bayesian inference using MCMC (Gibbs) sampling.
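A minimal sketch of this data-augmentation scheme, instantiated on a deliberately simple two-component univariate Gaussian mixture (not one of the models of this chapter), is given below; `sample_labels` and `sample_parameters` stand for the two full-conditional draws p(z|X, Ψ) and p(Ψ|X, z), and all names and hyper-parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_labels(X, params):
    """Draw z ~ p(z | X, params) for a two-component Gaussian mixture (illustrative)."""
    pi, mu, sigma2 = params
    logp = np.log(pi) - 0.5 * (X[:, None] - mu) ** 2 / sigma2 - 0.5 * np.log(sigma2)
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return np.array([rng.choice(2, p=row) for row in p])

def sample_parameters(X, z):
    """Draw params ~ p(params | X, z): N(0, 1) prior on each mean, unit noise variance."""
    counts = np.array([(z == k).sum() for k in (0, 1)])
    pi = rng.dirichlet(1.0 + counts)
    mu = np.array([rng.normal(X[z == k].sum() / (counts[k] + 1.0),
                              1.0 / np.sqrt(counts[k] + 1.0)) for k in (0, 1)])
    sigma2 = np.ones(2)          # kept fixed to keep the sketch short
    return pi, mu, sigma2

# data augmentation: alternate z | X, params and params | X, z
X = np.concatenate([rng.normal(-2, 1, 150), rng.normal(3, 1, 150)])
params = (np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.ones(2))
chain = []
for t in range(500):
    z = sample_labels(X, params)
    params = sample_parameters(X, z)
    chain.append(params[1])      # keep the component means
burn_in = 100
print(np.mean(chain[burn_in:], axis=0))   # posterior mean estimate after burn-in, cf. (4.16)
```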

4.3.2 Mixtures of spatial spline regressions with mixed-effects

Before introducing the proposed Bayesian modeling, this section is dedicated to related work on the mixture of spatial spline regressions (SSR) with mixed-effects (MSSR), introduced by Ng and McLachlan (2014), since the key difference between the two approaches resides in the added prior distributions on the model parameters and the resulting posterior inference. I first describe the regression model with linear mixed-effects and its mixture formulation, in the general case, and then describe the models for spatial regression data.


Regression with mixed-effects The mixed-effects regression models (see for example Laird and Ware (1982), Verbeke and Lesaffre (1996) and Xu and Hedeker (2001)) are appropriate when the standard regression model (with fixed-effects) cannot sufficiently explain the data, for example when representing dependent data arising from related individuals or when data are gathered over time on the same individuals. In that case, the mixed-effects regression model is more appropriate, as it includes both fixed-effects and random-effects terms. In the linear mixed-effects regression model, the $m_i \times 1$ response $y_i = (y_{i1}, \ldots, y_{im_i})^T$ is modeled as:

$$y_i = X_i \beta + T_i b_i + e_i \qquad (4.17)$$

where the $p \times 1$ vector $\beta$ is the usual unknown fixed-effects regression coefficients vector describing the population mean, $b_i$ is a $q \times 1$ vector of unknown subject-specific regression coefficients corresponding to individual effects, independently and identically distributed (i.i.d) according to the normal distribution $\mathcal{N}(\mu_i, R_i)$ and independent from the $m_i \times 1$ error terms $e_i$, which are distributed according to $\mathcal{N}(0, \Sigma_i)$, and $X_i$ and $T_i$ are respectively $m_i \times p$ and $m_i \times q$ known covariate matrices. A common choice for the noise covariance matrix is to take a diagonal matrix $\Sigma_i = \sigma^2 I_{m_i}$, where $I_{m_i}$ denotes the $m_i \times m_i$ identity matrix. Thus, under this model, the joint distribution of the observations $y_i$ and the random effects $b_i$ is the following joint multivariate normal distribution (see for example Xu and Hedeker (2001)):

$$\begin{pmatrix} y_i \\ b_i \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} X_i \beta + T_i \mu_i \\ \mu_i \end{pmatrix}, \begin{pmatrix} \sigma^2 I_{m_i} + T_i R_i T_i^T & T_i R_i \\ R_i T_i^T & R_i \end{pmatrix} \right). \qquad (4.18)$$

Then, from (4.18) it follows that the observations yi are marginally distributed according to the following normal distribution (see Verbeke and Lesaffre (1996) and Xu and Hedeker (2001)):

$$f(y_i | X_i, T_i; \Psi) = \mathcal{N}(y_i; X_i \beta + T_i \mu_i, \sigma^2 I_{m_i} + T_i R_i T_i^T). \qquad (4.19)$$
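As a quick numerical illustration of the marginal density (4.19), the sketch below builds the marginal mean and covariance of one individual and evaluates the corresponding Gaussian log-density with scipy; all dimensions and values are arbitrary.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)
m_i, p, q = 20, 3, 2
X_i = rng.standard_normal((m_i, p))      # fixed-effects design
T_i = rng.standard_normal((m_i, q))      # random-effects design
beta = np.array([1.0, -0.5, 0.2])        # fixed effects
mu_i = np.zeros(q)                       # mean of the random effects
R_i = 0.3 * np.eye(q)                    # covariance of the random effects
sigma2 = 0.5                             # noise variance

# marginal distribution of y_i, Eq. (4.19): N(X_i beta + T_i mu_i, sigma^2 I + T_i R_i T_i^T)
mean = X_i @ beta + T_i @ mu_i
cov = sigma2 * np.eye(m_i) + T_i @ R_i @ T_i.T

y_i = rng.multivariate_normal(mean, cov)
print(multivariate_normal(mean, cov).logpdf(y_i))
```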

Mixture of regressions with mixed-effects The regression model with mixed-effects (4.17) can be integrated into a finite mixture framework to deal with regression data arising from a finite number of groups. The resulting mixture of regressions model with linear mixed-effects (Verbeke and Lesaffre, 1996; Xu and Hedeker, 2001; Celeux et al., 2005; Ng et al., 2006) is a mixture model where every component k (k = 1,...,K) is a regression model with mixed-effects given by (4.17), K being the number of mixture components. Thus, the observation yi conditionally on each component k is modeled as:

$$y_i = X_i \beta_k + T_i b_{ik} + e_{ik} \qquad (4.20)$$

where $\beta_k$, $b_{ik}$ and $e_{ik}$ are respectively the fixed-effects regression coefficients, the random-effects regression coefficients for individual $i$, and the error terms, for component $k$. The random-effects coefficients $b_{ik}$ are i.i.d according to $\mathcal{N}(\mu_{ki}, R_{ki})$ and are independent from the error terms $e_{ik}$, which follow the distribution $\mathcal{N}(0, \sigma_k^2 I_{m_i})$. Let $Z_i$ denote the categorical random variable representing the component membership of the $i$th observation. Thus, conditional on the component $Z_i = k$, the observation $y_i$ and the random effects $b_i$ have the following joint multivariate normal distribution:

$$\begin{pmatrix} y_i \\ b_i \end{pmatrix} \Bigg|_{Z_i = k} \sim \mathcal{N}\left( \begin{pmatrix} X_i \beta_k + T_i \mu_{ki} \\ \mu_{ki} \end{pmatrix}, \begin{pmatrix} \sigma_k^2 I_{m_i} + T_i R_{ki} T_i^T & T_i R_{ki} \\ R_{ki} T_i^T & R_{ki} \end{pmatrix} \right) \qquad (4.21)$$

and thus the observations $y_i$ are marginally distributed according to the following normal distribution:

$$f(y_i | X_i, T_i, Z_i = k; \Psi_k) = \mathcal{N}(y_i; X_i \beta_k + T_i \mu_{ki}, T_i R_{ki} T_i^T + \sigma_k^2 I_{m_i}). \qquad (4.22)$$

The unknown parameter vector of this component-specific density is given by $\Psi_k = (\beta_k^T, \sigma_k^2, \mu_{k1}^T, \ldots, \mu_{kn}^T, \mathrm{vech}(R_{k1})^T, \ldots, \mathrm{vech}(R_{kn})^T)^T$. Thus, the marginal distribution of $y_i$, unconditionally on the component memberships, is given by the following mixture distribution:

$$f(y_i | X_i, T_i; \Psi) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y_i; X_i \beta_k + T_i \mu_{ki}, T_i R_{ki} T_i^T + \sigma_k^2 I_{m_i}) \qquad (4.23)$$


where the $\pi_k$'s are the usual mixing proportions. The unknown mixture model parameters, given by the parameter vector $\Psi = (\pi_1, \ldots, \pi_{K-1}, \Psi_1^T, \ldots, \Psi_K^T)^T$ where $\Psi_k$ is the parameter vector of component $k$, are usually estimated, given an i.i.d sample of $n$ observations, by maximizing the observed-data log-likelihood

$$\log L(\Psi) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y_i; X_i \beta_k + T_i \mu_{ki}, T_i R_{ki} T_i^T + \sigma_k^2 I_{m_i}) \qquad (4.24)$$

via the EM algorithm as in (Verbeke and Lesaffre, 1996; Xu and Hedeker, 2001; Celeux et al., 2005; Ng et al., 2006).

Mixtures of spatial spline regressions with mixed-effects For spatial regression data, Nguyen et al. (2014) introduced the spatial spline regression with linear mixed-effects (SSR). The model is given by (4.17) where the covariate matrices, which are assumed to be identical in Nguyen et al. (2014), that is, $T_i = X_i$, and are denoted by $S_i$ in this spatial case, represent a spatial structure and are calculated from the Nodal Basis Functions (NBF) (Malfait and Ramsay, 2003). Note that in what follows I will denote the number of columns of $S_i$ by $d$. The NBF idea is an extension of the B-spline bases, used in general for univariate or multivariate functions, to bivariate surfaces; it was first introduced by Malfait and Ramsay (2003) and then used notably in Ramsay et al. (2011) and Sangalli et al. (2013) for surfaces. As in Nguyen et al. (2014), it is assumed that the random effects are centered with an isotropic covariance matrix common to all the individuals, that is, $b_i \sim \mathcal{N}(0, \xi^2 I_d)$. Thus, from (4.22) it follows that, under the spatial spline regression model with linear mixed-effects, the density of the observation $y_i$ is given by

$$f(y_i | S_i; \Psi) = \mathcal{N}(y_i; S_i \beta, \xi^2 S_i S_i^T + \sigma^2 I_{m_i}). \qquad (4.25)$$

Then, under the mixture of spatial spline regression models with linear mixed-effects, the density of a surface is given by:

$$f(y_i | S_i; \Psi) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y_i; S_i \beta_k, \xi_k^2 S_i S_i^T + \sigma_k^2 I_{m_i}) \qquad (4.26)$$

where $\Psi = (\pi_1, \ldots, \pi_{K-1}, \beta_1^T, \ldots, \beta_K^T, \sigma_1^2, \ldots, \sigma_K^2, \xi_1^2, \ldots, \xi_K^2)^T$ is the model parameter vector. Both models are fitted by using the EM algorithm (Nguyen et al., 2014). In particular, for the mixture of spatial spline regressions, the EM algorithm maximizes the following observed-data log-likelihood:

$$\log L(\Psi) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(y_i; S_i \beta_k, \xi_k^2 S_i S_i^T + \sigma_k^2 I_{m_i}). \qquad (4.27)$$

More details on the EM developments for the two models can be found in Nguyen et al. (2014). Note that Nguyen et al. (2014) assumed a common noise variance $\sigma^2$ for all the mixture components in (4.26) and hence in (4.27).

4.3.3 Bayesian spatial spline regression with mixed-effects

I introduce a Bayesian probabilistic approach to the spatial spline regression model with mixed-effects, presented in Nguyen et al. (2014) in a maximum likelihood context. The proposed model is thus the Bayesian spatial spline regression with linear mixed-effects (BSSR) model. I first present the model and the parameter distributions, and then derive the Gibbs sampler for parameter estimation.

The model The Bayesian spatial spline regression with mixed-effects (BSSR) model is defined by:

yi = Si(β + bi) + ei (4.28) where the model parameters in this Bayesian framework are assumed to be random variables with specified prior distributions, and the spatial covariates matrix Si is computed from the Nodal basis functions.


Introduced by Malfait and Ramsay (2003), the idea of Nodal basis functions (NBFs) extends the use of B-splines from univariate function approximation (Ramsay and Silverman, 2005) to the approximation of surfaces. For a fixed number of basis functions $d$, defined on a regular grid with regularly spaced points $c^{(l)}$ $(l = 1, \ldots, d)$ of the domain we are working on, with $d = d_1 d_2$ where $d_1$ and $d_2$ are respectively the numbers of columns and rows of nodes, the $i$th surface can be approximated using piecewise linear Lagrangian triangular finite element NBFs constructed as in Sangalli et al. (2013) and Nguyen et al. (2014) (see also [J-11]). This construction leads to the following $m_i \times d$ spatial covariate matrix:

$$S_i = \begin{pmatrix} s(x_1; c_1) & s(x_1; c_2) & \cdots & s(x_1; c_d) \\ s(x_2; c_1) & s(x_2; c_2) & \cdots & s(x_2; c_d) \\ \vdots & \vdots & \ddots & \vdots \\ s(x_{m_i}; c_1) & s(x_{m_i}; c_2) & \cdots & s(x_{m_i}; c_d) \end{pmatrix} \qquad (4.29)$$

where $s(x; c)$ is a shortened notation for the NBF $s(x, c, \delta_1, \delta_2)$, with $x_{ij} = (x_{ij1}, x_{ij2})$ the two spatial coordinates of $y_{ij}$, $c = (c_1, c_2)$ a node center parameter, and $\delta_1$ and $\delta_2$ being respectively the vertical and horizontal shape parameters representing the distances between two consecutive centers. An example of an NBF defined on the rectangular domain $(x_1, x_2) \in [-1, 1] \times [-1, 1]$ with a single node $c = (0, 0)$ and $\delta_1 = \delta_2 = 1$ is presented in Figure 4.2.


Figure 4.2: Nodal basis function s(x, c, δ1, δ2), where c = (0, 0) and δ1 = δ2 = 1.
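To illustrate how a matrix of the form (4.29) is assembled in practice, the sketch below uses a simplified separable basis obtained as products of 1D hat functions on a regular grid of node centers; this is only an illustrative stand-in for the exact piecewise linear triangular finite-element NBFs of Malfait and Ramsay (2003), and all names are hypothetical.

```python
import numpy as np

def hat(u, center, delta):
    """1D piecewise-linear hat function centered at `center` with half-width `delta`."""
    return np.clip(1.0 - np.abs(u - center) / delta, 0.0, None)

def spatial_design_matrix(points, d1=4, d2=4, domain=((-1, 1), (-1, 1))):
    """m_i x d matrix S_i in the spirit of Eq. (4.29), with d = d1 * d2 node centers.

    points: (m_i, 2) spatial coordinates x_ij = (x_ij1, x_ij2).
    """
    (a1, b1), (a2, b2) = domain
    c1 = np.linspace(a1, b1, d1)
    c2 = np.linspace(a2, b2, d2)
    delta1, delta2 = c1[1] - c1[0], c2[1] - c2[0]
    cols = [hat(points[:, 0], u, delta1) * hat(points[:, 1], v, delta2)
            for u in c1 for v in c2]
    return np.column_stack(cols)               # (m_i, d1*d2)

rng = np.random.default_rng(3)
pts = rng.uniform(-1, 1, size=(50, 2))
S_i = spatial_design_matrix(pts, d1=4, d2=4)
print(S_i.shape)   # (50, 16)
```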

The model parameters of the proposed Bayesian model, which are given by the parameter vector $\Psi = (\beta^T, \sigma^2, b_1^T, \ldots, b_n^T, \xi^2)^T$, are assumed to be unknown random variables with the following prior distributions. I use conjugate priors for ease of calculation, as commonly done in the literature, for example in Diebolt and Robert (1994), Richardson and Green (1997), and Stephens (2000a). The priors used for the parameters are as follows:

$$\begin{aligned}
\beta &\sim \mathcal{N}(\mu_0, \Sigma_0) \\
b_i \,|\, \xi^2 &\sim \mathcal{N}(0_d, \xi^2 I_d) \\
\xi^2 &\sim \mathcal{IG}(a_0, b_0) \\
\sigma^2 &\sim \mathcal{IG}(g_0, h_0)
\end{aligned} \qquad (4.30)$$

where $(\mu_0, \Sigma_0)$ are the hyper-parameters of the normal prior over the fixed-effects coefficients, $\xi^2$ is the variance of the normal distribution over the random-effects coefficients, and $a_0$ and $b_0$ (respectively $g_0$ and $h_0$) are respectively the shape and scale parameters of the Inverse Gamma (IG) prior over the variance $\xi^2$ (respectively $\sigma^2$).

Bayesian inference using Gibbs sampling I use MCMC sampling for the Bayesian inference of the BSSR model. As seen before, MCMC sampling is indeed one of the most commonly used inference techniques in Bayesian analysis of mixtures, in particular the Gibbs sampler (e.g., see Diebolt and Robert (1994)).

The Gibbs sampler is implemented by deriving the full conditional posterior distributions of the model parameters. Thanks to the conjugate hierarchical prior (4.30) presented in the previous section, the full conditional posterior distributions can be calculated analytically (see [J-11] for more detail). Applying Bayes' theorem to the joint distribution leads to the following posterior distributions used in the Gibbs sampler. In what follows, the notation $|\ldots$ is used to denote the conditioning of the parameter in question on all the other parameters and the observed data.

$$\beta \,|\, \ldots \sim \mathcal{N}(\nu_0, V_0) \ \text{with} \ V_0^{-1} = \Sigma_0^{-1} + \frac{1}{\sigma^2} \sum_{i=1}^{n} S_i^T S_i, \quad \nu_0 = V_0 \left( \frac{1}{\sigma^2} \sum_{i=1}^{n} S_i^T (y_i - S_i b_i) + \Sigma_0^{-1} \mu_0 \right) \qquad (4.31)$$

$$b_i \,|\, \ldots \sim \mathcal{N}(\nu_1, V_1) \ \text{with} \ V_1^{-1} = \frac{1}{\sigma^2} S_i^T S_i + \frac{1}{\xi^2} I_d, \quad \nu_1 = V_1 \frac{1}{\sigma^2} S_i^T (y_i - S_i \beta) \qquad (4.32)$$

$$\sigma^2 \,|\, \ldots \sim \mathcal{IG}(g_1, h_1) \ \text{with} \ g_1 = g_0 + \frac{n}{2}, \quad h_1 = h_0 + \frac{1}{2} \sum_{i=1}^{n} (y_i - S_i \beta - S_i b_i)^T (y_i - S_i \beta - S_i b_i) \qquad (4.33)$$

$$\xi^2 \,|\, \ldots \sim \mathcal{IG}(a_1, b_1) \ \text{with} \ a_1 = a_0 + \frac{n}{2}, \quad b_1 = b_0 + \frac{1}{2} \sum_{i=1}^{n} b_i^T b_i \qquad (4.34)$$

The Gibbs sampler for the BSSR model then cycles by sampling from each of the above posterior distributions until a sufficiently large number of samples is reached.
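One possible compact implementation of this sampler is sketched below, cycling over the full conditionals (4.31)–(4.34) as written above (with the prior-mean term entering with a plus sign, as in the standard conjugate update, and the shape constants taken as stated); hyper-parameter defaults and the function name are illustrative.

```python
import numpy as np

def gibbs_bssr(S_list, y_list, n_samples=2000,
               mu0=None, Sigma0=None, a0=2.0, b0=1.0, g0=2.0, h0=1.0, seed=0):
    """Gibbs sampler for the BSSR model, cycling over Eqs. (4.31)-(4.34)."""
    rng = np.random.default_rng(seed)
    n, d = len(S_list), S_list[0].shape[1]
    mu0 = np.zeros(d) if mu0 is None else mu0
    Sigma0_inv = np.eye(d) if Sigma0 is None else np.linalg.inv(Sigma0)

    beta, b = np.zeros(d), np.zeros((n, d))      # initial values
    sigma2, xi2 = 1.0, 1.0
    draws = {"beta": [], "sigma2": [], "xi2": []}

    for _ in range(n_samples):
        # (4.31) fixed-effects coefficients
        V0 = np.linalg.inv(Sigma0_inv + sum(S.T @ S for S in S_list) / sigma2)
        rhs = Sigma0_inv @ mu0 + sum(S.T @ (y - S @ b[i])
                                     for i, (S, y) in enumerate(zip(S_list, y_list))) / sigma2
        beta = rng.multivariate_normal(V0 @ rhs, V0)

        # (4.32) random-effects coefficients, one per surface
        for i, (S, y) in enumerate(zip(S_list, y_list)):
            V1 = np.linalg.inv(S.T @ S / sigma2 + np.eye(d) / xi2)
            b[i] = rng.multivariate_normal(V1 @ (S.T @ (y - S @ beta)) / sigma2, V1)

        # (4.33)-(4.34) variances: inverse-gamma draws via 1 / Gamma
        rss = sum(np.sum((y - S @ beta - S @ b[i]) ** 2)
                  for i, (S, y) in enumerate(zip(S_list, y_list)))
        sigma2 = 1.0 / rng.gamma(g0 + n / 2.0, 1.0 / (h0 + rss / 2.0))
        xi2 = 1.0 / rng.gamma(a0 + n / 2.0, 1.0 / (b0 + np.sum(b ** 2) / 2.0))

        draws["beta"].append(beta)
        draws["sigma2"].append(sigma2)
        draws["xi2"].append(xi2)
    return draws
```

Posterior summaries would then be obtained by averaging the stored draws after discarding a burn-in period, as in (4.16).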

4.3.4 Bayesian mixture of spatial spline regressions with mixed-effects

The BSSR model presented previously is dedicated to learning from a single surface or a set of homogeneous surfaces. However, when the data exhibit a grouping aspect, this may be restrictive, and an extension to accommodate clustered data is needed. I therefore integrate the BSSR model into a mixture framework, mainly motivated by a clustering perspective. The resulting model is a Bayesian mixture of spatial spline regressions with mixed-effects (BMSSR) and is described in the following section.

The model

Consider that there are K sub-populations within the n surfaces, that is, within the responses Y = (y_1, ..., y_n) with their corresponding spatial covariates (S_1, ..., S_n). The proposed BMSSR model has the following stochastic representation. Conditional on component k, the individual y_i given S_i is modeled by a BSSR model as:

y_i = S_i(β_k + b_{ik}) + e_{ik}.    (4.35)
Thus, a K-component Bayesian mixture of spatial spline regression models with mixed-effects (BMSSR) has the following density:

f(y_i | S_i; Ψ) = \sum_{k=1}^{K} π_k \, N\big(y_i; S_i(β_k + b_{ik}), σ_k^2 I_{m_i}\big)    (4.36)

where Ψ = (π_1, ..., π_{K−1}, β_1^T, ..., β_K^T, B_1^T, ..., B_K^T, σ_1^2, ..., σ_K^2, ξ_1^2, ..., ξ_K^2)^T is the parameter vector of the model, B_k = (b_{1k}^T, ..., b_{nk}^T)^T being the vector of the random-effect coefficients of the kth BSSR component. The BMSSR model is indeed composed of BSSR components, each of them described by the parameters Ψ_k = (β_k^T, B_k^T, σ_k^2, ξ_k^2)^T and a mixing proportion π_k. Therefore, conditional on the mixture component k, the parameter priors are defined similarly as in the BSSR model (4.30) presented in the previous section. For the BMSSR model, we therefore just need to specify the distribution on the mixing proportions π = (π_1, ..., π_K), which govern the Multinomial distribution in the generative model of the non-Bayesian mixture. I use a conjugate prior as for the other parameters, that is, a Dirichlet prior with hyper-parameters α = (α_1, ..., α_K). The hierarchical prior for the BMSSR model parameters is therefore given by:

π ∼ D(α_1, ..., α_K)
β_k ∼ N(β_k | µ_0, Σ_0)
b_{ik} | ξ_k^2 ∼ N(b_{ik} | 0_d, ξ_k^2 I_d)    (4.37)
ξ_k^2 ∼ IG(ξ_k^2 | a_0, b_0)
σ_k^2 ∼ IG(σ_k^2 | g_0, h_0).


Bayesian inference using Gibbs sampling

Once the model prior is defined, I derive here the full conditional posterior distributions needed for the Gibbs sampler to infer the model parameters. Further mathematical details for these posterior distributions are given in [J-11]. Consider the vector of augmented parameters (π^T, β^T, B^T, σ^{2T}, ξ^{2T})^T, where π = (π_1, ..., π_K)^T, β = (β_1^T, ..., β_K^T)^T, σ^2 = (σ_1^2, ..., σ_K^2)^T, and ξ^2 = (ξ_1^2, ..., ξ_K^2)^T, augmented by the unknown component labels z = (z_1, ..., z_n) and the observed data {S_i, y_i}. Let us also introduce the binary latent component-indicators Z_{ik} such that Z_{ik} = 1 iff Z_i = k, Z_i being the hidden label of the mixture component from which the ith individual is generated. Then, the full conditional distributions are given as follows:

Z_i | ... ∼ M(1; τ_{i1}, ..., τ_{iK}), with τ_{ik} = P(Z_i = k | y_i, S_i; Ψ) = \frac{π_k \, N(y_i | S_i(β_k + b_{ik}), σ_k^2 I_{m_i})}{\sum_{l=1}^{K} π_l \, N(y_i | S_i(β_l + b_{il}), σ_l^2 I_{m_i})}    (4.38)

π | ... ∼ D(α_1 + n_1, ..., α_K + n_K), with n_k = \sum_{i=1}^{n} Z_{ik}    (4.39)

β_k | ... ∼ N(ν_0, V_0), with V_0^{-1} = Σ_0^{-1} + \frac{1}{σ_k^2} \sum_{i=1}^{n} Z_{ik} S_i^T S_i, \quad ν_0 = V_0 \Big( \frac{1}{σ_k^2} \sum_{i=1}^{n} Z_{ik} S_i^T (y_i − S_i b_{ik}) + Σ_0^{-1} µ_0 \Big)    (4.40)

b_{ik} | ... ∼ N(ν_1, V_1), with V_1^{-1} = \frac{1}{σ_k^2} S_i^T S_i + \frac{1}{ξ_k^2} I, \quad ν_1 = V_1 \frac{1}{σ_k^2} S_i^T (y_i − S_i β_k)    (4.41)

σ_k^2 | ... ∼ IG(g_1, h_1), with g_1 = g_0 + \frac{1}{2} \sum_{i=1}^{n} Z_{ik}, \quad h_1 = h_0 + \frac{1}{2} \sum_{i=1}^{n} Z_{ik} (y_i − S_i β_k − S_i b_{ik})^T (y_i − S_i β_k − S_i b_{ik})    (4.42)

ξ_k^2 | ... ∼ IG(a_1, b_1), with a_1 = a_0 + \frac{n}{2}, \quad b_1 = b_0 + \frac{1}{2} \sum_{i=1}^{n} b_{ik}^T b_{ik}    (4.43)

The Gibbs sampler for the BMSSR then cycles by sampling from each of the above posterior distributions until a sufficiently large number of samples is reached.
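As an illustration of the first two steps (4.38)–(4.39) of this sampler, the Python sketch below draws the component labels and the mixing proportions; the remaining conditionals mirror the BSSR updates component by component. The function name and array layouts are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def sample_labels_and_weights(Y, S, beta, b, sigma2, pi, alpha, rng):
    """Sample z_i from (4.38) and pi from (4.39). Illustrative sketch:
    Y, S: lists of responses y_i and designs S_i; beta: list of K fixed-effect
    vectors; b: array of shape (n, K, d) of random effects b_ik; sigma2, pi, alpha:
    length-K arrays."""
    n, K = len(Y), len(pi)
    z = np.zeros(n, dtype=int)
    for i in range(n):
        mi = len(Y[i])
        logp = np.array([
            np.log(pi[k]) + multivariate_normal.logpdf(
                Y[i], mean=S[i] @ (beta[k] + b[i, k]), cov=sigma2[k] * np.eye(mi))
            for k in range(K)])
        tau = np.exp(logp - logp.max())   # posterior membership probabilities
        tau /= tau.sum()
        z[i] = rng.choice(K, p=tau)
    counts = np.bincount(z, minlength=K)
    pi = rng.dirichlet(alpha + counts)    # Dirichlet posterior (4.39)
    return z, pi
```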

The label switching problem

Here I open a parenthesis to discuss a well-known problem encountered in Bayesian inference of mixtures, namely label switching. Statistical inference is meaningful only if identifiability is established. The estimation of Ψ is therefore meaningful if the model f(.|Ψ) is identifiable, that is, when f(y_i|Ψ) = f(y_i|Ψ*) if and only if Ψ = Ψ*. It is well known that mixture models are not identifiable in the strict sense, but a weak identifiability can be established for them, that is, identifiability up to a permutation. As discussed for example in (McLachlan and Peel., 2000, Section 1.14), this problem is not of concern in maximum likelihood fitting of mixtures via the EM algorithm. However, identifiability in mixtures is of concern in the Bayesian framework, where in the posterior simulation the mixture component labels can be interchanged from one sample to another. This problem is known as the label-switching problem. Different strategies were proposed in the literature to deal with this problem. One simple way to deal with label switching is to impose constraints on the model parameters to force a unique labeling in the MCMC sampling, and hence ensure identifiability. For example, one may use ordering constraints on the parameters as in Richardson and Green (1997) for the case of univariate Gaussian mixtures, e.g., constraints on the means, the variances, or the mixing proportions. This was also discussed in Marin et al. (2005). However, Celeux (1999); Celeux et al. (2000) showed that this strategy of forcing constraints on the model parameters is not efficient and, when it works, does not scale to higher dimensions. Another approach is to post-process the posterior parameter samples by searching for the label permutation that minimizes some loss function, as in Stephens (2000b). As discussed in Celeux (1999) and Celeux et al. (2000), while this procedure works well, it can be numerically demanding, as it is an offline algorithm that requires storing a significant amount of posterior samples, and it is also restricted to the limited framework of Bayesian analysis of latent structure models with conjugate prior distributions. Celeux (1999); Celeux et al. (2000) proposed a better solution in the same spirit as the one of Stephens, which consists of a sequential k-means-like algorithm to cluster the posterior samples and which has several advantages: it is quite simple, not specific to Bayesian analysis with conjugate prior distributions or to the mixture context, and not numerically demanding. So what is suggested here is to relabel the obtained posterior parameter samples, when label switching happens, by the k-means-like algorithm of Celeux (1999); Celeux et al. (2000).
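Below is a minimal Python sketch of such a relabeling step: each posterior sample of the component means is permuted to best match a running reference. It is a simplified permutation-matching variant written in the spirit of the sequential k-means-like strategy of Celeux (1999), not the exact published algorithm, and the array layout is an assumption (practical only for small K, since it enumerates permutations).

```python
import numpy as np
from itertools import permutations

def relabel_samples(mu_samples):
    """Relabel posterior mean samples (array of shape T x K x d): each sample is
    permuted to best match a running reference average, which removes label
    switching for well-separated components."""
    T, K, d = mu_samples.shape
    ref = mu_samples[0].copy()
    out = np.empty_like(mu_samples)
    for t in range(T):
        best = min(permutations(range(K)),
                   key=lambda p: np.sum((mu_samples[t][list(p)] - ref) ** 2))
        out[t] = mu_samples[t][list(best)]
        ref = out[: t + 1].mean(axis=0)   # running average as reference
    return out
```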


Model-based surface clustering using the BMSSR

In addition to Bayesian density estimation, the previously presented BMSSR model can also be used for Bayesian model-based surface clustering to provide a partition of the data into K clusters. Model-based clustering using the BMSSR model consists in assuming that the observed data {S_i, y_i}_{i=1}^{n} are generated from a K-component mixture of spatial spline regressions with mixed-effects with parameter vector Ψ. The mixture components can be interpreted as clusters and hence each cluster can be associated with a mixture component. The problem of clustering therefore becomes the one of estimating the BMSSR parameters Ψ. This is performed here by Gibbs sampling, which provides a MAP estimator Ψ̂_MAP obtained by averaging the Gibbs posterior samples after removing some initial samples corresponding to a burn-in period. A partition of the data can then be obtained from the posterior memberships by applying Bayes' optimal allocation rule, that is, by maximizing the posterior component probabilities to assign each surface to a component (cluster): ẑ_i = arg max_{k=1,...,K} τ_{ik}(Ψ̂_MAP), where ẑ_i represents the estimated cluster label for the ith surface.
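In code, this allocation rule is a one-liner over the posterior membership matrix; the tiny Python snippet below is illustrative, with a made-up τ matrix.

```python
import numpy as np

# tau: n x K matrix of posterior membership probabilities tau_ik(Psi_MAP)
tau = np.array([[0.10, 0.70, 0.20],
                [0.55, 0.30, 0.15]])
z_hat = tau.argmax(axis=1)   # Bayes' optimal allocation: one cluster label per surface
print(z_hat)                 # [1 0]
```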

4.3.5 Experiments

The two proposed Bayesian models were evaluated in [J-11] on simulated surfaces and on real surfaces issued from a handwritten character recognition problem, by considering real images from the MNIST data set (LeCun et al., 1998), in order to test them in terms of surface approximation and clustering. I first considered bi-dimensional arbitrary non-linear functions and attempted to approximate them from a sample of simulated noisy surfaces generated on a square domain, in order to test the model in terms of surface approximation. The simulated data include mixed effects. The fitted mean surface, using a reasonable number of basis functions, is very close to the true one. This is confirmed by the small value (0.0865) of the empirical sum of squared errors (SSE) between the true surface and the fitted one. Figure 4.3 shows the true arbitrary mean function before the noise and the random effects are added, an example of a simulated surface, and the fitted mean surface µ̂(x) = S_i β̂ obtained from a set of 100 surfaces with d = 15 × 15 NBFs.


Figure 4.3: True mean surface (left), an example of noisy surface (middle), and a BSSR fit from 100 surfaces using 15 × 15 NBFs (right).

Then, the second model, the BMSSR, was applied to a subset of the ZIPcode data set (Hastie et al., 2010), which is issued from the MNIST data set. The data set contains 9298 16-by-16-pixel gray-scale images of Hindu-Arabic handwritten numerals. Each individual y_i contains m_i = 256 observations, and the Gibbs sampler is run with different numbers of clusters on a subset of 1000 digits randomly chosen from the ZIPcode testing set. The best solution is selected in terms of the Adjusted Rand Index (ARI) values, which promotes a partition with K = 12 clusters. The cluster means for the partition obtained by the proposed Bayesian model (BMSSR) clearly show that the model is able to recover the ten digits and, not surprisingly, has revealed subgroups for some digits (0 and 5). Figure 4.4 shows the cluster means for the clusters obtained by the proposed Bayesian model (BMSSR).


Figure 4.4: Cluster mean images obtained by the proposed BMSSR model on an MNIST set with 12 mixture components.

4.3.6 Conclusion

In this section I first presented a probabilistic Bayesian model for homogeneous spatial data based on spatial spline regression with mixed-effects (BSSR). The model is able to accommodate individuals with both fixed and random effect variability. Then, motivated by a model-based surface clustering perspective, I introduced the Bayesian mixture of spatial spline regressions with mixed-effects (BMSSR) for spatial functional data dispersed into groups. I derived Gibbs samplers to infer the models. An application on simulated surfaces illustrates surface approximation using the BSSR model, and an application on real data in a handwritten digit recognition framework shows the potential benefit of the proposed BMSSR model for Bayesian surface clustering. The BMSSR can be extended to supervised surface classification. This can be performed without difficulty by modeling each class by a BMSSR model and then applying the Bayes rule to assign a new unlabeled surface to the class corresponding to the highest posterior probability. Future work might also concern the assessment of the performance of the Bayesian mixture model in the case where the data (e.g., the handwritten character images) are sparsely sampled, by introducing missing data as in Nguyen et al. (2014). Since the BMSSR is a latent (missing) data model, it can be applied directly without data imputation, unlike other competitors. Another interesting perspective is to derive a Bayesian non-parametric model by relying on Dirichlet Process mixture models, where the number of mixture components can be directly inferred from the data.

Chapter 5

Bayesian non-parametric parsimonious mixtures for multivariate data

Contents
5.1 Introduction . . . . . . 65
    5.1.1 Personal contribution . . . . . . 67
5.2 Finite mixture model model-based clustering . . . . . . 67
    5.2.1 Bayesian model-based clustering . . . . . . 68
    5.2.2 Parsimonious Gaussian mixture models . . . . . . 68
5.3 Dirichlet Process Parsimonious Mixtures . . . . . . 69
    5.3.1 Dirichlet Process Parsimonious Mixtures . . . . . . 69
    5.3.2 Chinese Restaurant Process parsimonious mixtures . . . . . . 71
    5.3.3 Bayesian inference via Gibbs sampling . . . . . . 72
    5.3.4 Bayesian model comparison via Bayes factors . . . . . . 73
    5.3.5 Experiments . . . . . . 74
5.4 Conclusion . . . . . . 77

Related PhD thesis:
[1] M. Bartcus. Bayesian non-parametric parsimonious mixtures for model-based clustering. Ph.D. thesis, Université de Toulon, Laboratoire des Sciences de l'Information et des Systèmes (LSIS), October 2015

Related journal papers:
[1] F. Chamroukhi, M. Bartcus, and H. Glotin. Dirichlet Process Parsimonious Gaussian Mixture for clustering. arXiv:1501.03347, 2015. URL http://arxiv.org/pdf/1501.03347.pdf. Submitted
[2] F. Chamroukhi et al. Bayesian non-parametric models for unsupervised decomposition of whale songs. Journal of Acoustical Society of America, 2015. In preparation

Related conference papers:
[1] M. Bartcus, F. Chamroukhi, and H. Glotin. Hierarchical Dirichlet Process Hidden Markov Model for Unsupervised Bioacoustic Analysis. In The International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, July 2015
[2] F. Chamroukhi, M. Bartcus, and H. Glotin. Bayesian non-parametric parsimonious Gaussian mixture for clustering. In Proceedings of 22nd International Conference on Pattern Recognition (ICPR), Stockholm, August 2014a

[3] F. Chamroukhi, M. Bartcus, and H. Glotin. Bayesian non-parametric parsimonious clustering. In Proceedings of 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges, Belgium, April 2014b
[4] M. Bartcus and F. Chamroukhi. Hierarchical Dirichlet Process Hidden Markov Model for unsupervised learning from bioacoustic data. In Proceedings of the uLearnBio workshop of the International Conference on Machine Learning (ICML), Beijing, June 2014
[5] M. Bartcus, F. Chamroukhi, and H. Glotin. Unsupervised whale song decomposition with Bayesian non-parametric Gaussian mixture. In Proceedings of the NIPS4B workshop, Neural Information Processing Systems (NIPS), pages 205–211, Nevada, USA, 2013
[6] M. Bartcus, F. Chamroukhi, and H. Glotin. Clustering Bayésien Parcimonieux Non-Paramétrique. In Extraction et Gestion des Connaissances (EGC), Atelier CluCo : Clustering et Co-clustering, pages 3–13, Rennes, France, Jan 2014

I initiated this research direction in 2012, with the beginning of the PhD thesis of Marius Bartcus, for whom I was the principal supervisor. In this research I investigate mixture models for multivariate data in a fully Bayesian framework. It is structured into two parts. The first one corresponds to the investigation of what can be called parametric Bayesian mixtures and their inference using mainly Bayesian sampling, with a particular focus on the finite parsimonious mixtures, which offer a great modeling flexibility. The second one addresses the problem from a non-parametric perspective by investigating the Dirichlet process mixture derivation for Bayesian mixtures, which can be interpreted as an infinite mixture model, with in particular the derivation of new Dirichlet process parsimonious mixtures. This research has led, to this day, to the following publications: [J-10] [C-1] [C-2] [C-4] [C-3] [C-6] [C-5], and an application paper [J-17] is in preparation for submission to a specialized journal.

5.1 Introduction

In this axis, I consider the problem of Bayesian inference for fitting multivariate Gaussian mixtures. The framework of Bayesian inference was already introduced in the second part of the previous chapter, dedicated to the Bayesian models for spatial functional data. In this chapter, I revisit the classical problem of fitting Gaussian mixtures to multivariate data and focus on the parsimonious mixtures, which are promoted to fit flexible structures to high-dimensional data and can be considered as a dimensionality reduction method. The angle of approach compared to the previous chapter is different though, since here I will work mainly within the Bayesian non-parametric framework, where the number of mixture components is unbounded, that is, by considering infinite mixture modeling using Dirichlet Process mixture models or, equivalently, Chinese Restaurant Process mixtures. The considered application is clustering, which is one of the essential tasks in statistics and machine learning. Model-based clustering, that is, the clustering approach based on the parametric finite mixture model (McLachlan and Peel., 2000), is one of the most popular and successful approaches in cluster analysis (McLachlan and Basford, 1988; Banfield and Raftery, 1993; Fraley and Raftery, 2002). The finite mixture model decomposes the density of the observed data as a weighted sum of a finite number K of component densities. Most often, the model used for multivariate real data is the finite Gaussian mixture model (GMM), in which each mixture component is Gaussian. This chapter will focus on Gaussian mixture modeling for multivariate real data. In Banfield and Raftery (1993) and Celeux and Govaert (1995), the authors developed a parsimonious GMM clustering approach by exploiting an eigenvalue decomposition of the group covariance matrices of the GMM components, which provides a wide range of very flexible models with different clustering criteria. It was also demonstrated in Fraley and Raftery (2002) that the parsimonious mixture model-based clustering framework provides very good results in density estimation as well as in cluster and discriminant analyses. In model-based clustering using GMMs, the parameters of the Gaussian mixture are usually estimated in a maximum likelihood estimation (MLE) framework by maximizing the observed data likelihood. This is usually performed by the EM algorithm (Dempster et al., 1977; McLachlan and Krishnan, 2008) or EM extensions (McLachlan and Krishnan, 2008). The parameters of the parsimonious Gaussian mixture models can also be estimated in a MLE framework by using the EM algorithm (Celeux and Govaert, 1995). However, a possible issue in the MLE approach using the EM algorithm for normal mixtures is that it may fail due to singularities or degeneracies, as highlighted in Stephens (1997); Snoussi and Mohammad-Djafari (2001, 2005); Fraley and Raftery (2005) and Fraley and Raftery (2007). Bayesian estimation methods for mixture models have led to intensive research in the field for dealing with the problems encountered in MLE for mixtures (Diebolt and Robert, 1994; Escobar and West, 1994; Robert, 2007; Richardson and Green, 1997; Stephens, 1997; Bensmail et al., 1997; Bensmail and Meulman, 2003; Marin et al., 2005; Gelman et al., 2003); these rely on a Bayesian formulation of the mixture model. They make it possible to avoid these problems by replacing the MLE by the maximum a posteriori (MAP) estimator.
This is achieved by introducing a regularization over the model parameters via prior parameter distributions, which are assumed to be uniform in the case of MLE. The MAP estimation for the Bayesian Gaussian mixture is performed by maximizing the posterior parameter distribution. This can be performed, in some situations, by an EM-MAP scheme as in Fraley and Raftery (2005) and Fraley and Raftery (2007), where the authors proposed an EM algorithm for estimating Bayesian parsimonious Gaussian mixtures. However, the common estimation approach in the case of Bayesian mixtures is still the one based on Bayesian sampling such as Markov Chain Monte Carlo (MCMC), namely Gibbs sampling (Diebolt and Robert, 1994; Stephens, 1997; Bensmail et al., 1997) when the number of mixture components K is known, or reversible jump MCMC, introduced by Green (1995), as in Richardson and Green (1997) and Stephens (1997), when it is unknown.

The principle of Bayesian inference using MCMC was described in Section 4.3.1. The flexible eigenvalue decomposition of the group covariance matrix described previously was also exploited in Bayesian parsimonious model-based clustering by Bensmail et al. (1997); Bensmail and Meulman (2003), where the authors used a Gibbs sampler for the model inference. For these model-based clustering approaches, the number of mixture components is usually assumed to be known. Another issue in the finite mixture model-based clustering approach, with the MLE approach as well as the MAP approach, is therefore the one of selecting the optimal number of mixture components, that is, the problem of model selection. Model selection is in general performed through a two-fold strategy by selecting the best model from pre-established inferred model candidates. For the MLE approach, the choice of the optimal number of mixture components can be performed via penalized log-likelihood criteria such as the Bayesian Information Criterion (BIC) (Schwarz, 1978), the Akaike Information Criterion (AIC) (Akaike, 1974), the Approximate Weight of Evidence (AWE) criterion (Banfield and Raftery, 1993), or the Integrated Classification Likelihood criterion (ICL) (Biernacki et al., 2000), etc. For the MAP approach, this can still be performed via modified penalized log-likelihood criteria, such as a modified version of BIC as in (Fraley and Raftery, 2007) computed for the posterior mode, and more generally via Bayes factors (Kass and Raftery, 1995), as in Bensmail et al. (1997) for parsimonious mixtures. Bayes factors are indeed the natural criterion for model selection and comparison in the Bayesian framework, and criteria such as BIC, AWE, etc. are in fact approximations of them. There are also Bayesian extensions for mixture models that analyze mixtures with an unknown number of components, for example, as mentioned before, the one of Richardson and Green (1997) using RJMCMC and the one of Stephens (2000a, 1997) using the birth and death process. They are referred to as fully Bayesian mixture models (Richardson and Green, 1997), as they consider the number of mixture components as a parameter to be inferred from the data, jointly with the mixture model parameters, based on the posterior distributions. However, these standard finite mixture models, both non-Bayesian and Bayesian, are parametric and may not be well adapted in the case of unknown and complex data structure. Recently, the Bayesian non-parametric (BNP) formulation of mixture models, which goes back to Ferguson (1973) and Antoniak (1974), has attracted much attention as a nonparametric alternative for formulating mixtures. BNP methods (Robert, 2007; Hjort et al., 2010) have indeed recently become popular due to their flexible modeling capabilities and advances in inference techniques, in particular for mixture models, namely by using MCMC sampling techniques (Neal, 2000; Rasmussen, 2000) or variational inference (Blei and Jordan, 2006).
BNP methods for clustering, including Dirichlet Process Mixtures (DPM) and Chinese Restaurant Process (CRP) mixtures (Ferguson, 1973; Antoniak, 1974; Pitman, 1995; Wood and Black, 2008; Samuel and Blei, 2012), which can be represented as infinite Gaussian mixture models as in Rasmussen (2000), provide a principled way to overcome the issues in standard model-based clustering and classical Bayesian mixtures for clustering. They are fully Bayesian approaches that offer a principled alternative to jointly infer the number of mixture components (i.e., clusters) and the mixture parameters from the data. By using general processes as priors, they make it possible to avoid the problem of singularities and degeneracies of the MLE, and to simultaneously infer the optimal number of clusters from the data in a one-fold scheme, rather than in a two-fold approach as in standard model-based clustering. They also avoid assuming restricted functional forms and thus allow the complexity and accuracy of the inferred models to grow as more data are observed. They also represent a good alternative to the difficult problem of model selection in parametric mixture models. Note that the term non-parametric does not mean that there are no parameters; it rather means that one has more and more parameters as more data are observed. In this chapter, I present a new BNP formulation of the Gaussian mixture with the eigenvalue decomposition of the group covariance matrix of each Gaussian component, which has proven its flexibility in cluster analysis for the parametric case (Banfield and Raftery, 1993; Celeux and Govaert, 1995; Fraley and Raftery, 2002; Bensmail et al., 1997). We develop new Dirichlet Process mixture models with parsimonious covariance structure, which results in Dirichlet Process Parsimonious Mixtures (DPPM). They represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPM models are Bayesian parsimonious mixture models with a Dirichlet Process prior and thus provide a principled way to overcome the issues encountered in the parametric Bayesian and non-Bayesian cases, and allow to automatically and simultaneously infer the model parameters and the optimal model structure from the data, from different models, going from the simplest spherical ones to the more complex standard general one.

We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the various models and provide a unifying framework for model selection and model comparison, namely by using Bayes factors, to simultaneously select the optimal number of mixture components and the best parsimonious mixture structure. The proposed DPPMs are more flexible in terms of modeling and use in clustering, and automatically infer the number of clusters from the data.

5.1.1 Personal contribution

My contribution in this direction is two-fold. The first part consists in investigating Bayesian mixtures and their inference using mainly MCMC sampling, with a particular focus on the finite parsimonious mixtures, which offer great modeling flexibility. The second part addresses the problem from a non-parametric perspective by investigating Dirichlet process mixtures. I developed a Bayesian non-parametric formulation for the parsimonious mixture models. By relying on Dirichlet Process mixtures, or, by equivalence, Chinese Restaurant Process mixtures, I introduced Dirichlet Process Parsimonious Mixture models (DPPMs), which provide a flexible framework for modeling different data structures as well as a good alternative to tackle the problem of model selection. I derived a Gibbs sampler to infer the models and used Bayes factors for Bayesian model comparison. Applications and comparisons on several data sets highlight the effectiveness of the proposed nonparametric parsimonious mixture models as a good nonparametric alternative to the parametric parsimonious models. The models have also shown very encouraging performance in a challenging unsupervised bioacoustic signal decomposition application.

This chapter is organized as follows. Section 5.2 describes and discusses previous work on model-based clustering. Then, Section 5.3 presents the proposed models and the learning technique. In Section 5.3.5, we give experimental results to evaluate the proposed models on simulated data and real data. Finally, Section 5.4 is devoted to a discussion and concluding remarks.

5.2 Finite mixture model model-based clustering

Let X = (x_1, ..., x_n) be a sample of n i.i.d. observations in R^d, and let z = (z_1, ..., z_n) be the corresponding unknown cluster labels, where z_i ∈ {1, ..., K} represents the cluster label of the ith data point x_i, K being the possibly unknown number of clusters. Parametric Gaussian clustering, also called model-based clustering (McLachlan and Basford, 1988; Fraley and Raftery, 2002), is based on the finite GMM (McLachlan and Peel., 2000), in which the probability density function of the data is given by:

p(x_i|θ) = \sum_{k=1}^{K} π_k \, \mathcal{N}(x_i|θ_k)    (5.1)

where the π_k's are the mixing proportions, θ_k = (µ_k, Σ_k) are respectively the mean vector and the covariance matrix of the kth Gaussian component density, and θ = (π_1, ..., π_{K−1}, µ_1^T, ..., µ_K^T, vech(Σ_1)^T, ..., vech(Σ_K)^T)^T is the GMM parameter vector. From a generative point of view, the generative process of the data for the finite mixture model can be stated as follows. First, a mixture component z_i is sampled independently from a Multinomial distribution given the mixing proportions π = (π_1, ..., π_K). Then, given the mixture component z_i = k and the corresponding parameters θ_k, the individual x_i is generated independently from a Gaussian with parameters θ_k, that is:

zi ∼ M(π) (5.2)

x_i|θ_{z_i} ∼ \mathcal{N}(x_i|θ_{z_i}).    (5.3)

The mixture model parameter vector θ is usually estimated in a Maximum Likelihood Estimation (MLE) framework by maximizing the observed-data likelihood (5.4):

L(θ|X) = \prod_{i=1}^{n} \sum_{k=1}^{K} π_k \, \mathcal{N}(x_i|θ_k)    (5.4)

via the EM algorithm (Dempster et al., 1977; McLachlan and Krishnan, 2008) or EM extensions (McLachlan and Krishnan, 2008).
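As a small illustration of (5.4), the following Python sketch evaluates the observed-data log-likelihood of a GMM using the log-sum-exp trick for numerical stability; the function name and interface are my own.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_log_likelihood(X, pis, mus, Sigmas):
    """Observed-data log-likelihood (5.4) of a finite GMM, computed on the log
    scale. X: n x d data matrix; pis, mus, Sigmas: component parameters."""
    log_comp = np.stack([
        np.log(pi_k) + multivariate_normal.logpdf(X, mean=mu_k, cov=Sigma_k)
        for pi_k, mu_k, Sigma_k in zip(pis, mus, Sigmas)], axis=1)   # n x K
    m = log_comp.max(axis=1, keepdims=True)
    return float(np.sum(m.squeeze() + np.log(np.exp(log_comp - m).sum(axis=1))))
```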

5.2.1 Bayesian model-based clustering

As mentioned in the introduction, the MLE approach using the EM algorithm for normal mixtures may fail in some situations due to singularities or degeneracies (Stephens, 1997; Fraley and Raftery, 2005, 2007). The Bayesian approach to mixture models avoids the problems associated with the MLE via a MAP estimation framework, by maximizing the posterior parameter distribution

p(θ|X) ∝ p(θ) L(θ|X),    (5.5)

p(θ) being a chosen prior distribution over the model parameters θ. The prior distribution in general takes the following form for the GMM:

p(θ) = p(π|α) \, p(µ|Σ, µ_0, κ_0) \, p(Σ|µ, Λ_0, ν_0) = p(π|α) \prod_{k=1}^{K} p(µ_k|Σ_k) \, p(Σ_k),

where (α, µ_0, κ_0, Λ_0, ν_0) are hyperparameters. A common choice for the GMM is to assume conjugate priors, that is, a Dirichlet distribution for the mixing proportions as in Richardson and Green (1997) and Ormoneit and Tresp (1998), and a multivariate normal Inverse-Wishart prior distribution for the Gaussian parameters, that is, a multivariate normal for the means and an Inverse-Wishart for the covariance matrices, for example as in Bensmail et al. (1997), Fraley and Raftery (2005) and Fraley and Raftery (2007). From a generative point of view, to generate data from the Bayesian GMM, a first step is to sample the model parameters from the prior, that is, to sample the mixing proportions from their conjugate Dirichlet prior distribution, and the mean vectors and the covariance matrices of the Gaussian components from the corresponding conjugate multivariate normal Inverse-Wishart prior. Then, the generative procedure remains the same as in the previously described generative process for the non-Bayesian finite mixture, and is summarized by the following steps:

π|α ∼ D(α)

zi|π ∼ M(π) (5.6)

θzi |G0 ∼ G0

x_i|θ_{z_i} ∼ \mathcal{N}(x_i|θ_{z_i})

where α are the hyperparameters of the Dirichlet prior distribution, and G_0 is a prior distribution for the parameters of the Gaussian component, that is, a multivariate Normal Inverse-Wishart:

Σ_k ∼ \mathcal{IW}(ν_0, Λ_0)    (5.7)
µ_k|Σ_k ∼ \mathcal{N}(µ_0, Σ_k/κ_0)    (5.8)

where IW stands for the Inverse-Wishart distribution. The parameters θ of the Bayesian Gaussian mixture are estimated by MAP estimation, by maximizing the posterior parameter distribution (5.5). The MAP estimation can still be performed by EM, namely in the case of conjugate priors where the prior distribution is only considered for the parameters of the Gaussian components, as in Fraley and Raftery (2005) and Fraley and Raftery (2007). However, in general, the common estimation approach for the Bayesian GMM described above is the one using Bayesian sampling, such as MCMC sampling techniques, namely the Gibbs sampler (Geyer, 1991; Neal, 1993; Diebolt and Robert, 1994; Bensmail et al., 1997; Ormoneit and Tresp, 1998; Stephens, 1997).
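A draw from this conjugate Normal-Inverse-Wishart prior can be sketched in a few lines of Python; the hyper-parameter values below are illustrative only.

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(1)

def sample_niw(mu0, kappa0, Lambda0, nu0):
    # One draw of (mu_k, Sigma_k) from the Normal-Inverse-Wishart prior (5.7)-(5.8):
    # Sigma_k ~ IW(nu0, Lambda0), then mu_k | Sigma_k ~ N(mu0, Sigma_k / kappa0).
    Sigma = invwishart.rvs(df=nu0, scale=Lambda0)
    mu = rng.multivariate_normal(mu0, Sigma / kappa0)
    return mu, Sigma

# Illustrative hyper-parameter values for d = 2
mu, Sigma = sample_niw(mu0=np.zeros(2), kappa0=1.0, Lambda0=np.eye(2), nu0=4.0)
```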

5.2.2 Parsimonious Gaussian mixture models

The GMM clustering has been extended to parsimonious GMM clustering (Banfield and Raftery, 1993; Celeux and Govaert, 1995) by exploiting an eigenvalue decomposition of the group covariance matrices,

which provides a wide range of very flexible models with different clustering criteria. In these parsimonious models, the group covariance matrix Σk for each cluster k is decomposed as

Σ_k = λ_k D_k A_k D_k^T    (5.9)

where λ_k = |Σ_k|^{1/d}, D_k is an orthogonal matrix of eigenvectors of Σ_k, and A_k is a diagonal matrix with determinant 1 whose diagonal elements are the normalized eigenvalues of Σ_k in decreasing order. As described in Celeux and Govaert (1995), the scalar λ_k determines the volume of cluster k, D_k its orientation and A_k its shape. Thus, this decomposition leads to several flexible models going from the simplest spherical models to the complex general one and hence is adapted to various clustering situations. The parameters θ of the parsimonious Gaussian mixture models are estimated in a MLE framework by using the EM algorithm. The details of the EM algorithm for the different parsimonious finite GMMs are given in Celeux and Govaert (1995). The parsimonious GMMs have also attracted much attention from the Bayesian perspective. For example, in Bensmail et al. (1997), the authors proposed a fully Bayesian formulation for inferring the previously described parsimonious finite Gaussian mixture models. This Bayesian formulation was applied in model-based cluster analysis (Bensmail et al., 1997; Bensmail and Meulman, 2003). The model inference in this Bayesian formulation is performed in a MAP estimation framework by using MCMC sampling techniques, see for example (Bensmail et al., 1997; Bensmail and Meulman, 2003). Another Bayesian regularization for the parsimonious GMM was proposed by Fraley and Raftery (2005, 2007), in which the maximization of the posterior can still be performed by the EM algorithm in the MAP framework (EM-MAP). Here we consider the parsimonious GMMs (PGMMs) mainly in a Bayesian non-parametric framework, as we will see in what follows, instead of in a finite (Bayesian) mixture. As we will see, this helps in particular to tackle the problem of model selection from a non-parametric perspective.
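The decomposition (5.9) is easy to compute and verify numerically; the short Python sketch below extracts the volume, orientation and shape terms from a covariance matrix (the function name is mine).

```python
import numpy as np

def volume_shape_orientation(Sigma):
    """Decompose a covariance matrix as Sigma = lambda * D A D^T (Eq. 5.9):
    lambda = |Sigma|^(1/d) (volume), D orthogonal eigenvectors (orientation),
    A diagonal with determinant 1 (shape), eigenvalues in decreasing order."""
    d = Sigma.shape[0]
    eigvals, D = np.linalg.eigh(Sigma)
    order = np.argsort(eigvals)[::-1]          # decreasing order, as in the text
    eigvals, D = eigvals[order], D[:, order]
    lam = np.linalg.det(Sigma) ** (1.0 / d)
    A = np.diag(eigvals / lam)                 # normalized eigenvalues, det(A) = 1
    return lam, D, A

Sigma = np.array([[4.0, 1.0], [1.0, 2.0]])
lam, D, A = volume_shape_orientation(Sigma)
assert np.allclose(lam * D @ A @ D.T, Sigma)   # reconstruction check
```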

Model selection in finite mixture models

Finite mixture model-based clustering requires specifying the number of mixture components (i.e., clusters) and, in the case of parsimonious models, the type of model. The main issues in this parametric model are therefore those of selecting the number of mixture components (clusters), and possibly the type of model, that best fit the data. This problem can be tackled by penalized log-likelihood criteria such as BIC (Schwarz, 1978), penalized classification log-likelihood criteria such as AWE (Banfield and Raftery, 1993) or ICL (Biernacki et al., 2000), etc., or more generally by using Bayes factors (Kass and Raftery, 1995), which provide a general way to select and compare models in (Bayesian) statistical modeling, namely in Bayesian mixture models.

5.3 Dirichlet Process Parsimonious Mixtures

The Bayesian and non-Bayesian finite mixture models described previously are however in general parametric and may not be well adapted to represent complex and realistic data sets. Recently, Bayesian non-parametric (BNP) mixtures, in particular the Dirichlet Process Mixture (DPM) (Ferguson, 1973; Antoniak, 1974; Wood and Black, 2008; Samuel and Blei, 2012) or, by equivalence, the Chinese Restaurant Process (CRP) mixture (Aldous, 1985; Pitman, 2002; Samuel and Blei, 2012), which can be seen as an infinite mixture model (Rasmussen, 2000), provide a principled way to overcome the issues in standard model-based clustering and classical Bayesian mixtures for clustering. They are fully Bayesian approaches and offer a principled alternative to jointly infer the number of mixture components (i.e., clusters) and the mixture parameters from the data. BNP mixture approaches for clustering assume a general process as a prior over the infinitely many possible partitions, which is less restrictive than classical Bayesian inference. Such a prior can be a Dirichlet Process (Ferguson, 1973; Antoniak, 1974; Samuel and Blei, 2012) or, by equivalence, a Chinese Restaurant Process (Pitman, 2002; Samuel and Blei, 2012). In the next section, we rely on the Dirichlet Process Mixture (DPM) formulation to derive the proposed Bayesian non-parametric formulation of the parsimonious models.

5.3.1 Dirichlet Process Parsimonious Mixtures

A Dirichlet Process (DP) (Ferguson, 1973) is a distribution over distributions and has two parameters, the concentration parameter α_0 > 0 and the base measure G_0. We denote it by DP(α, G_0). Assume there is a parameter θ̃_i following a distribution G, that is, θ̃_i|G ∼ G. Modeling with a DP means that we


assume that the prior over G is a DP, that is, G is itself generated from a DP: G ∼ DP(α, G_0). This can be summarized by the following generative process:

\tilde{θ}_i | G ∼ G, \quad ∀ i ∈ {1, ..., n}    (5.10)

G|α, G0 ∼ DP(α, G0)· (5.11)

The DP has two properties (Ferguson, 1973). First, random distributions drawn from a DP, that is G ∼ DP(α, G_0), are discrete. Thus, there is a strictly positive probability of multiple observations taking identical values within the set (θ̃_1, ..., θ̃_n). Suppose we have a random distribution G drawn from a DP, followed by repeated draws (θ̃_1, ..., θ̃_n) from that random distribution. Blackwell and MacQueen (1973) introduced a Pólya urn representation of the joint distribution of the random variables (θ̃_1, ..., θ̃_n), that is

p(\tilde{θ}_1, ..., \tilde{θ}_n) = p(\tilde{θ}_1) \, p(\tilde{θ}_2|\tilde{θ}_1) \, p(\tilde{θ}_3|\tilde{θ}_1, \tilde{θ}_2) \cdots p(\tilde{θ}_n|\tilde{θ}_1, \tilde{θ}_2, ..., \tilde{θ}_{n-1}),    (5.12)

which is obtained by marginalizing out the underlying random measure G:

p(\tilde{θ}_1, ..., \tilde{θ}_n | α, G_0) = \int \Big( \prod_{i=1}^{n} p(\tilde{θ}_i | G) \Big) \, dp(G | α, G_0)    (5.13)

and results in the following Pólya urn representation for the calculation of the predictive terms of the joint distribution (5.12):

\tilde{θ}_i | \tilde{θ}_1, ..., \tilde{θ}_{i-1} ∼ \frac{α_0}{α_0 + i - 1} G_0 + \sum_{j=1}^{i-1} \frac{1}{α_0 + i - 1} δ_{\tilde{θ}_j}    (5.14)

∼ \frac{α_0}{α_0 + i - 1} G_0 + \sum_{k=1}^{K_{i-1}} \frac{n_k}{α_0 + i - 1} δ_{θ_k}    (5.15)

where K_{i−1} is the number of clusters after i − 1 samples and n_k denotes the number of times each of the parameters {θ_k}_{k=1}^{∞} occurred in the set {θ̃_i}_{i=1}^{n}. The DP therefore places its probability mass on a countably infinite collection of points, also called atoms, that is, an infinite mixture of Dirac deltas (Ferguson, 1973; Sethuraman, 1994; Samuel and Blei, 2012):

G = \sum_{k=1}^{∞} π_k δ_{θ_k}, \qquad θ_k | G_0 ∼ G_0, \; k = 1, 2, ...    (5.16)

where π_k represents the probability assigned to the kth atom, the set of probabilities satisfies \sum_{k=1}^{∞} π_k = 1, and θ_k is the location or value of that atom. These atoms are drawn independently from the base measure G_0. Hence, according to the DP, the generated parameters θ̃_i exhibit a clustering property, that is, they share repeated values with positive probability, where the unique values of θ̃_i shared among the variables are independent draws from the base distribution G_0 (Ferguson, 1973; Samuel and Blei, 2012). The Dirichlet process therefore provides a very interesting approach from a clustering perspective when we do not have a fixed number of clusters, in other words an infinite mixture, say K tends to infinity. Consider a set of observations (x_1, ..., x_n) to be clustered. Clustering with the DP adds a third step to the DP (5.11): we assume that the random variables x_i, given the distribution parameters θ̃_i which are generated from a DP, are generated from a conditional distribution f(·|θ̃_i). This is the DP mixture (DPM) model (Antoniak, 1974; Escobar, 1994; Wood and Black, 2008; Samuel and Blei, 2012). The generative process of the DP mixture (DPM) is therefore as follows:

G|α, G_0 ∼ DP(α, G_0)    (5.17)
\tilde{θ}_i | G ∼ G    (5.18)
x_i | \tilde{θ}_i ∼ f(x_i | \tilde{θ}_i)    (5.19)

where f(x_i|θ̃_i) is a cluster-specific density, for example a multivariate Gaussian density in the case of a DP multivariate Gaussian mixture, in which θ̃_i is composed of a mean vector and a covariance matrix.


In that case, the base measure G_0 corresponds to the prior parameter distribution, which may be a multivariate normal Inverse-Wishart conjugate prior. When K tends to infinity, it can be shown that the finite mixture model (5.1)–(5.6) converges to a Dirichlet process mixture model (Ishwaren and Zarepour, 2002; Neal, 2000; Rasmussen, 2000). The Dirichlet process has a number of properties which make inference based on this nonparametric prior computationally tractable. It also has an interpretation in terms of the CRP mixture (Pitman, 2002; Samuel and Blei, 2012), which explicitly shows its suitability to clustering thanks to the integration of the hidden component labels z_i in the generative process. Indeed, the second property of the DP, that is, the fact that random parameters drawn from a DP share identical values and thus exhibit a clustering property, connects the DP to the CRP. Consider a random distribution drawn from a DP, G ∼ DP(α, G_0), followed by repeated draws from that random distribution, θ̃_i ∼ G, ∀ i ∈ {1, ..., n}. The structure of the shared values defines a partition of the integers from 1 to n, and the distribution of this partition is a CRP (Ferguson, 1973; Samuel and Blei, 2012). This is defined in the following section.

5.3.2 Chinese Restaurant Process parsimonious mixtures

Consider the unknown cluster labels z = (z_1, ..., z_n), where each value z_i is an indicator random variable that represents the label of the unique value θ_{z_i} of θ̃_i, such that θ̃_i = θ_{z_i} for all i ∈ {1, ..., n}. The CRP provides a distribution on the infinite partitions of the data, that is, a distribution over the positive integers 1, ..., n. Consider the following joint distribution of the unknown cluster assignments (z_1, ..., z_n):

p(z1, . . . , zn) = p(z1)p(z2|z1) . . . p(zn|z1, z2, . . . , zn−1)· (5.20)

From the Pólya urn distribution (5.15), each predictive term of the joint distribution (5.20) can be computed as:

p(z_i = k | z_1, ..., z_{i-1}; α_0) = \frac{α_0}{α_0 + i - 1} \, δ(z_i, K_{i-1} + 1) + \sum_{k=1}^{K_{i-1}} \frac{n_k}{α_0 + i - 1} \, δ(z_i, k)    (5.21)

where n_k = \sum_{j=1}^{i-1} δ(z_j, k) is the number of indicator random variables taking the value k after i − 1 observations, and K_{i−1} + 1 is the previously unseen value. From this distribution, one can therefore assign new data to possibly previously unseen (new) clusters as the data are observed, after starting with one cluster. The distribution on partitions induced by the sequence of conditional distributions in Eq. (5.21) is commonly referred to as the Chinese Restaurant Process (CRP). It can be interpreted as follows. Suppose there is a restaurant with an infinite number of tables, in which customers are entering and sitting at these tables. We assume that customers are social, so that the ith customer sits at table k with probability proportional to the number of already seated customers n_k (k ≤ K_{i−1} being a previously occupied table), and may choose a new table (k > K_{i−1}, k being a new table to be occupied) with a probability proportional to a small positive real number α, which represents the CRP concentration parameter. In clustering with the CRP, customers correspond to data points and tables correspond to clusters. In the CRP mixture, the prior CRP(z_1, ..., z_{i−1}; α) (5.21) is completed with a likelihood with parameters θ_k for each table (cluster) k (i.e., a multivariate Gaussian likelihood with a mean vector and covariance matrix in the GMM case), and a prior distribution (G_0) for the parameters. For example, in the GMM case, one can use a conjugate multivariate normal Inverse-Wishart prior distribution for the mean vectors and the covariance matrices. This process therefore corresponds to the fact that the ith customer sits at table z_i = k and chooses a dish (the parameter θ_{z_i}) from the prior of that table (cluster). The CRP mixture can be summarized according to the following generative process:

zi ∼ CRP(z1, . . . , zi−1; α) (5.22)

θzi |G0 ∼ G0 (5.23)

x_i | θ_{z_i} ∼ f(\cdot | θ_{z_i})    (5.24)

where the CRP distribution is given by Eq. (5.20), G_0 is a base measure (the prior distribution), and f(x_i|θ_{z_i}) is a cluster-specific density. In the DPM and CRP mixtures with multivariate Gaussian components, the parameters θ of each cluster density are composed of a mean vector and a covariance matrix. In that case, a common base measure G_0 is a multivariate normal Inverse-Wishart conjugate prior.
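The CRP mixture generative process (5.22)–(5.24) can be simulated directly; the Python sketch below uses a one-dimensional Gaussian likelihood and a Gaussian base measure over the component means purely for illustration (these choices, and the function names, are assumptions rather than the models used in this chapter).

```python
import numpy as np

rng = np.random.default_rng(3)

def crp_mixture_sample(n, alpha, base_sampler, likelihood_sampler):
    """Generate n points from a CRP mixture: seat customers with the CRP prior
    (5.21), draw each new table's parameter from G0, then draw the datum."""
    z, thetas, X = [], [], []
    for i in range(n):
        counts = np.bincount(z) if z else np.array([])
        probs = np.append(counts, alpha) / (alpha + i)   # existing tables, then a new one
        k = rng.choice(len(probs), p=probs)
        if k == len(thetas):                              # open a new table (cluster)
            thetas.append(base_sampler())
        z.append(k)
        X.append(likelihood_sampler(thetas[k]))
    return np.array(X), np.array(z)

# Example: Gaussian components with a broad Gaussian base measure over the means
X, z = crp_mixture_sample(
    n=200, alpha=1.0,
    base_sampler=lambda: rng.normal(0.0, 3.0),
    likelihood_sampler=lambda mu: rng.normal(mu, 0.5))
```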


We note that in the proposed DP parsimonious mixture, or, by equivalence, CRP parsimonious mixture, the cluster covariance matrices are parametrized in terms of an eigenvalue decomposition to provide more flexible clusters with possibly different volumes, shapes and orientations. In terms of a CRP interpretation, this can be seen as a variability of dishes for each table (cluster). We indeed use the eigenvalue decomposition described in Section 5.2.2, which until now has been considered only in the case of parametric finite mixture model-based clustering (e.g., see Celeux and Govaert (1995) and Banfield and Raftery (1993)) and Bayesian parametric finite mixture model-based clustering (e.g., see Bensmail et al. (1997), Bensmail and Meulman (2003), Fraley and Raftery (2005), and Fraley and Raftery (2007)). We investigated twelve parsimonious models and implemented and experimented with the following nine models, covering the three families of mixture models: the general, the diagonal and the spherical family. The parsimonious models therefore go from the simplest spherical one to the more general full model. Table 5.1 summarizes the considered parsimonious Gaussian mixture models, the corresponding prior distribution for each model, and the corresponding number of free parameters for a mixture model with K components for data of dimension d.

Model                Type        Prior       Applied to                        # free parameters
λI                   Spherical   IG          λ                                 υ + 1
λ_k I                Spherical   IG          λ_k                               υ + K
λA                   Diagonal    IG          diag. elements of λA              υ + d
λ_k A                Diagonal    IG          diag. elements of λ_k A           υ + d + K − 1
λDAD^T               General     IW          Σ = λDAD^T                        υ + ω
λ_k DAD^T            General     IG and IW   λ_k and Σ = DAD^T                 υ + ω + K − 1
λD_k AD_k^T          General     IG          diag. elements of λA              υ + Kω − (K − 1)d
λ_k D_k AD_k^T       General     IG          diag. elements of λ_k A           υ + Kω − (K − 1)(d − 1)
λ_k D_k A_k D_k^T    General     IW          Σ_k = λ_k D_k A_k D_k^T           υ + Kω

Table 5.1: Considered parsimonious models, the associated prior for the covariance structure, and the number of free parameters, where υ = (K − 1) + Kd and ω = d(d + 1)/2.

We used conjugate priors, that is, a Dirichlet distribution for the mixing proportions (Richardson and Green, 1997; Ormoneit and Tresp, 1998), a multivariate Normal for the mean vectors, and an Inverse-Wishart or an Inverse-Gamma prior for the covariance matrix, depending on the parsimonious model, as in (Fraley and Raftery, 2007) and Bensmail et al. (1997).

5.3.3 Bayesian inference via Gibbs sampling

Given a sample of n i.i.d. observations X = (x_1, ..., x_n) modeled by one of the proposed Dirichlet process parsimonious mixture models (DPPMs), the aim is to infer the number K of latent clusters underlying the observed data, their parameters Θ = (θ_1, ..., θ_K), and the latent cluster labels z = (z_1, ..., z_n). We developed an MCMC Gibbs sampling technique, as in Neal (2000), Rasmussen (2000), and Wood and Black (2008), for the Bayesian inference of the nonparametric parsimonious mixture models. Recall that, as presented in Section 4.3.1, the Gibbs sampler for mixtures proceeds iteratively as follows. Given initial mixture parameters θ^{(0)} and the prior over the missing labels z (here the CRP), the Gibbs sampler draws the missing labels z^{(t)} from their posterior distribution p(z|X, θ^{(t)}) at each iteration t, which is in this case a Multinomial distribution whose parameters are the posterior component probabilities. Then, given the completed data and the prior distribution p(θ) over the mixture parameters, the Gibbs sampler generates the mixture parameters θ^{(t+1)} from the corresponding posterior distribution p(θ|X, z^{(t+1)}), which is in this conjugate prior case a multivariate Normal Inverse-Wishart or a Normal-Inverse-Gamma distribution, depending on the parsimonious model. This Bayesian sampling procedure produces an ergodic Markov chain of samples (θ^{(t)}) with stationary distribution p(θ|X). Therefore, after an initial M burn-in samples out of N Gibbs samples, the variables (θ^{(M+1)}, ..., θ^{(N)}) can be considered to be approximately distributed according to the posterior distribution p(θ|X). The Gibbs sampler consists in sampling the couple (Θ, z) from the corresponding posterior distributions. The posterior distribution for θ_k given all the other variables is given by

p(θ_k | z, X, Θ_{-k}, α; H) ∝ \prod_{i | Z_i = k} f(x_i | Z_i = k; θ_k) \, p(θ_k; H)    (5.25)

where Θ_{-k} = (θ_1, ..., θ_{k-1}, θ_{k+1}, ..., θ_{K_{i-1}}) and p(θ_k; H) is the prior distribution for θ_k, that is G_0,


with H being the hyperparameters of the model. The cluster labels zi are similarly sampled from the posterior distribution which is given, up to a constant, by:

p(z_i = k | z_{-i}, X, Θ, α) ∝ f(x_i | z_i; Θ) \, p(z_i | z_{-i}; α)    (5.26)

where z_{-i} = (z_1, ..., z_{i-1}, z_{i+1}, ..., z_n), and p(z_i|z_{-i}; α) is the prior predictive distribution, which corresponds to the CRP distribution computed as in Equation (5.21). The prior distribution, and the resulting posterior distribution, for each of the considered models, are close to those in Bensmail et al. (1997) and are provided in detail in [J-10].

Sampling the hyperparameter α of the DPPM

The number of mixture components in the models depends on the concentration hyperparameter α of the Dirichlet Process (Antoniak, 1974). We therefore choose to sample it to avoid fixing an arbitrary value for it. We follow the method introduced by Escobar and West (1994), which consists in sampling it by assuming a prior Gamma distribution α ∼ G(a, b) with shape hyperparameter a > 0 and scale hyperparameter b > 0. Then, a variable η is introduced and sampled conditionally on α and the number of clusters K_{i−1}, according to a Beta distribution, that is, η|α, K_{i−1} ∼ B(α + 1, n). The resulting posterior distribution for the hyperparameter α is given by:

α | η, K ∼ ϑ_η \, G(a + K_{i-1}, b − \log η) + (1 − ϑ_η) \, G(a + K_{i-1} − 1, b − \log η)

where the weight is ϑ_η = \frac{a + K_{i-1} - 1}{a + K_{i-1} - 1 + n(b - \log η)}. Finally, after a sufficiently large number of samples, the retained solution is the one corresponding to the posterior mode of the number of mixture components, that is, the one that appears most frequently during the sampling.
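In Python, this Escobar and West (1994) update can be sketched as follows; the Gamma prior hyper-parameters a and b are illustrative values.

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_alpha(alpha, K, n, a=1.0, b=1.0):
    """One update of the DP concentration parameter following Escobar and West (1994):
    draw eta | alpha, K ~ Beta(alpha + 1, n), then alpha from a two-component
    mixture of Gamma distributions with rate (b - log eta)."""
    eta = rng.beta(alpha + 1.0, n)
    weight = (a + K - 1.0) / (a + K - 1.0 + n * (b - np.log(eta)))
    shape = a + K if rng.uniform() < weight else a + K - 1.0
    return rng.gamma(shape, 1.0 / (b - np.log(eta)))   # numpy gamma uses scale = 1/rate
```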

Complexity

The cost of the method is mainly related to the sampling of the labels z_i, and hence to the sample size, the number of components, and the model parameters θ_i. More specifically, the complexity of each Gibbs sample is proportional to the current value of the number of mixture components K and hence varies randomly from one iteration to another. Since asymptotically K tends to α log(n) as n tends to infinity (Antoniak, 1974), each sample requires O(α n log(n)) operations for sampling the class labels z_i. The parameter simulation (the mean vector and the covariance matrix) requires in the worst case (when the covariance matrix is full, that is, a non-parsimonious model) approximately O(α log(n)(d + d^3)) operations. This gives an overall complexity of O(α n log(n) d^3).

5.3.4 Bayesian model comparison via Bayes factors

This section describes the strategy used for model comparison, that is, the selection of the best model among the different parsimonious models. We use Bayes factors (Kass and Raftery, 1995; Basu and Chib, 2003), which provide a general way to compare models in (Bayesian) statistical modeling and have been widely studied in the case of mixture models (Kass and Raftery, 1995; Bensmail et al., 1997; Gelfand and Dey, 1994; Carlin and Chib, 1995; Basu and Chib, 2003). Suppose that we have two candidate models M_1 and M_2. If we assume that the two models have the same prior probability p(M_1) = p(M_2), the Bayes factor is given by

BF_{12} = \frac{p(X|M_1)}{p(X|M_2)}    (5.27)

which corresponds to the ratio between the marginal likelihoods of the two models M_1 and M_2. It is a summary of the evidence for model M_1 against model M_2 given the data X. The marginal likelihood p(X|M_m) for model M_m, m ∈ {1, 2}, also called the integrated likelihood, is given by

p(X|M_m) = \int p(X|θ_m, M_m) \, p(θ_m|M_m) \, dθ_m    (5.28)

where p(X|θ_m, M_m) is the likelihood of model M_m with parameters θ_m and p(θ_m|M_m) is the prior density of the mixture parameters θ_m for model M_m. As the marginal likelihood (5.28) is difficult to compute analytically, several approximations have been proposed. One of the most widely used is the Laplace-Metropolis approximation (Lewis and Raftery, 1994), given by

\hat{p}_{Laplace}(X|M_m) = (2π)^{ν_m/2} \, |\hat{H}|^{1/2} \, p(X|\hat{θ}_m, M_m) \, p(\hat{θ}_m|M_m)    (5.29)


where θ̂_m is the posterior estimate (posterior mode) of θ_m for model M_m, ν_m is the number of free parameters of the mixture model M_m, as given in Table 5.1, and Ĥ is minus the inverse Hessian of the function log(p(X|θ_m, M_m) p(θ_m|M_m)) evaluated at the posterior mode θ̂_m. The matrix Ĥ is asymptotically equal to the posterior covariance matrix (Lewis and Raftery, 1994), and is computed as the sample covariance matrix of the simulated posterior sample. We note that, in the proposed DPPM models, since the number of components K is itself a parameter of the model and changes during the sampling, which leads to parameter vectors of varying dimension, we compute the Hessian matrix Ĥ in Eq. (5.29) by taking the posterior samples corresponding to the posterior mode of K. Once the estimate of the Bayes factor is obtained, it can be interpreted as described in Table 5.2, as suggested by Jeffreys (1961); see also Kass and Raftery (1995).

BF_12        2 log BF_12    Evidence for model M_1
< 1          < 0            Negative (M_2 is selected)
1 – 3        0 – 2          Not bad
3 – 12       2 – 5          Substantial
12 – 150     5 – 10         Strong
> 150        > 10           Decisive

Table 5.2: Model comparison using Bayes factors.
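For illustration, here is a minimal Python sketch of the Laplace-Metropolis approximation (5.29) on the log scale, computed from a matrix of posterior parameter draws; the interface is an assumption, and the covariance-based estimate of Ĥ follows Lewis and Raftery (1994) as described above.

```python
import numpy as np

def log_laplace_metropolis(log_post_at_mode, posterior_samples):
    """Log of the Laplace-Metropolis approximation (5.29):
    (nu/2) log(2*pi) + 0.5 log|H| + log p(X|theta_hat) + log p(theta_hat),
    with H estimated by the sample covariance of the posterior draws.
    posterior_samples: T x nu array; log_post_at_mode: unnormalized log posterior
    (log-likelihood + log-prior) evaluated at the posterior mode."""
    nu = posterior_samples.shape[1]
    H = np.atleast_2d(np.cov(posterior_samples, rowvar=False))
    _, logdet = np.linalg.slogdet(H)
    return 0.5 * nu * np.log(2 * np.pi) + 0.5 * logdet + log_post_at_mode

# Two such approximations give the Bayes factor (5.27):
# 2 * (log_ml_model1 - log_ml_model2) is then read against Table 5.2.
```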

5.3.5 Experiments

In [J-10][C-2] (Bartcus, 2015), the proposed DPPMs were applied and evaluated by performing experiments on both simulated and real data, including complex data from a challenging bioacoustic application. We assess their flexibility in terms of modeling, their use for clustering, and their ability to infer the number of clusters from the data. We show how the proposed DPPM approach is able to automatically and simultaneously select the best model with the optimal number of clusters by using Bayes factors. We also perform comparisons with the finite model-based clustering approach of Bensmail et al. (1997), which will be abbreviated as the PGMM approach; the approach of Fraley and Raftery (2007) was also considered. We also use the Rand index to evaluate and compare the provided partitions, and the misclassification error rate when the number of estimated components equals the actual one. For the simulations, we consider several situations of simulated data, from different models and with different levels of cluster separation, in order to assess the capability of the proposed approach to retrieve the actual partition with the actual number of clusters. We also assess the stability of the proposed DPPM models with respect to the choice of the hyperparameter values, by considering several situations and varying them.

Simulation results: varying the cluster shapes, orientations, volumes and separation

In this experiment, we apply the proposed models to data simulated according to different models and with different levels of mixture separation, going from poorly separated to very well separated mixtures. To simulate the data, we first consider an experimental protocol close to the one used by Celeux and Govaert (1995). We performed extensive experiments involving all the models and many Monte Carlo simulations for several data structure situations. Furthermore, for each type of model structure, we consider three different levels of mixture separation, that is: poorly separated, well separated, and very well separated mixtures. We compare the proposed DPPMs to the parametric PGMM approach to model-based clustering of Bensmail et al. (1997), based on finite parsimonious mixtures. In summary, the simulation results are very satisfactory for all the considered situations. The proposed DPPMs, in almost all the situations (except for one), retrieve the actual model, with the actual number of clusters. We also observed that the selected DPPM model has the highest log-marginal likelihood value compared to the finite PGMM alternative. Furthermore, we also observe that the solutions provided by the proposed DPPM are, in some cases, more parsimonious than those provided by the PGMM, and, in the other cases, the same as those provided by the PGMM. Also in terms of misclassification error, the proposed DPPM models, compared to the PGMM ones, provide partitions with a lower misclassification error, for situations with poorly, well or very well separated clusters, and for clusters with equal and different volumes (except for one situation). On the other hand, for the DPPM models, the evidence of the selected model, compared to the majority of

74 5.3 Dirichlet Process Parsimonious Mixtures

the other alternatives is, according to Table 5.2, in general decisive. Indeed, the value 2 log BF12 of the Bayes Factor between the selected model, and the other models, is more than 10, which corresponds to a decisive evidence for the selected model. Also, in terms of evidence of the selected model against the more competitive one, for the situations with very bad mixture separation, with clusters having the same volume, the evidence is not bad (0.3). However, for all the other situations, the optimal model is selected with an evidence going from an almost substantial evidence (a value of 1.7), to a strong and decisive evidence, especially for the models with different cluster volumes. We can also conclude that the models with different cluster volumes may work better in practice. This was also highlighted by Celeux and Govaert (1995) for the finite parsimonious models in the MLE framework.

Stability with respect to the variation of the hyperparameter values In order to examine the effect of the choice of the hyperparameter values of the mixture on the estimations, we considered two-class situations identical to those used in the parametric parsimonious mixture approach proposed in Bensmail et al. (1997). In order to assess the stability of the models with respect to the values of the hyperparameters, we consider four situations with different hyperparameter values. The log marginal likelihood values obtained by the four models for each of the hyperparameter situations show that, for all the situations, the selected model is the actual model with the correct number of clusters. Also, the Bayes factor values (2 log BF) between the selected model and the most competitive one, for each of the four situations, correspond, according to Table 5.2, to decisive evidence for the selected model. These results confirm that the DPPMs are quite stable with respect to the variation of the hyperparameter values. We then performed experiments on several real data sets and provide numerical results in terms of comparisons of the Bayes factors (via the log marginal likelihood values), as well as the Rand index and the misclassification error rate for data sets with a known actual partition.

Experiments on benchmarks To confirm the results previously obtained on simulated data, we conducted several experiments on freely available, well-known real data sets: Iris, Old Faithful Geyser, Crabs and Diabetes (also Trees, Wine, etc.). We compare the proposed DPPM models to the PGMM models. For the four data sets, the proposed DPPMs outperform the alternative finite mixture approach in terms of the Bayes factor value (marginal likelihoods) as well as in terms of misclassification error and Rand index values. The best model is always selected with the actual number of clusters (for Iris, the DPPM approach selects two components, as does the PGMM alternative), and the majority of the parsimonious models (even those which are not selected) retrieve in general the correct number of clusters. Also, the evidence of the selected DPPM models, compared to the other ones, is significant for the four real data sets. The evidence of the selected model, according to Table 5.2, is indeed strong for the Old Faithful Geyser data, and decisive for the Crabs, Diabetes and Iris data. Moreover, for these latter three data sets, the model selection by the proposed DPPM is made with a greater evidence compared to the PGMM approach. For illustration, Figure 5.1 shows the Diabetes data set, the optimal partition provided by the DPPM model λ_k D_k A D_k^T and the distribution of the number of clusters K. We can observe that the partition is quite well defined (the misclassification rate in this case is 17.24 ± 2.47) and that the posterior mode of the number of clusters equals the actual number of clusters (K = 3).


Figure 5.1: Diabetes data set in the space of components 1 (glucose area) and 3 (SSPG) with the actual partition (left), the optimal partition obtained by the DPPM model λ_k D_k A D_k^T (middle), and the empirical posterior distribution of the number of mixture components (right).


Challenging application to real-world bioacoustic data We also applied the proposed DPPMs to a real data set in the framework of a challenging problem of humpback whale song decomposition. The analysis of such data is at the core of the CNRS MASTODONS SABIOD project1. Humpback whales produce songs with a specific structure, and the study of these songs is very useful for bio-acousticians and scientists, namely to understand how whales communicate (possibly according to which vocabulary) and to get an idea of their geographical origin. The analysis of such complex signals, which aims at discovering the call units (which can be considered as a kind of whale alphabet), can be seen as a problem of unsupervised classification, as in Pace et al. (2010), to automatically retrieve possible call units. We therefore reformulate the problem of whale song decomposition as an unsupervised non-parametric classification problem. Contrary to the approach used in Pace et al. (2010), in which the number of states (call units in this case) was fixed manually, here we apply the DPPM to automatically find possible call units in the whale song and automatically infer the number of such song units. The data used are publicly available within our SABIOD project. They consist of pre-extracted features (mainly MFCC parameters) of 8.6 minutes of humpback whale song recordings (51336 observations), recorded at a few meters' distance from the whale in La Reunion (Indian Ocean) by the "Darewin" group in 2013. The results obtained by the BNP parsimonious models on these difficult data are, according to experts, very satisfactory. The models are indeed able to find a quite satisfactory decomposition compared to the literature in the application field, as well as compared to the finite parsimonious mixtures, which select decompositions with a large number of components (sometimes more than 50), which is not plausible. For the proposed DPPMs, however, the decomposition consists of a plausible small number of clusters (not much more than 10 in general), which may correspond to likely call units. The obtained results clearly highlight the interest of parsimonious Bayesian non-parametric modeling. For illustration, Figure 5.2 shows the spectrograms of two signal portions of 15 seconds each (the algorithm was applied to the whole data set) and the corresponding partitions obtained with the parsimonious diagonal model λ_k A. We can see that the partition for the two portions is quite satisfactory and that, among the obtained classes, there are at least four clearly informative call units for the first example and at least three for the second. States 1, 2, 8 and 11 would correspond to up and down sweeps. State 9 corresponds to silence. The seventh state is also the silence that generally ends the ninth state. Upon visual inspection, it can also be seen that, for the second example, states 4 and 8 are clearly different and may correspond to two different call units. This is also the case for states 7 and 8 in the first example. The results are thus very encouraging to continue exploring these data from a BNP perspective.

Figure 5.2: Song units obtained by applying our DPPM model with the parametrization λ_k A (diagonal) to two different signals; top: the spectrogram of the part of the signal starting at 280 seconds and its corresponding partition; bottom: the same for the part of the signal starting at 295 seconds.

1Scaled Acoustic BIODiversity: http://sabiod.univ-tln.fr/data_samples.html


5.4 Conclusion

In this chapter I presented Bayesian non-parametric parsimonious mixture models for clustering. They are based on an infinite Gaussian mixture with an eigenvalue decomposition of the cluster covariance matrices and a Dirichlet process prior, or equivalently a Chinese restaurant process prior. This allows deriving several flexible models and avoids the problem of model selection encountered in standard maximum likelihood-based and Bayesian parametric Gaussian mixtures. We also described a Bayesian model comparison framework to automatically select the best model with the best structure by using Bayes factors. Experiments on simulated data highlighted that the proposed DPPMs represent a good non-parametric alternative to the standard parametric Bayesian and non-Bayesian finite mixtures. They simultaneously and accurately estimate partitions with the optimal number of clusters and the best structure from the data. We also applied the proposed approach to real data sets. The obtained results show the interest of the Bayesian parsimonious clustering models and the potential benefit of using them in practical applications. Future work related to this proposal may concern other parsimonious models, such as those recently proposed by Biernacki and Lourme (2014), based on a variance-correlation decomposition of the group covariance matrices, which are stable and visualizable and have desirable properties. Also, until now we have only considered the problem of clustering. A perspective of this work is to extend it to model-based co-clustering (Govaert and Nadif, 2013) with block mixture models, which consists in simultaneously clustering individuals and variables, rather than only individuals. The non-parametric formulation of these models may represent a good alternative to select the number of latent blocks or co-clusters. Note that, while the DPPM models assume that the data are i.i.d (exchangeable), they provided quite satisfactory results in the analysis of the bioacoustic sequential data. Thus, this application opens a perspective on the extension of the DPPM models to the sequential case, for instance by further integrating them into a Markovian framework. In [C-1][C-3], we investigated the BNP formulation of the standard HMM, that is, the HDP-HMM, in this application of unsupervised learning from bioacoustic data, and the results turned out to be improved, which is promising for the Markovian perspective of the DPPMs.


Chapter 6

Non-normal mixtures of experts

Contents
6.1 Introduction ...... 81
6.1.1 Personal contribution ...... 82
6.1.2 Mixture of experts for continuous data ...... 83
6.1.3 The normal mixture of experts model and its MLE ...... 83
6.2 The skew-normal mixture of experts model ...... 84
6.2.1 The model ...... 84
6.2.2 Maximum likelihood estimation via the ECM algorithm ...... 85
6.3 The t mixture of experts model ...... 88
6.3.1 The model ...... 88
6.3.2 Maximum likelihood estimation ...... 89
6.3.3 MLE via the EM algorithm ...... 89
6.3.4 MLE via the ECM algorithm ...... 91
6.4 The skew t mixture of experts model ...... 91
6.4.1 The model ...... 91
6.4.2 Identifiability of the STMoE model ...... 92
6.4.3 Maximum likelihood estimation via the ECM algorithm ...... 92
6.5 Prediction, clustering and model selection with the non-normal MoE ...... 94
6.6 Experiments ...... 96
6.7 Conclusion ...... 98

Related papers:

[1] F. Chamroukhi. Non-Normal Mixtures of Experts. arXiv:1506.06707, 2015c. URL http://arxiv.org/pdf/1506.06707.pdf. 61 pages
[2] F. Chamroukhi. Robust mixture of experts modeling using the t-distribution. 2015g. URL http://chamroukhi.univ-tln.fr/papers/TMoE.pdf. Submitted
[3] F. Chamroukhi. Robust mixture of experts modeling using the skew-t distribution. 2015f. URL http://chamroukhi.univ-tln.fr/papers/STMoE.pdf. Submitted

Related conference presentation:
[1] F. Chamroukhi. Robust non-normal mixtures of experts. ERCIM 2015: The 8th International Conference of the European Research Consortium for Informatics and Mathematics on Computational and Methodological Statistics, December 2015h. London, UK

I initiated this research direction very recently (in May 2015) to investigate mixtures of experts (MoE) for continuous data in the case where the expert components are non-normal, that is, do not follow the normal distribution. MoE are a popular framework for modeling heterogeneity in data in computer science, particularly in machine learning, as well as in statistics. Indeed, the models previously developed in my research, as well as those classically used in learning for the analysis of continuous data, are very often based on the normal hypothesis regarding the distribution of the data or of a group of the data. However, for a set of data containing a group or groups of observations with asymmetric behavior, heavy tails or atypical observations, the use of normal experts may be unsuitable and can unduly affect the fit of the MoE model. In this research I attempt to overcome these (well-known) limitations of modeling with the normal distribution. I proposed three non-normal derivations, including two robust mixture of experts (MoE) models. The proposed models are suitable for accommodating data which exhibit additional features such as skewness and heavy tails and which may be affected by atypical data. I derived dedicated EM and ECM algorithms for model fitting. This research has led to the following pre-publications [J-12][J-13][J-14] ([J-12] also includes all the developed MoE models in this framework).

6.1 Introduction

Mixtures of experts (MoE), introduced by Jacobs et al. (1991), are widely studied in statistics and machine learning. They consist in a fully conditional mixture model where both the mixing proportions, known as the gating functions, and the component densities, known as the experts, are conditional on some predictors. MoE have been investigated, in their simple form as well as in their hierarchical form (Jordan and Jacobs, 1994) (e.g., Section 5.12 of McLachlan and Peel (2000)), for regression and model-based cluster and discriminant analyses and in different application domains. A complete review of MoE models can be found in Yuksel et al. (2012). For continuous data, which I consider here in the context of non-linear regression and model-based cluster analysis, MoE usually use normal experts, that is, expert components following the Gaussian distribution. Throughout this chapter, I will call this the normal mixture of experts, abbreviated as NMoE. However, it is well known that the normal distribution is sensitive to outliers. Moreover, for a set of data containing a group or groups of observations with heavy tails or asymmetric behavior, the use of normal experts may be unsuitable and can unduly affect the fit of the MoE model. In this proposal, I attempt to overcome these limitations in MoE by proposing more adapted and robust mixture of experts models which can deal with possibly skewed, heavy-tailed and atypical data. Recently, the problem of sensitivity of the NMoE to outliers has been considered by Nguyen and McLachlan (2016), where the authors proposed a Laplace mixture of linear experts (LMoLE) for robust modeling of non-linear regression data. The model parameters are estimated by maximizing the observed-data likelihood via a minorization-maximization (MM) algorithm. Here, I propose alternative MoE models, by relying on other non-normal distributions that generalize the normal distribution, that is, the skew-normal, t, and skew-t distributions. I call these proposed NNMoE models, respectively, the skew-normal MoE (SNMoE), the t MoE (TMoE), and the skew-t MoE (STMoE). Indeed, in these last years, the use of the skew-normal distribution, first proposed by Azzalini (1985, 1986), has been shown to be beneficial in dealing with asymmetric data in various theoretical and applied problems. This has been studied in the finite mixture literature, namely by Lin et al. (2007b), for modeling asymmetric univariate data with the univariate skew-normal mixture. On the other hand, the t distribution provides a natural robust extension of the normal distribution to model data with possible outliers. This has been exploited to develop the t mixture model proposed by Mclachlan and Peel (1998) for robust cluster analysis of multivariate data. Recently, Bai et al. (2012) proposed a robust mixture modeling in the regression context for univariate data, by using a univariate t-mixture model. Moreover, in many practical problems, the robustness of t mixtures may not be sufficient in the presence of asymmetric observations. To deal with this issue, Lin et al. (2007a) proposed the univariate skew-t mixture model, which allows for accommodation of both skewness and thick tails in the data, by relying on the skew-t distribution introduced by Azzalini and Capitanio (2003). For the general multivariate case using t, skew-normal and skew-t mixtures, one can refer to Mclachlan and Peel (1998); Peel and Mclachlan (2000), Pyne et al. (2009), Lin (2010), Lee and McLachlan (2013b), Lee and McLachlan (2013a), Lee and McLachlan (2014), and, recently, the unifying framework for previous restricted and unrestricted skew-t mixtures using the CFUST distribution (Lee and McLachlan, 2015). The inference in the previously described approaches is performed by maximum likelihood estimation via expectation-maximization (EM) or extensions (Dempster et al., 1977; McLachlan and Krishnan, 2008), in particular the expectation conditional maximization (ECM) algorithm (Meng and Rubin, 1993). In the Bayesian framework, Frühwirth-Schnatter and Pyne (2010) have considered Bayesian inference for both the univariate and the multivariate skew-normal and skew-t mixtures. In the regression context, the robust modeling of regression data has been studied namely by Wei (2012), who considered a t-mixture model for regression analysis of univariate data, as well as by Bai et al. (2012), who relied on the M-estimate in mixtures of linear regressions using the t-distribution. In the same context of regression, Song et al. (2014) recently proposed the mixture of Laplace regressions, which has then been extended by Nguyen and McLachlan (2016) to the case of mixtures of experts, by introducing the Laplace mixture of linear experts (LMoLE). More recently, Zeller et al. (2015) introduced the scale mixtures of skew-normal distributions for robust mixture regressions. However, unlike the proposed NNMoE models, the regression mixture models of Wei (2012), Bai et al. (2012), Song et al. (2014) and Zeller et al. (2015) do not consider conditional mixing proportions, that is, mixing proportions depending on some input variables, as in the case of mixtures of experts, which I investigate here. In addition, the models of Wei (2012), Bai et al. (2012) and Song et al. (2014) do not address both the problem of robustness to outliers and that of dealing with possibly asymmetric data. Indeed, here I consider the mixture of experts framework for non-linear regression problems and model-based clustering of regression data, and I attempt to overcome the limitations of the NMoE model in dealing with asymmetric, heavy-tailed data which may contain outliers. I investigate the use of the skew-normal, t and skew-t distributions for the experts, rather than the commonly used normal distribution. First, the skew-normal mixture of experts (SNMoE) is proposed to accommodate data with possibly asymmetric behavior. For heavy-tailed or possibly noisy data, that is, data with atypical observations, I then propose the t mixture of experts model (TMoE) to handle the issues regarding namely the sensitivity of the NMoE to outliers. Finally, I propose the skew-t mixture of experts model (STMoE), which allows for accommodation of both skewness and heavy tails in the data and which is also robust to outliers. These models correspond to extensions of the unconditional mixtures of skew-normal (Lin et al., 2007b), t (Mclachlan and Peel, 1998; Wei, 2012), and skew t (Lin et al., 2007a) distributions, to the mixture of experts (MoE) framework, where the mixture means are regression functions and the mixing proportions are covariate-varying. For the model inference, I develop dedicated expectation-maximization (EM) and expectation conditional maximization (ECM) algorithms to estimate the parameters of the proposed models by monotonically maximizing the observed-data log-likelihood. EM algorithms are indeed very popular and successful estimation algorithms for mixture models in general and for mixtures of experts in particular. Moreover, the EM algorithm for MoE has been shown by Ng and McLachlan (2004) to monotonically maximize the MoE likelihood.
The authors showed that the EM algorithm (with IRLS in this case) has stable convergence and that the log-likelihood is monotonically increasing when a learning rate smaller than one is adopted for the IRLS procedure within the M-step of the EM algorithm. They further proposed an expectation conditional maximization (ECM) algorithm to train MoE, which also has desirable numerical properties. MoE have also been considered in the Bayesian framework; for example, one can cite the Bayesian MoE (Waterhouse et al., 1996; Waterhouse, 1997) and the Bayesian hierarchical MoE (Bishop and Svensén, 2003). Beyond the Bayesian parametric framework, MoE models have also been investigated within the Bayesian non-parametric framework. We cite for example the Bayesian non-parametric MoE model (Rasmussen and Ghahramani, 2001) and the Bayesian non-parametric hierarchical MoE approach of Shi et al. (2005) using Gaussian process experts for regression. For a further account of Bayesian MoE for regression, the reader is referred to, for example, the book of Shi and Choi (2011). In this chapter, I investigate semi-parametric models under the maximum likelihood estimation framework.

6.1.1 Personal contribution

To overcome the limitations of modeling with the normal mixture of experts (NMoE), I introduced three new non-normal mixtures of experts (NNMoE) that can better accommodate data exhibiting non-normal features, including asymmetry, heavy tails, and the presence of outliers. The proposed models are the skew-normal MoE [J-12], the robust t MoE [J-12][J-13] and the skew t MoE [J-12][J-14], respectively named SNMoE, TMoE and STMoE. I developed dedicated E(C)M algorithms to estimate the parameters of the proposed models by monotonically maximizing the observed-data log-likelihood. I describe how the presented models can be used for prediction in regression as well as in model-based clustering of regression data. Numerical experiments carried out on simulated data show the effectiveness and the robustness of the proposed models in terms of modeling non-linear regression functions as well as in model-based clustering. Then, to show their usefulness for practical applications, the proposed models were applied to real-world tone perception data for musical data analysis, and to temperature anomaly data for the analysis of climate change. The obtained results are very satisfactory compared to the standard NMoE and the alternative mixture models.

The remainder of this chapter is organized as follows. In Section 6.1.2 I briefly recall the MoE framework, the NMoE model and its maximum likelihood estimation via EM. In Section 6.2, I present the SNMoE model and, in Section 6.2.2, its inference technique using the ECM algorithm. Then, in Section 6.3 I present the TMoE model and derive its parameter estimation technique using the EM algorithm in Section 6.3.2. In Section 6.4, I present the STMoE model and, in Section 6.4.3, its parameter estimation technique using the ECM algorithm. In Section 6.5, I also show how model selection can be performed for these NNMoE models. I then investigate in Section 6.5 the use of the proposed models for fitting non-linear regression functions as well as for prediction on future data. I also show in Section 6.5 how the models can be used in a model-based clustering perspective. In Section 6.6, I perform experiments to assess the proposed models. Finally, in Section 6.7, conclusions are drawn and future directions are discussed.

6.1.2 Mixture of experts for continuous data

Mixtures of experts (MoE) (Jacobs et al., 1991; Jordan and Jacobs, 1994) are used in a variety of contexts including regression, classification and clustering. Here I consider the MoE framework for fitting (non-linear) regression functions and for clustering univariate continuous data. The univariate MoE model assumes that each observed pair of data (x, y), where y ∈ R is the response for some covariate x ∈ R^p, is generated from one of K parametric regression functions with conditional density f_k(y|x; Ψ_k) (k = 1, ..., K), governed by a hidden categorical random variable Z indicating from which component each observation is generated. Furthermore, MoE for regression analysis (Jacobs et al., 1991; Jordan and Jacobs, 1994) model the component membership variable Z as a function of some predictors r ∈ R^q. More specifically, the model for Z, known as the gating network in the context of MoE, is a multinomial logistic model defined by:
\[
\mathbb{P}(Z = k \mid r; \alpha) = \pi_k(r; \alpha) = \frac{\exp(\alpha_k^T r)}{\sum_{\ell=1}^{K}\exp(\alpha_\ell^T r)} \tag{6.1}
\]
where α_k ∈ R^q is the coefficient vector associated with r and α = (α_1^T, ..., α_{K-1}^T)^T is the parameter vector of the logistic model, with α_K being the null vector. Thus, the MoE model decomposes the non-linear regression density f(y|x) into a convex weighted sum of K regression component models f_k(y|x) and can be defined by:
\[
f(y \mid x; \Psi) = \sum_{k=1}^{K}\pi_k(r; \alpha)\, f_k(y \mid x; \Psi_k) \tag{6.2}
\]
where the π_k's are covariate-varying mixing proportions. The model parameter vector is given by Ψ = (π_1, ..., π_{K-1}, Ψ_1^T, ..., Ψ_K^T)^T, Ψ_k being the parameter vector of the kth component density. Thus, the MoE model consists in a fully conditional mixture model where both the mixing proportions (the gating functions) and the component densities (the experts) are conditional on some covariate variables (respectively r and x).
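As a purely illustrative numerical sketch (not code from the cited works), the following Python functions evaluate the multinomial logistic gating network (6.1) and the resulting MoE density (6.2) with Gaussian experts, as in the NMoE of the next subsection; the array shapes and function names are assumptions made for this example.

```python
import numpy as np
from scipy.stats import norm

def gating_probs(r, alpha):
    """Softmax gating network pi_k(r; alpha) of Eq. (6.1).

    r: gating covariate vector of length q; alpha: (K-1, q) array,
    the K-th coefficient vector being fixed to zero for identifiability."""
    scores = np.concatenate([alpha @ r, [0.0]])    # K linear scores
    scores -= scores.max()                         # numerical stability
    w = np.exp(scores)
    return w / w.sum()

def nmoe_density(y, x, r, alpha, beta, sigma2):
    """MoE density (6.2) with linear-Gaussian experts: sum_k pi_k N(y; beta_k'x, sigma2_k)."""
    pi = gating_probs(r, alpha)
    means = beta @ x                               # (K,) expert means beta_k' x
    return np.sum(pi * norm.pdf(y, loc=means, scale=np.sqrt(sigma2)))
```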

6.1.3 The normal mixture of experts model and its MLE

In the case of mixture of experts for regression, it is usually assumed that the experts fk(y|x; Ψ k) are normal. A K-component normal mixture of experts (NMoE) (K > 1) has the following formulation:

\[
f(y \mid r, x; \Psi) = \sum_{k=1}^{K}\pi_k(r; \alpha)\, \mathcal{N}\big(y;\, \mu(x; \beta_k),\, \sigma_k^2\big) \tag{6.3}
\]
which involves, in the semi-parametric case, component means defined as parametric (non-)linear regression functions μ(x; β_k). The NMoE model parameters are estimated by maximizing the observed-data log-likelihood given an i.i.d sample of n observations (y_1, ..., y_n) with their respective associated covariates (x_1, ..., x_n) and (r_1, ..., r_n):
\[
\log L(\Psi) = \sum_{i=1}^{n}\log\sum_{k=1}^{K}\pi_k(r_i; \alpha)\, \mathcal{N}\big(y_i;\, \mu(x_i; \beta_k),\, \sigma_k^2\big) \tag{6.4}
\]
by using the EM algorithm (Dempster et al., 1977; Jacobs et al., 1991; Jordan and Jacobs, 1994; Jordan and Xu, 1995; Ng and McLachlan, 2004; McLachlan and Krishnan, 2008). The E-step at the mth iteration of the EM algorithm for the NMoE model requires the calculation of the following posterior probability that the observation (y_i, x_i, r_i) belongs to expert k, given a parameter estimate Ψ^(m):
\[
\tau_{ik}^{(m)} = \mathbb{P}\big(Z_i = k \mid y_i, x_i, r_i; \Psi^{(m)}\big) = \frac{\pi_k(r_i; \alpha^{(m)})\, \mathcal{N}\big(y_i;\, \mu(x_i; \beta_k^{(m)}),\, \sigma_k^{2(m)}\big)}{f(y_i \mid r_i, x_i; \Psi^{(m)})}. \tag{6.5}
\]
Then, the M-step calculates the parameter vector update Ψ^(m+1) by maximizing the well-known Q-function, that is, the expected complete-data log-likelihood:
\[
\Psi^{(m+1)} = \arg\max_{\Psi \in \Omega} Q(\Psi; \Psi^{(m)}) \tag{6.6}
\]
where Ω is the parameter space. For example, in the case of the normal mixture of linear experts (NMoLE), where each expert's mean has the following linear form:
\[
\mu(x; \beta_k) = \beta_k^T x, \tag{6.7}
\]
where β_k ∈ R^p is the vector of regression coefficients of component k, the updates for each of the expert component parameters consist in analytically solving a weighted Gaussian linear regression problem and are given by:
\[
\beta_k^{(m+1)} = \Big[\sum_{i=1}^{n}\tau_{ik}^{(m)}\, x_i x_i^T\Big]^{-1}\sum_{i=1}^{n}\tau_{ik}^{(m)}\, y_i x_i, \tag{6.8}
\]
\[
\sigma_k^{2(m+1)} = \frac{\sum_{i=1}^{n}\tau_{ik}^{(m)}\big(y_i - \beta_k^{(m+1)T}x_i\big)^2}{\sum_{i=1}^{n}\tau_{ik}^{(m)}}\cdot \tag{6.9}
\]
For the mixing proportions, the parameter vector update α^(m+1) cannot however be obtained in closed form. It is calculated by Iteratively Reweighted Least Squares (IRLS) (Jacobs et al., 1991; Jordan and Jacobs, 1994; Chen et al., 1999a; Green, 1984)[C-14][J-1].
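The following short Python sketch illustrates the closed-form M-step updates (6.8)-(6.9) as weighted least-squares computations; it is a minimal illustration under assumed array shapes, not the authors' implementation.

```python
import numpy as np

def nmole_m_step(X, y, tau):
    """Weighted updates (6.8)-(6.9) for one expert k of the NMoLE.

    X: (n, p) design matrix, y: (n,) responses,
    tau: (n,) posterior memberships tau_ik of component k."""
    W = tau[:, None] * X                             # rows tau_ik * x_i
    beta = np.linalg.solve(X.T @ W, W.T @ y)         # Eq. (6.8)
    resid = y - X @ beta
    sigma2 = np.sum(tau * resid ** 2) / np.sum(tau)  # Eq. (6.9)
    return beta, sigma2
```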

However, the normal distribution is not adapted to deal with asymmetric and heavy-tailed data. It is also known that the normal distribution is sensitive to outliers. In this proposal, I first address the issue regarding skewness by proposing the skew-normal mixture of experts (SNMoE). Then, I propose a robust fitting of the MoE, adapted to heavy-tailed data, by using the t distribution, that is, the t mixture of experts (TMoE). Finally, the proposed skew-t mixture of experts (STMoE) allows for simultaneously accommodating asymmetry and heavy tails in the data and is also robust to outliers.

6.2 The skew-normal mixture of experts model

6.2.1 The model

The skew-normal mixture of experts (SNMoE) model uses the skew-normal distribution as the density of the expert components. As introduced by Azzalini (1985, 1986), a random variable Y follows a univariate skew-normal distribution with location parameter μ ∈ R, scale parameter σ² ∈ (0, ∞) and skewness parameter λ ∈ R if it has the density
\[
f(y; \mu, \sigma^2, \lambda) = \frac{2}{\sigma}\,\phi\Big(\frac{y-\mu}{\sigma}\Big)\,\Phi\Big(\lambda\,\frac{y-\mu}{\sigma}\Big) \tag{6.10}
\]
where φ(·) and Φ(·) denote, respectively, the probability density function (pdf) and the cumulative distribution function (cdf) of the standard normal distribution. It can be seen from (6.10) that when the skewness parameter λ = 0, the skew-normal reduces to the normal distribution. The presented skew-normal mixture of experts (SNMoE) extends the skew-normal mixture model (Lin et al., 2007b) to the mixture of experts framework, by considering conditional distributions for both the mixing proportions and the means of the mixture components. The SNMoE is therefore a MoE model with skew-normal experts and is defined as follows. Let SN(μ, σ², λ) denote a skew-normal distribution with location parameter μ, scale parameter σ and skewness parameter λ. A K-component SNMoE is then defined by:
\[
f(y \mid r, x; \Psi) = \sum_{k=1}^{K}\pi_k(r; \alpha)\, \mathrm{SN}\big(y;\, \mu(x; \beta_k),\, \sigma_k^2,\, \lambda_k\big) \tag{6.11}
\]
where each expert component k has a skew-normal distribution, whose density is defined by (6.10). The parameter vector of the model is Ψ = (α_1^T, ..., α_{K-1}^T, Ψ_1^T, ..., Ψ_K^T)^T with Ψ_k = (β_k^T, σ_k², λ_k)^T the parameter vector of the kth skew-normal expert component. It is obvious that if the skewness parameter λ_k = 0 for each k, the SNMoE model (6.11) reduces to the NMoE model (6.3). Before going on to the model inference, I first present its stochastic and hierarchical representations, which will serve to derive the ECM algorithm for maximum likelihood parameter estimation. The SNMoE model is characterized as follows. Let U and E be independent univariate random variables following the standard normal distribution N(0, 1) with pdf φ(·). Given some covariates x_i and r_i, a random variable Y_i is said to follow the SNMoE model (6.11) if it has the following representation:
\[
Y_i = \mu(x_i; \beta_{z_i}) + \delta_{z_i}\sigma_{z_i}|U_i| + \sqrt{1-\delta_{z_i}^2}\,\sigma_{z_i} E_i. \tag{6.12}
\]
In (6.12), |U_i| denotes the magnitude of U_i and δ_{z_i} = λ_{z_i}/√(1 + λ_{z_i}²), where Z_i ∈ {1, ..., K} is a categorical variable which follows the multinomial distribution, that is:
\[
Z_i \mid r_i \sim \mathrm{Mult}\big(1;\, \pi_1(r_i; \alpha), \ldots, \pi_K(r_i; \alpha)\big) \tag{6.13}
\]
where each of the probabilities π_{z_i}(r_i; α) = P(Z_i = z_i | r_i) is given by the logistic function (6.1). In this incomplete-data framework, Z_i represents the hidden label of the component generating the ith observation. The stochastic representation (6.12) of the SNMoE leads to the following hierarchical representation which, as will be shown in Section 6.2.2, greatly facilitates the model inference. By introducing the binary latent component-indicators Z_{ik}, such that Z_{ik} = 1 iff Z_i = k, a hierarchical model for the SNMoE can be derived from its stochastic representation (6.12) as follows:
\[
\begin{aligned}
Y_i \mid u_i, Z_{ik}=1, x_i &\sim \mathcal{N}\big(\mu(x_i; \beta_k) + \delta_k|u_i|,\ (1-\delta_k^2)\,\sigma_k^2\big),\\
U_i \mid Z_{ik}=1 &\sim \mathcal{N}(0, \sigma_k^2),\\
Z_i \mid r_i &\sim \mathrm{Mult}\big(1;\, \pi_1(r_i; \alpha), \ldots, \pi_K(r_i; \alpha)\big)
\end{aligned} \tag{6.14}
\]
where Z_i = (Z_{i1}, ..., Z_{iK}) and δ_k = λ_k/√(1 + λ_k²). In this hierarchical representation, in addition to the hidden component labels Z_i, the variables U_i are also hidden. This hierarchical incomplete-data representation facilitates the inference scheme by using the ECM algorithm.
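To make the generative story concrete, here is a minimal Python sketch that simulates one response from the stochastic representation (6.12)-(6.13), assuming linear expert means of the form (6.7); the function name and argument layout are illustrative assumptions only.

```python
import numpy as np

def simulate_snmoe(x, r, alpha, beta, sigma, lam, rng=np.random.default_rng()):
    """Draw one y from the SNMoE stochastic representation (6.12)-(6.13)."""
    # Gating step: draw the component label Z_i from the multinomial (6.13).
    scores = np.concatenate([alpha @ r, [0.0]])
    pi = np.exp(scores - scores.max())
    pi /= pi.sum()
    k = rng.choice(len(pi), p=pi)
    # Skew-normal expert via Eq. (6.12): |U| term plus an independent normal term.
    delta = lam[k] / np.sqrt(1.0 + lam[k] ** 2)
    u, e = rng.standard_normal(2)
    return (beta[k] @ x
            + delta * sigma[k] * abs(u)
            + np.sqrt(1.0 - delta ** 2) * sigma[k] * e)
```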

6.2.2 Maximum likelihood estimation via the ECM algorithm

The unknown parameter vector Ψ of the SNMoE model can be estimated by maximizing the observed-data log-likelihood. Given an observed i.i.d sample of n observations (y_1, ..., y_n) with their respective associated covariates (x_1, ..., x_n) and (r_1, ..., r_n), under the SNMoE model (6.11) the observed-data log-likelihood for the parameter vector Ψ is given by:
\[
\log L(\Psi) = \sum_{i=1}^{n}\log\sum_{k=1}^{K}\pi_k(r_i; \alpha)\, \mathrm{SN}\big(y_i;\, \mu(x_i; \beta_k),\, \sigma_k^2,\, \lambda_k\big). \tag{6.15}
\]


The maximization of this log-likelihood in this incomplete-data framework cannot be performed in closed form. It can be performed via EM-type algorithms (McLachlan and Krishnan, 2008). More specifically, I propose a dedicated expectation conditional maximization (ECM) algorithm to monotonically maximize (6.15). The ECM algorithm (Meng and Rubin, 1993) is an EM variant that mainly aims at addressing the optimization problem in the M-step of the EM algorithm. In ECM, the M-step is performed by several conditional maximization (CM) steps, dividing the parameter space into sub-spaces. The parameter vector updates are then performed sequentially, one coordinate block after another in each sub-space. Deriving the ECM algorithm requires the definition of the complete-data log-likelihood. From the hierarchical representation (6.14) of the SNMoE, the complete-data log-likelihood of Ψ, where the complete data are {y_i, z_i, u_i, x_i, r_i}_{i=1}^{n}, is given by:
\[
\log L_c(\Psi) = \log L_c(\alpha) + \sum_{k=1}^{K}\log L_c(\Psi_k), \tag{6.16}
\]
with
\[
\log L_c(\alpha) = \sum_{i=1}^{n}\sum_{k=1}^{K}Z_{ik}\log\pi_k(r_i; \alpha),
\]
\[
\log L_c(\Psi_k) = \sum_{i=1}^{n}Z_{ik}\Big[-\log(2\pi) - \log(\sigma_k^2) - \frac{1}{2}\log(1-\delta_k^2) - \frac{d_{ik}^2}{2(1-\delta_k^2)} + \frac{\delta_k\, d_{ik}\, u_i}{(1-\delta_k^2)\,\sigma_k} - \frac{u_i^2}{2(1-\delta_k^2)\,\sigma_k^2}\Big],
\]
where d_{ik} = (y_i − μ(x_i; β_k))/σ_k. The proposed ECM algorithm for the SNMoE model then proceeds as follows. It starts with an initial parameter vector Ψ^(0) and alternates between the E- and CM-steps until a convergence criterion is satisfied.

E-Step The E-step of the ECM algorithm for the SNMoE calculates the Q-function, that is, the conditional expectation of the complete-data log-likelihood (6.16), given the observed data {(y_i, x_i, r_i)}_{i=1}^{n} and a current parameter estimate Ψ^(m), m being the current iteration:
\[
Q(\Psi; \Psi^{(m)}) = \mathbb{E}\big[\log L_c(\Psi) \mid \{y_i, x_i, r_i\}_{i=1}^{n}; \Psi^{(m)}\big] = Q_1(\alpha; \Psi^{(m)}) + \sum_{k=1}^{K}Q_2(\Psi_k; \Psi^{(m)}), \tag{6.17}
\]
with
\[
Q_1(\alpha; \Psi^{(m)}) = \sum_{i=1}^{n}\sum_{k=1}^{K}\tau_{ik}^{(m)}\log\pi_k(r_i; \alpha), \tag{6.18}
\]
\[
Q_2(\Psi_k; \Psi^{(m)}) = \sum_{i=1}^{n}\tau_{ik}^{(m)}\Big[-\log(2\pi) - \log(\sigma_k^2) - \frac{1}{2}\log(1-\delta_k^2) + \frac{\delta_k\, d_{ik}\, e_{1,ik}^{(m)}}{(1-\delta_k^2)\,\sigma_k} - \frac{e_{2,ik}^{(m)}}{2(1-\delta_k^2)\,\sigma_k^2} - \frac{d_{ik}^2}{2(1-\delta_k^2)}\Big] \tag{6.19}
\]
for k = 1, ..., K, where the required conditional expectations are given by:
\[
\begin{aligned}
\tau_{ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[Z_{ik} \mid y_i, x_i, r_i\big],\\
e_{1,ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[U_i \mid Z_{ik}=1, y_i, x_i, r_i\big],\\
e_{2,ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[U_i^2 \mid Z_{ik}=1, y_i, x_i, r_i\big].
\end{aligned}
\]
The τ_{ik}^{(m)}'s represent the posterior distribution of the hidden component labels Z_i and correspond to the posterior memberships of the observed data. The conditional expectations e_{1,ik}^{(m)} and e_{2,ik}^{(m)} correspond to the posterior distributions of the hidden variables U_i and U_i², respectively. From (6.17), (6.18) and (6.19), it follows that the Q-function is obtained by analytically calculating these conditional expectations, as shown in [J-12].


M-Step Then, the M-step calculates the parameter vector Ψ^(m+1) as in (6.6), that is, by maximizing the Q-function (6.17) with respect to Ψ. This can be performed by separately maximizing Q_1(α; Ψ^(m)) with respect to the logistic parameters α and, for each component k (k = 1, ..., K), the function Q_2(Ψ_k; Ψ^(m)) with respect to the skew-normal expert parameters Ψ_k = (β_k^T, σ_k², λ_k)^T. I adopt the ECM extension of the EM algorithm. The M-step in this case consists of four conditional maximization (CM) steps, corresponding to the decomposition of the parameter vector Ψ into four sub-vectors Ψ = (α, β, σ, λ)^T. This leads to the following CM steps.

CM-Step 1 Calculate α^(m+1) by maximizing Q_1(α; Ψ^(m)): α^(m+1) = arg max_α Q_1(α; Ψ^(m)). Unlike in the standard skew-normal mixture model and the skew-normal regression mixture model, this maximization in the case of the proposed SNMoE does not exist in closed form. It is performed iteratively by Iteratively Reweighted Least Squares (IRLS).

The Iteratively Reweighted Least Squares (IRLS) algorithm: The IRLS algorithm is used to maximize Q_1(α, Ψ^(m)) given by (6.18) with respect to the parameter α in the M-step at each iteration m of the ECM algorithm. The IRLS consists in starting with a vector α^(0) and, at the (l+1)th iteration, updating the estimate of α as follows:
\[
\alpha^{(l+1)} = \alpha^{(l)} - \Big[\frac{\partial^2 Q_1(\alpha, \Psi^{(m)})}{\partial\alpha\,\partial\alpha^T}\Big]^{-1}_{\alpha=\alpha^{(l)}}\ \frac{\partial Q_1(\alpha, \Psi^{(m)})}{\partial\alpha}\Big|_{\alpha=\alpha^{(l)}} \tag{6.20}
\]
where ∂²Q_1(α, Ψ^(m))/∂α∂α^T and ∂Q_1(α, Ψ^(m))/∂α are, respectively, the Hessian matrix and the gradient vector of Q_1(α, Ψ^(m)). At each IRLS iteration the Hessian and the gradient are evaluated at α = α^(l) and are computed similarly as in [J-1][J-2]. The parameter update α^(m+1) is taken at convergence of the IRLS algorithm (6.20); a numerical sketch of one such Newton-type update is given below.
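As an illustrative sketch only (with assumed shapes and without the derivations of [J-1][J-2]), one Newton-type IRLS update (6.20) for the softmax gating parameters can be written as follows in Python; here the gradient and Hessian of Q1 are those of a posterior-weighted multinomial logistic log-likelihood.

```python
import numpy as np

def irls_step(alpha, R, tau):
    """One Newton/IRLS update (6.20) for the gating parameters.

    alpha: (K-1, q) current logistic coefficients (alpha_K = 0),
    R: (n, q) gating covariates, tau: (n, K) posterior memberships."""
    n, q = R.shape
    K = tau.shape[1]
    scores = np.hstack([R @ alpha.T, np.zeros((n, 1))])
    P = np.exp(scores - scores.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)                    # pi_k(r_i; alpha)
    # Gradient of Q1 w.r.t. alpha_k (k < K): sum_i (tau_ik - pi_ik) r_i.
    G = ((tau - P)[:, :K - 1].T @ R).ravel()
    # Hessian blocks: -sum_i pi_ik (delta_kl - pi_il) r_i r_i^T, for k, l < K.
    H = np.zeros(((K - 1) * q, (K - 1) * q))
    for k in range(K - 1):
        for l in range(K - 1):
            w = P[:, k] * ((k == l) - P[:, l])
            H[k*q:(k+1)*q, l*q:(l+1)*q] = -(R * w[:, None]).T @ R
    new_vec = alpha.ravel() - np.linalg.solve(H, G)      # Eq. (6.20)
    return new_vec.reshape(K - 1, q)
```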

Then, for k = 1, ..., K:

CM-Step 2 Calculate β_k^(m+1) by maximizing Q_2(Ψ_k; Ψ^(m)) given by (6.19) with respect to β_k. Here I focus on the common linear case for the experts, where each expert-component mean function is that of a linear regression model and has the form (6.7). It can easily be shown that the maximization problem for the resulting skew-normal mixture of linear experts (SNMoLE) can be solved analytically and has the following solution:
\[
\beta_k^{(m+1)} = \Big[\sum_{i=1}^{n}\tau_{ik}^{(m)}\, x_i x_i^T\Big]^{-1}\sum_{i=1}^{n}\tau_{ik}^{(m)}\big(y_i - \delta_k^{(m)} e_{1,ik}^{(m)}\big)\, x_i. \tag{6.21}
\]

CM-Step 3 Calculate σ_k^{2(m+1)} by maximizing Q_2(Ψ_k; Ψ^(m)) given by (6.19) with respect to σ_k². Similarly to the update of β_k, the analytic solution of this problem is given by:
\[
\sigma_k^{2(m+1)} = \frac{\sum_{i=1}^{n}\tau_{ik}^{(m)}\Big[\big(y_i - \beta_k^{(m+1)T}x_i\big)^2 - 2\,\delta_k^{(m)} e_{1,ik}^{(m)}\big(y_i - \beta_k^{(m+1)T}x_i\big) + e_{2,ik}^{(m)}\Big]}{2\big(1-\delta_k^{2(m)}\big)\sum_{i=1}^{n}\tau_{ik}^{(m)}}\cdot \tag{6.22}
\]

CM-Step 4 Calculate λ_k^(m+1) by maximizing Q_2(Ψ_k; Ψ^(m)) given by (6.19) with respect to λ_k, with β_k and σ_k² fixed at β_k^(m+1) and σ_k^{2(m+1)}, respectively. This consists in solving the following equation in λ_k (recall that δ_k = λ_k/√(1 + λ_k²)) to obtain λ_k^(m+1) (k = 1, ..., K) as the solution of:
\[
\delta_k(1-\delta_k^2)\,\sigma_k^{2(m+1)}\sum_{i=1}^{n}\tau_{ik}^{(m)} + (1+\delta_k^2)\sum_{i=1}^{n}\tau_{ik}^{(m)}\big(y_i - \beta_k^{(m+1)T}x_i\big)\, e_{1,ik}^{(m)} - \delta_k\sum_{i=1}^{n}\tau_{ik}^{(m)}\Big[e_{2,ik}^{(m)} + \big(y_i - \beta_k^{(m+1)T}x_i\big)^2\Big] = 0. \tag{6.23}
\]
This scalar equation can be solved with a root-finding algorithm, such as Brent's method (Brent, 1973). Then, given the update of the skewness parameter λ_k^(m+1), the update of δ_k is computed as δ_k^(m+1) = λ_k^(m+1)/√(1 + λ_k^{2(m+1)}).
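For illustration, the scalar equation (6.23) can be bracketed and solved numerically in a few lines; the sketch below uses SciPy's Brent solver and assumes the E-step statistics (residuals, e1, e2, tau) have already been computed. It is only a schematic example, and the bracketing interval is an assumption.

```python
import numpy as np
from scipy.optimize import brentq

def update_lambda(resid, e1, e2, tau, sigma2):
    """Solve Eq. (6.23) in lambda_k via Brent's method.

    resid: y_i - beta_k' x_i, e1/e2: posterior moments of U_i,
    tau: posterior memberships, sigma2: the current sigma_k^2."""
    def score(lam):
        delta = lam / np.sqrt(1.0 + lam ** 2)
        return (delta * (1.0 - delta ** 2) * sigma2 * tau.sum()
                + (1.0 + delta ** 2) * np.sum(tau * resid * e1)
                - delta * np.sum(tau * (e2 + resid ** 2)))
    # Bracket chosen by assumption; widen it if the sign does not change.
    return brentq(score, -50.0, 50.0)
```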


It is obvious that when the skewness parameter λ_k = δ_k = 0 for all k, the parameter updates for the SNMoE correspond to those of the standard NMoE. Hence, compared to the standard NMoE, the SNMoE model is characterized by an additional flexibility feature, namely the ability to handle possibly skewed data. However, while the SNMoE model is tailored to model skewness in the data, it may not be adapted to handle data containing groups, or a group, with a heavy-tailed distribution. The NMoE and the SNMoE may thus be affected by outliers. In the next section, I address the problem of sensitivity of normal mixtures of experts to outliers and heavy tails. I first propose a robust mixture of experts modeling by using the t distribution.

6.3 The t mixture of experts model

The proposed t mixture of experts (TMoE) model is based on the t distribution, which is known to be a robust generalization of the normal distribution. The use of the t distribution for mixture components has indeed been shown to be more robust than the normal distribution for handling outliers in the data and accommodating data with heavy-tailed distributions. This has been shown in terms of density modeling and cluster analysis for multivariate data (Mclachlan and Peel, 1998; Peel and Mclachlan, 2000), as well as for univariate data (Lin et al., 2007a) and regression mixtures (Bai et al., 2012; Wei, 2012; Ingrassia et al., 2012). The t distribution with location parameter μ ∈ R, scale parameter σ² ∈ (0, ∞) and degrees of freedom ν ∈ (0, ∞) has the probability density function
\[
f(y; \mu, \sigma^2, \nu) = \frac{\Gamma\big(\frac{\nu+1}{2}\big)}{\sqrt{\nu\pi}\,\sigma\,\Gamma\big(\frac{\nu}{2}\big)}\Big(1 + \frac{d_y^2}{\nu}\Big)^{-\frac{\nu+1}{2}}, \tag{6.24}
\]
where d_y² = ((y − μ)/σ)² denotes the squared Mahalanobis distance between y and μ (σ being the scale parameter), and Γ is the Gamma function given by Γ(u) = ∫_0^∞ x^{u−1} e^{−x} dx.
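As a quick numerical check (an illustration, not part of the original text), the location-scale t density (6.24) can be evaluated with SciPy by rescaling the standard t pdf:

```python
import numpy as np
from scipy.stats import t

def t_pdf(y, mu, sigma2, nu):
    """Location-scale t density (6.24): (1/sigma) * t_nu((y - mu)/sigma)."""
    sigma = np.sqrt(sigma2)
    return t.pdf((y - mu) / sigma, df=nu) / sigma
    # Equivalently: t.pdf(y, df=nu, loc=mu, scale=sigma)
```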

6.3.1 The model

The proposed t mixture of experts model extends the t mixture model, first proposed by Mclachlan and Peel (1998); Peel and Mclachlan (2000) for multivariate data, as well as the regression mixture models using the t distribution as in Bai et al. (2012), Wei (2012), and Ingrassia et al. (2012), to the MoE framework. Wei (2012); Bai et al. (2012); Ingrassia et al. (2012) considered the t mixture model for the regression context on univariate data, where the component means are (linear) regression functions of the form μ(x; β_k). However, these models do not explicitly model the mixing proportions as functions of the predictors; they are assumed to be constant. The proposed TMoE is a MoE model with t-distributed experts and is defined as follows. Let t_ν(μ, σ²) denote a t distribution with location parameter μ, scale parameter σ² and degrees of freedom ν, whose density is given by (6.24). A K-component TMoE model is then defined by:
\[
f(y \mid r, x; \Psi) = \sum_{k=1}^{K}\pi_k(r; \alpha)\, t_{\nu_k}\big(y;\, \mu(x; \beta_k),\, \sigma_k^2,\, \nu_k\big). \tag{6.25}
\]
The parameter vector of the TMoE model is given by Ψ = (α_1^T, ..., α_{K-1}^T, Ψ_1^T, ..., Ψ_K^T)^T, where Ψ_k = (β_k^T, σ_k², ν_k)^T is the parameter vector of the kth t expert component. One can see that when the robustness parameter ν_k → ∞ for each expert k, the TMoE model (6.25) approaches the NMoE model (6.3). The stochastic representation of the t mixture of experts (TMoE) is as follows. Let E be a univariate random variable following the standard normal distribution, E ∼ φ(·). Suppose that, conditionally on the hidden variable Z_i = z_i, a random variable W_i is distributed as Gamma(ν_{z_i}/2, ν_{z_i}/2). Then, given the covariates (x_i, r_i), a random variable Y_i is said to follow the TMoE model (6.25) if it has the following representation:
\[
Y_i = \mu(x_i; \beta_{z_i}) + \sigma_{z_i}\,\frac{E_i}{\sqrt{W_{z_i}}}, \tag{6.26}
\]
where the categorical variable Z_i, conditional on the covariate r_i, follows the multinomial distribution (6.13).


Similarly to the case of the previously presented SNMoE model, the stochastic representation (6.26) leads to the following hierarchical representation of the TMoE, which facilitates the model inference, as will be shown in Section 6.3.2. The hierarchical representation of the TMoE model is written as:
\[
\begin{aligned}
Y_i \mid w_i, Z_{ik}=1, x_i &\sim \mathcal{N}\Big(\mu(x_i; \beta_k),\ \frac{\sigma_k^2}{w_i}\Big),\\
W_i \mid Z_{ik}=1 &\sim \mathrm{Gamma}\Big(\frac{\nu_k}{2}, \frac{\nu_k}{2}\Big),\\
Z_i \mid r_i &\sim \mathrm{Mult}\big(1;\, \pi_1(r_i; \alpha), \ldots, \pi_K(r_i; \alpha)\big).
\end{aligned} \tag{6.27}
\]

This hierarchical representation involving the hidden variables Zi and Wi facilitates the ML inference of model parameters Ψ via the EM or the ECM algorithm.

6.3.2 Maximum likelihood estimation

Given an i.i.d sample of n observations, the unknown parameter vector Ψ can be estimated by maximizing the observed-data log-likelihood, which, under the TMoE model, is given by:
\[
\log L(\Psi) = \sum_{i=1}^{n}\log\sum_{k=1}^{K}\pi_k(r_i; \alpha)\, t_{\nu_k}\big(y_i;\, \mu(x_i; \beta_k),\, \sigma_k^2,\, \nu_k\big). \tag{6.28}
\]

To perform this maximization, I first use the EM algorithm and then describe an ECM extension (Meng and Rubin, 1993) as in Liu and Rubin (1995) for a single t distribution and as in Mclachlan and Peel (1998) and Peel and Mclachlan (2000) for the mixture of t-distributions.

6.3.3 MLE via the EM algorithm

To maximize the log-likelihood function (6.28), the EM algorithm for the TMoE model starts with an initial parameter vector Ψ^(0) and alternates between the E- and M-steps until convergence. The E-step computes the expected complete-data log-likelihood (the Q-function) and the M-step maximizes it. From the hierarchical representation of the TMoE (6.27), the complete data consist of the responses (y_1, ..., y_n) and their corresponding predictors (x_1, ..., x_n) and (r_1, ..., r_n), as well as the latent variables (w_1, ..., w_n) in (6.27) and the latent labels (z_1, ..., z_n). Thus, the complete-data log-likelihood of Ψ is given by:
\[
\log L_c(\Psi) = \log L_{1c}(\alpha) + \sum_{k=1}^{K}\big[\log L_{2c}(\Psi_k) + \log L_{3c}(\nu_k)\big], \tag{6.29}
\]
where
\[
\log L_{1c}(\alpha) = \sum_{i=1}^{n}\sum_{k=1}^{K}Z_{ik}\log\pi_k(r_i; \alpha), \tag{6.30}
\]
\[
\log L_{2c}(\Psi_k) = \sum_{i=1}^{n}Z_{ik}\Big[-\frac{1}{2}\log(2\pi) - \frac{1}{2}\log(\sigma_k^2) - \frac{1}{2}w_i\, d_{ik}^2\Big], \tag{6.31}
\]
\[
\log L_{3c}(\nu_k) = \sum_{i=1}^{n}Z_{ik}\Big[-\log\Gamma\Big(\frac{\nu_k}{2}\Big) + \frac{\nu_k}{2}\log\Big(\frac{\nu_k}{2}\Big) + \Big(\frac{\nu_k}{2}-1\Big)\log(w_i) - \frac{\nu_k}{2}w_i\Big]. \tag{6.32}
\]

E-Step The E-step of the EM algorithm for the TMoE calculates the Q-function, that is, the conditional expectation of the complete-data log-likelihood (6.29), given the observed data and a current parameter estimate Ψ^(m). It can be seen from (6.30), (6.31) and (6.32) that computing the Q-function, given by:
\[
Q(\Psi; \Psi^{(m)}) = Q_1(\alpha; \Psi^{(m)}) + \sum_{k=1}^{K}\big[Q_2(\theta_k; \Psi^{(m)}) + Q_3(\nu_k; \Psi^{(m)})\big], \tag{6.33}
\]


where θ_k = (β_k^T, σ_k²)^T and
\[
Q_1(\alpha; \Psi^{(m)}) = \sum_{i=1}^{n}\sum_{k=1}^{K}\tau_{ik}^{(m)}\log\pi_k(r_i; \alpha),
\]
\[
Q_2(\theta_k; \Psi^{(m)}) = \sum_{i=1}^{n}\tau_{ik}^{(m)}\Big[-\frac{1}{2}\log(2\pi) - \frac{1}{2}\log(\sigma_k^2) - \frac{1}{2}w_{ik}^{(m)}\, d_{ik}^2\Big],
\]
\[
Q_3(\nu_k; \Psi^{(m)}) = \sum_{i=1}^{n}\tau_{ik}^{(m)}\Big[-\log\Gamma\Big(\frac{\nu_k}{2}\Big) + \frac{\nu_k}{2}\log\Big(\frac{\nu_k}{2}\Big) - \frac{\nu_k}{2}w_{ik}^{(m)} + \Big(\frac{\nu_k}{2}-1\Big)e_{1,ik}^{(m)}\Big],
\]
requires the following conditional expectations:
\[
\begin{aligned}
\tau_{ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[Z_{ik} \mid y_i, x_i, r_i\big],\\
w_{ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[W_i \mid y_i, Z_{ik}=1, x_i, r_i\big],\\
e_{1,ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[\log(W_i) \mid y_i, Z_{ik}=1, x_i, r_i\big].
\end{aligned}
\]
These required conditional expectations are calculated analytically as given in [J-12][J-13].

M-Step In the M-step, as can be seen from (6.33), the Q-function can be maximized by independently maximizing Q_1(α; Ψ^(m)) and, for each k, Q_2(θ_k; Ψ^(m)) and Q_3(ν_k; Ψ^(m)), with respect to α, θ_k and ν_k, respectively. Thus, on the (m+1)th iteration of the M-step, the model parameters are updated as follows.

M-Step 1 Calculate α^(m+1) by maximizing Q_1(α; Ψ^(m)) with respect to α. This can be performed iteratively via IRLS (6.20), as for the SNMoE.

M-Step 2 Calculate θ_k^(m+1) by maximizing Q_2(θ_k; Ψ^(m)) with respect to θ_k = (β_k^T, σ_k²)^T. This is achieved by first maximizing Q_2 with respect to β_k and then with respect to σ_k². For the t mixture of linear experts (TMoLE) case, where the expert means are of the form (6.7), this maximization can be performed analytically and provides the following updates:
\[
\beta_k^{(m+1)} = \Big[\sum_{i=1}^{n}\tau_{ik}^{(m)}w_{ik}^{(m)}\, x_i x_i^T\Big]^{-1}\sum_{i=1}^{n}\tau_{ik}^{(m)}w_{ik}^{(m)}\, y_i x_i, \tag{6.34}
\]
\[
\sigma_k^{2(m+1)} = \frac{1}{\sum_{i=1}^{n}\tau_{ik}^{(m)}}\sum_{i=1}^{n}\tau_{ik}^{(m)}w_{ik}^{(m)}\big(y_i - \beta_k^{(m+1)T}x_i\big)^2. \tag{6.35}
\]
Here, I note that, following Kent et al. (1994) in the case of ML estimation for a single-component t distribution and Mclachlan and Peel (1998); Peel and Mclachlan (2000) for mixtures of multivariate t distributions, the EM algorithm can be modified slightly by replacing the divisor Σ_{i=1}^{n} τ_{ik}^{(m)} in (6.35) by Σ_{i=1}^{n} τ_{ik}^{(m)} w_{ik}^{(m)}. The modified algorithm may converge faster than the conventional EM algorithm. This was also observed in practice for the proposed TMoE.
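For illustration only (under assumed array shapes), the doubly weighted updates (6.34)-(6.35) differ from the NMoLE updates only through the extra precision weights w_ik:

```python
import numpy as np

def tmole_m_step(X, y, tau, w):
    """Updates (6.34)-(6.35) for one expert of the TMoLE.

    tau: posterior memberships tau_ik; w: posterior precision weights w_ik."""
    tw = tau * w
    Xw = tw[:, None] * X
    beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)          # Eq. (6.34)
    resid = y - X @ beta
    sigma2 = np.sum(tw * resid ** 2) / np.sum(tau)      # Eq. (6.35)
    # Kent et al. (1994)-style variant: divide by sum(tw) instead of sum(tau).
    return beta, sigma2
```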

M-Step 3 Calculate ν_k^(m+1) by maximizing Q_3(ν_k; Ψ^(m)) with respect to ν_k. The degrees of freedom update ν_k^(m+1) is therefore obtained by iteratively solving the following equation in ν_k:
\[
-\psi\Big(\frac{\nu_k}{2}\Big) + \log\Big(\frac{\nu_k}{2}\Big) + 1 + \frac{1}{\sum_{i=1}^{n}\tau_{ik}^{(m)}}\sum_{i=1}^{n}\tau_{ik}^{(m)}\Big(\log\big(w_{ik}^{(m)}\big) - w_{ik}^{(m)}\Big) + \psi\Big(\frac{\nu_k^{(m)}+1}{2}\Big) - \log\Big(\frac{\nu_k^{(m)}+1}{2}\Big) = 0. \tag{6.36}
\]
This scalar non-linear equation can be solved with a root-finding algorithm, such as Brent's method (Brent, 1973). It can be seen that, as mentioned previously, if the number of degrees of freedom ν_k approaches ∞ for all k, then the parameter updates for the TMoE model are exactly those of the NMoE model (since w_ik tends to 1 in this case). The TMoE model therefore constitutes a robust generalization of the NMoE model that is able to model data with densities having longer tails than those of the NMoE model. Having derived the EM algorithm for the TMoE parameter estimation, I now describe an ECM extension.
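The scalar equation (6.36) can also be solved numerically; the following sketch (with an assumed search interval for ν) uses SciPy's digamma function and Brent's method:

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

def update_nu(tau, w, nu_old, lo=0.01, hi=200.0):
    """Solve the degrees-of-freedom equation (6.36) for one expert."""
    c = (np.sum(tau * (np.log(w) - w)) / np.sum(tau)
         + 1.0 + digamma((nu_old + 1.0) / 2.0) - np.log((nu_old + 1.0) / 2.0))

    def score(nu):
        return -digamma(nu / 2.0) + np.log(nu / 2.0) + c

    # The bracket [lo, hi] is an assumption; in practice nu is kept in a
    # bounded range and the bracket is enlarged if no sign change is found.
    return brentq(score, lo, hi)
```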


6.3.4 MLE via the ECM algorithm

Following the ECM extension of the EM algorithm for a single t distribution proposed by Liu and Rubin (1995), and that of the EM algorithm for the t-mixture model (Mclachlan and Peel, 1998; Peel and Mclachlan, 2000), the EM algorithm for the TMoE model can also be modified to give an ECM version by adding an additional E-step between the two M-steps 2 and 3. This additional E-step consists in taking the parameter vector Ψ^(m) with θ_k fixed at its update θ_k^(m+1) instead of θ_k^(m), that is,
\[
Q_3(\nu_k; \Psi^{(m)}) = Q_3\big(\nu_k;\, \alpha^{(m)},\, \theta_k^{(m+1)},\, \nu_k^{(m)}\big).
\]
Thus, M-Step 3 above is replaced by a conditional maximization (CM) step in which the degrees of freedom update (6.36) is calculated with the conditional expectations w_{ik}^{(m)} and e_{1,ik}^{(m)} computed with the updated parameters β_k^(m+1) and σ_k^{2(m+1)}, respectively given by (6.34) and (6.35).

The SNMoE presented above makes it possible to deal with asymmetric data. The TMoE handles the problem of heavy-tailed data possibly affected by outliers. Now, I propose the skew t mixture of experts (STMoE) model, which attempts to simultaneously accommodate heavy-tailed data with possible outliers and an asymmetric distribution.

6.4 The skew t mixture of experts model

The proposed skew t mixture of experts (STMoE) model is a MoE model in which the expert components have a skew-t density, rather than the standard normal one as in the NMoE model, or the previously presented skew-normal and t ones as in the SNMoE and the TMoE, respectively.

The skew t distribution The skew t distribution, introduced by Azzalini and Capitanio (2003), can be characterized as follows. Let U be a univariate random variable with a standard skew-normal distribution U ∼ SN(0, 1, λ) (which can be shortened as U ∼ SN(λ)), with pdf given by (6.10). Then, let W be a univariate random variable independent of U and following the Gamma distribution W ∼ Gamma(ν/2, ν/2). A random variable Y having the representation
\[
Y = \mu + \sigma\,\frac{U}{\sqrt{W}} \tag{6.37}
\]
follows the skew t distribution ST(μ, σ², λ, ν) with location parameter μ, scale parameter σ, skewness parameter λ and degrees of freedom ν, whose density is defined by:
\[
f(y; \mu, \sigma^2, \lambda, \nu) = \frac{2}{\sigma}\, t_{\nu}(d_y)\, T_{\nu+1}\Big(\lambda\, d_y\sqrt{\frac{\nu+1}{\nu+d_y^2}}\Big) \tag{6.38}
\]
where d_y = (y − μ)/σ, and t_ν(·) and T_ν(·) denote, respectively, the pdf and the cdf of the standard t distribution with ν degrees of freedom.
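A minimal numerical sketch of the skew-t density (6.38), using SciPy's standard t pdf and cdf (for illustration only):

```python
import numpy as np
from scipy.stats import t

def skew_t_pdf(y, mu, sigma, lam, nu):
    """Skew-t density (6.38) with location mu, scale sigma, skewness lam, df nu."""
    d = (y - mu) / sigma
    tail = np.sqrt((nu + 1.0) / (nu + d ** 2))
    return (2.0 / sigma) * t.pdf(d, df=nu) * t.cdf(lam * d * tail, df=nu + 1.0)
```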

6.4.1 The model

The proposed skew t mixture of experts (STMoE) model extends the univariate skew t mixture model, first introduced by Lin et al. (2007a), to the MoE framework. In the skew-t mixture model, the mixing proportions and the expert means are constant, that is, they are not functions of predictors. In the proposed STMoE, I consider skew-t expert components with regression mean functions and covariate-varying mixing proportions. A K-component mixture of skew t experts (STMoE) is therefore defined by:
\[
f(y \mid r, x; \Psi) = \sum_{k=1}^{K}\pi_k(r; \alpha)\, \mathrm{ST}\big(y;\, \mu(x; \beta_k),\, \sigma_k^2,\, \lambda_k,\, \nu_k\big). \tag{6.39}
\]


The parameter vector of the STMoE model is Ψ = (α_1^T, ..., α_{K-1}^T, Ψ_1^T, ..., Ψ_K^T)^T, where Ψ_k = (β_k^T, σ_k², λ_k, ν_k)^T is the parameter vector of the kth skew t expert component, whose density is defined by
\[
f\big(y \mid x;\, \mu(x; \beta_k),\, \sigma^2,\, \lambda,\, \nu\big) = \frac{2}{\sigma}\, t_{\nu}\big(d_y(x)\big)\, T_{\nu+1}\Big(\lambda\, d_y(x)\sqrt{\frac{\nu+1}{\nu+d_y^2(x)}}\Big) \tag{6.40}
\]
where d_y(x) = (y − μ(x; β_k))/σ. It can be seen that, when the robustness parameter ν_k → ∞ for each k, the STMoE model (6.39) reduces to the SNMoE model (6.11). On the other hand, if the skewness parameter λ_k = 0 for each k, the STMoE model reduces to the TMoE model (6.25). Moreover, when ν_k → ∞ and λ_k = 0 for each k, it approaches the standard NMoE model (6.3). This therefore makes the STMoE flexible, as it generalizes the previously described models to accommodate situations with asymmetry, heavy tails, and outliers. The STMoE model is characterized as follows. Suppose that, conditionally on a categorical variable z_i ∈ {1, ..., K} representing the hidden label of the component generating the ith observation and following the multinomial distribution (6.13), a random variable Y_i has the following representation:
\[
Y_i = \mu(x_i; \beta_{z_i}) + \sigma_{z_i}\,\frac{E_i}{\sqrt{W_i}} \tag{6.41}
\]
where E_i and W_i are independent univariate random variables with, respectively, a standard skew-normal distribution E_i ∼ SN(λ_{z_i}) and a Gamma distribution W_i ∼ Gamma(ν_{z_i}/2, ν_{z_i}/2), and x_i and r_i are given covariate variables. Then, the variable Y_i is said to follow the STMoE defined by (6.39). From the hierarchical representation of the skew t distribution, a hierarchical model for the proposed STMoE model (6.39) can be derived from its stochastic representation (6.41) as follows:
\[
\begin{aligned}
Y_i \mid u_i, w_i, Z_{ik}=1, x_i &\sim \mathcal{N}\Big(\mu(x_i; \beta_k) + \delta_k|u_i|,\ \frac{1-\delta_k^2}{w_i}\,\sigma_k^2\Big),\\
U_i \mid w_i, Z_{ik}=1 &\sim \mathcal{N}\Big(0,\ \frac{\sigma_k^2}{w_i}\Big),\\
W_i \mid Z_{ik}=1 &\sim \mathrm{Gamma}\Big(\frac{\nu_k}{2}, \frac{\nu_k}{2}\Big),\\
Z_i \mid r_i &\sim \mathrm{Mult}\big(1;\, \pi_1(r_i; \alpha), \ldots, \pi_K(r_i; \alpha)\big).
\end{aligned} \tag{6.42}
\]
The variables U_i and W_i are hidden in this hierarchical representation, which facilitates the inference scheme and will be used to derive the maximum likelihood estimation of the STMoE model parameters Ψ by using the ECM algorithm.

The variables Ui and Wi are hidden in this hierarchical representation, which facilitates the inference scheme and will be used to derive the maximum likelihood estimation of the STMoE model parameters Ψ by using the ECM algorithm.

6.4.2 Identifiability of the STMoE model

Jiang and Tanner (1999) have established that ordered, initialized, and irreducible MoEs are identifiable. Ordered implies that there exists a certain ordering relationship on the expert parameters Ψ_k such that (α_1^T, Ψ_1^T)^T ≺ ... ≺ (α_K^T, Ψ_K^T)^T; initialized implies that α_K, the parameter vector of the Kth logistic proportion, is the null vector; and irreducible implies that Ψ_k ≠ Ψ_{k'} for any k ≠ k'. For the proposed STMoE, which generalizes the previously seen MoE models, ordered implies that there exists a certain ordering relationship such that (β_1^T, σ_1², λ_1, ν_1)^T ≺ ... ≺ (β_K^T, σ_K², λ_K, ν_K)^T; initialized implies that α_K is the null vector, as assumed in the model; and, finally, irreducible implies that if k ≠ k', then one of the following conditions holds: β_k ≠ β_{k'}, σ_k ≠ σ_{k'}, λ_k ≠ λ_{k'} or ν_k ≠ ν_{k'}. Then, we can establish the identifiability of ordered and initialized irreducible STMoE models by applying Lemma 2 of Jiang and Tanner (1999), which requires the validation of the following non-degeneracy condition: the set {ST(y; μ(x; β_1), σ_1², λ_1, ν_1), ..., ST(y; μ(x; β_{4K}), σ_{4K}², λ_{4K}, ν_{4K})} contains 4K linearly independent functions of y, for any 4K distinct quadruplets (μ(x; β_k), σ_k², λ_k, ν_k), k = 1, ..., 4K. Thus, via Lemma 2 of Jiang and Tanner (1999), any ordered and initialized irreducible STMoE is identifiable.

6.4.3 Maximum likelihood estimation via the ECM algorithm

The unknown parameter vector Ψ of the STMoE model is estimated by maximizing the following observed-data log-likelihood given an observed i.i.d sample of n observations, that is, the responses


(y1, . . . , yn) and the corresponding predictors (x1,..., xn) and (r1,..., rn):

\[
\log L(\Psi) = \sum_{i=1}^{n}\log\sum_{k=1}^{K}\pi_k(r_i; \alpha)\,\mathrm{ST}\big(y_i;\, \mu(x_i; \beta_k),\, \sigma_k^2,\, \lambda_k,\, \nu_k\big). \tag{6.43}
\]
This is performed iteratively by a dedicated ECM algorithm. The complete data consist of the observations, as well as the latent variables (u_1, ..., u_n) and (w_1, ..., w_n) and the latent component labels (z_1, ..., z_n). Then, from the hierarchical representation of the STMoE (6.42), the complete-data log-likelihood of Ψ is given by:
\[
\log L_c(\Psi) = \log L_{1c}(\alpha) + \sum_{k=1}^{K}\big[\log L_{2c}(\theta_k) + \log L_{3c}(\nu_k)\big] \tag{6.44}
\]
where θ_k = (β_k^T, σ_k², λ_k)^T,
\[
\log L_{1c}(\alpha) = \sum_{i=1}^{n}\sum_{k=1}^{K}Z_{ik}\log\pi_k(r_i; \alpha),
\]
\[
\log L_{2c}(\theta_k) = \sum_{i=1}^{n}Z_{ik}\Big[-\log(2\pi) - \log(\sigma_k^2) - \frac{1}{2}\log(1-\delta_k^2) - \frac{w_i\, d_{ik}^2}{2(1-\delta_k^2)} + \frac{\delta_k\, d_{ik}\, w_i u_i}{(1-\delta_k^2)\,\sigma_k} - \frac{w_i u_i^2}{2(1-\delta_k^2)\,\sigma_k^2}\Big],
\]
\[
\log L_{3c}(\nu_k) = \sum_{i=1}^{n}Z_{ik}\Big[-\log\Gamma\Big(\frac{\nu_k}{2}\Big) + \frac{\nu_k}{2}\log\Big(\frac{\nu_k}{2}\Big) + \frac{\nu_k}{2}\log(w_i) - \frac{\nu_k}{2}w_i\Big].
\]
The ECM algorithm for the STMoE model starts with an initial parameter vector Ψ^(0) and alternates between the E- and CM-steps until convergence.

E-Step The E-Step of the CEM algorithm for the STMoE calculates the Q-function, that is the condi- n tional expectation of the complete-data log-likelihood (6.44), given the observed data {yi, xi, ri}i=1 and a current parameter estimation Ψ (m) given by:

\[
Q(\Psi; \Psi^{(m)}) = Q_1(\boldsymbol{\alpha}; \Psi^{(m)}) + \sum_{k=1}^{K}\big[Q_2(\boldsymbol{\theta}_k; \Psi^{(m)}) + Q_3(\nu_k; \Psi^{(m)})\big], \tag{6.45}
\]
where
\[
\begin{aligned}
Q_1(\boldsymbol{\alpha}; \Psi^{(m)}) &= \sum_{i=1}^{n}\sum_{k=1}^{K} \tau_{ik}^{(m)} \log \pi_k(\mathbf{r}_i; \boldsymbol{\alpha}),\\
Q_2(\boldsymbol{\theta}_k; \Psi^{(m)}) &= \sum_{i=1}^{n} \tau_{ik}^{(m)}\Big[-\log(2\pi) - \log(\sigma_k^2) - \tfrac{1}{2}\log(1-\delta_k^2) - \frac{w_{ik}^{(m)} d_{ik}^2}{2(1-\delta_k^2)} + \frac{\delta_k d_{ik}\, e_{1,ik}^{(m)}}{(1-\delta_k^2)\sigma_k} - \frac{e_{2,ik}^{(m)}}{2(1-\delta_k^2)\sigma_k^2}\Big],\\
Q_3(\nu_k; \Psi^{(m)}) &= \sum_{i=1}^{n} \tau_{ik}^{(m)}\Big[-\log\Gamma\Big(\frac{\nu_k}{2}\Big) + \frac{\nu_k}{2}\log\Big(\frac{\nu_k}{2}\Big) - \frac{\nu_k}{2}\, w_{ik}^{(m)} + \frac{\nu_k}{2}\, e_{3,ik}^{(m)}\Big].
\end{aligned}
\]
From (6.44), it can be seen that computing the Q-function only requires the following conditional expectations:
\[
\begin{aligned}
\tau_{ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[Z_{ik}\mid y_i, \mathbf{x}_i, \mathbf{r}_i\big],\\
w_{ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[W_i\mid y_i, Z_{ik}=1, \mathbf{x}_i, \mathbf{r}_i\big],\\
e_{1,ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[W_i U_i\mid y_i, Z_{ik}=1, \mathbf{x}_i, \mathbf{r}_i\big],\\
e_{2,ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[W_i U_i^2\mid y_i, Z_{ik}=1, \mathbf{x}_i, \mathbf{r}_i\big],\\
e_{3,ik}^{(m)} &= \mathbb{E}_{\Psi^{(m)}}\big[\log(W_i)\mid y_i, Z_{ik}=1, \mathbf{x}_i, \mathbf{r}_i\big].
\end{aligned}
\]

(m) These conditional expectations are calculated analytically as shown in [J-12][J-14], except for e3,ik for which I adopted a one-step-late (OSL) approach as described in Lee and McLachlan (2014), rather than using a Monte Carlo approximation as in Lin et al. (2007a). I also mention that, for the multivariate skew t mixture models, recently Lee and McLachlan (2015) presented a series-based truncation approach, which exploits an exact representation of this conditional expectation and which can also be used here.


M-Step The M-step maximizes the Q-function (6.45) with respect to Ψ and provides the parameter vector update Ψ^(m+1). From (6.45), it can be seen that the maximization of Q can be performed by separately maximizing Q_1 with respect to the parameters α of the mixing proportions and, for each expert k (k = 1, ..., K), Q_2 with respect to (β_k^T, σ_k^2)^T and λ_k, and Q_3 with respect to ν_k. The maximization of Q_2 and Q_3 is carried out by conditional maximization (CM) steps, by updating (β_k, σ_k^2) and then updating (λ_k, ν_k) given the updated parameters. This leads to the following CM steps. On the (m+1)th iteration of the M-step, the STMoE model parameters are updated as follows.

CM-Step 1 Calculate the parameter update α^(m+1) maximizing the function Q_1(α; Ψ^(m)) given by (6.34), by using the IRLS algorithm (6.20). Then, for k = 1, ..., K:

CM-Step 2 Calculate (β_k^{T(m+1)}, σ_k^{2(m+1)})^T by maximizing Q_2(θ_k; Ψ^(m)) w.r.t. (β_k^T, σ_k^2)^T. For the skew-t mixture of linear experts (STMoLE) case, where the expert means are linear regressors of the form (6.7), this maximization can be performed in closed form and provides the following updates:

\[
\boldsymbol{\beta}_k^{(m+1)} = \Big[\sum_{i=1}^{n} \tau_{ik}^{(m)} w_{ik}^{(m)} \mathbf{x}_i\mathbf{x}_i^T\Big]^{-1} \sum_{i=1}^{n} \tau_{ik}^{(m)}\big(w_{ik}^{(m)} y_i - \delta_k^{(m)} e_{1,ik}^{(m)}\big)\,\mathbf{x}_i, \tag{6.46}
\]
\[
\sigma_k^{2(m+1)} = \frac{\sum_{i=1}^{n} \tau_{ik}^{(m)}\Big[w_{ik}^{(m)}\big(y_i - \boldsymbol{\beta}_k^{T(m+1)}\mathbf{x}_i\big)^2 - 2\,\delta_k^{(m)} e_{1,ik}^{(m)}\big(y_i - \boldsymbol{\beta}_k^{T(m+1)}\mathbf{x}_i\big) + e_{2,ik}^{(m)}\Big]}{2\big(1 - \delta_k^{2(m)}\big)\sum_{i=1}^{n}\tau_{ik}^{(m)}}. \tag{6.47}
\]
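A minimal sketch of this closed-form update for a single expert k, in the linear-expert (STMoLE) case, could read as follows. The array names (tau_k, w_k, e1_k, e2_k for the E-step quantities τ_ik, w_ik, e_{1,ik}, e_{2,ik}) are hypothetical, and δ_k is taken at its current value, as in (6.46)-(6.47).

```python
import numpy as np

def cm_step2_expert(X, y, tau_k, w_k, e1_k, e2_k, delta_k):
    """Closed-form CM-step 2 updates (6.46)-(6.47) for one expert k.
    X: (n, p) design matrix of the linear experts; y: (n,) responses."""
    # weighted least-squares update of beta_k, Eq. (6.46)
    weights = tau_k * w_k
    A = X.T @ (X * weights[:, None])
    b = X.T @ (tau_k * (w_k * y - delta_k * e1_k))
    beta_k = np.linalg.solve(A, b)
    # scale update, Eq. (6.47)
    resid = y - X @ beta_k
    num = np.sum(tau_k * (w_k * resid**2 - 2.0 * delta_k * e1_k * resid + e2_k))
    sigma2_k = num / (2.0 * (1.0 - delta_k**2) * np.sum(tau_k))
    return beta_k, sigma2_k
```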

CM-Step 3 The skewness parameters λ_k are updated by maximizing Q_2(θ_k; Ψ^(m)) w.r.t. λ_k, with β_k and σ_k^2 fixed at the updates β_k^(m+1) and σ_k^{2(m+1)}, respectively. It can easily be shown that the maximization to obtain λ_k^(m+1) (k = 1, ..., K) consists in solving the following equation in λ_k (recall that δ_k = λ_k / \sqrt{1 + λ_k^2}):

\[
\delta_k(1-\delta_k^2)\sum_{i=1}^{n}\tau_{ik}^{(m)} + (1+\delta_k^2)\sum_{i=1}^{n}\tau_{ik}^{(m)}\,\frac{d_{ik}^{(m+1)}\, e_{1,ik}^{(m)}}{\sigma_k^{(m+1)}} - \delta_k \sum_{i=1}^{n}\tau_{ik}^{(m)}\Big[w_{ik}^{(m)} d_{ik}^{2(m+1)} + \frac{e_{2,ik}^{(m)}}{\sigma_k^{2(m+1)}}\Big] = 0. \tag{6.48}
\]

CM-Step 4 Similarly, the degrees of freedom ν_k are updated by maximizing Q_3(ν_k; Ψ^(m)) w.r.t. ν_k, with β_k and σ_k^2 fixed at β_k^(m+1) and σ_k^{2(m+1)}, respectively. The update ν_k^(m+1) is calculated as the solution of the following equation in ν_k:

\[
-\psi\Big(\frac{\nu_k}{2}\Big) + \log\Big(\frac{\nu_k}{2}\Big) + 1 + \frac{\sum_{i=1}^{n}\tau_{ik}^{(m)}\big(e_{3,ik}^{(m)} - w_{ik}^{(m)}\big)}{\sum_{i=1}^{n}\tau_{ik}^{(m)}} = 0. \tag{6.49}
\]
The two scalar non-linear equations (6.48) and (6.49) can be solved as in the TMoE model, that is, with a root-finding algorithm such as Brent's method (Brent, 1973).
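As a minimal sketch, the update of ν_k from (6.49) can be obtained with Brent's method as implemented in scipy. The bracketing interval is a practical assumption (the bracket must contain a sign change), and the array names are illustrative.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

def update_nu_k(tau_k, w_k, e3_k, lower=0.01, upper=200.0):
    """Solve equation (6.49) in nu_k with Brent's root-finding method."""
    c = np.sum(tau_k * (e3_k - w_k)) / np.sum(tau_k)          # data-dependent constant term
    f = lambda nu: -digamma(nu / 2.0) + np.log(nu / 2.0) + 1.0 + c
    return brentq(f, lower, upper)
```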

As mentioned before, one can see that, when the robustness parameter νk → ∞ for all the components, the parameter updates for the STMoE model correspond to those of the SNMoE model. On the other hand, when the skewness parameters λk = 0, the STMoE parameter updates correspond to those of the TMoE model. Finally, when both the degrees of freedom νk → ∞ and the skewness parameters λk = 0, we obtain the parameter updates of the standard NMoE model. The STMoE therefore provides a more general framework for inferring flexible MoE models.

6.5 Prediction, clustering and model selection with the non-normal MoE

Prediction The goal in regression is to be able to make predictions for the response variable(s) given some new value of the predictor variable(s) on the basis of a model trained on a set of training data. In regression analysis using mixture of experts, the aim is therefore to predict the response y given new values of the predictors (x, r), on the basis of a MoE model characterized by a parameter vector Ψˆ

inferred from a set of training data, here by maximum likelihood via the E(C)M algorithm. These predictions can be expressed in terms of the predictive distribution of y, which is obtained by substituting the maximum likelihood estimate Ψ̂ into (6.2) to give:

\[
f(y\mid \mathbf{x}, \mathbf{r}; \hat{\Psi}) = \sum_{k=1}^{K} \pi_k(\mathbf{r}; \hat{\boldsymbol{\alpha}})\, f_k(y\mid \mathbf{x}; \hat{\Psi}_k).
\]
Using f, we might then predict y for a given set of x's and r's as the expected value under f, that is, by calculating the prediction ŷ = E_{Ψ̂}(Y | r, x). I thus need to compute the expectation of the mixture of experts model. It is easy to show (see for example Section 1.2.4 in Frühwirth-Schnatter (2006)) that the mean and the variance of a mixture of experts distribution are respectively given by

\[
\mathbb{E}_{\hat{\Psi}}(Y\mid \mathbf{r}, \mathbf{x}) = \sum_{k=1}^{K} \pi_k(\mathbf{r}; \hat{\boldsymbol{\alpha}})\, \mathbb{E}_{\hat{\Psi}}(Y\mid Z=k, \mathbf{x}), \tag{6.50}
\]
\[
\mathbb{V}_{\hat{\Psi}}(Y\mid \mathbf{r}, \mathbf{x}) = \sum_{k=1}^{K} \pi_k(\mathbf{r}; \hat{\boldsymbol{\alpha}})\Big[\big(\mathbb{E}_{\hat{\Psi}}(Y\mid Z=k, \mathbf{x})\big)^2 + \mathbb{V}_{\hat{\Psi}}(Y\mid Z=k, \mathbf{x})\Big] - \big(\mathbb{E}_{\hat{\Psi}}(Y\mid \mathbf{r}, \mathbf{x})\big)^2, \tag{6.51}
\]
where E_{Ψ̂}(Y | Z = k, x) and V_{Ψ̂}(Y | Z = k, x) are respectively the component-specific (expert) means and variances. The calculations of the mean and the variance, for each of the developed MoE models, are derived respectively in [J-12] and [J-13][J-14].
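The following sketch evaluates (6.50) and (6.51), given, for each new input, the gating probabilities and the expert means and variances (however these are computed for the particular NNMoE at hand). It is an illustration with hypothetical array names.

```python
import numpy as np

def moe_predict(pi, mu, var):
    """Pointwise predictive mean and variance of a MoE, Eqs. (6.50)-(6.51).
    pi:  (n, K) gating probabilities pi_k(r_i; alpha_hat)
    mu:  (n, K) expert means      E(Y | Z = k, x_i)
    var: (n, K) expert variances  V(Y | Z = k, x_i)"""
    mean = np.sum(pi * mu, axis=1)                        # (6.50)
    second_moment = np.sum(pi * (mu**2 + var), axis=1)
    variance = second_moment - mean**2                    # (6.51)
    return mean, variance
```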

Model-based clustering The MoE models can also be used from a model-based clustering perspective to provide a partition of the regression data into K clusters. Model-based clustering using the NNMoE consists in assuming that the observed data {x_i, r_i, y_i}_{i=1}^{n} are generated from a K-component mixture of, respectively, skew-normal, t or skew t experts, with parameter vector Ψ. The mixture components can be interpreted as clusters and hence each cluster can be associated with a mixture component. The problem of clustering therefore becomes the one of estimating the MoE parameters Ψ, which is performed here by using dedicated EM algorithms. Once the parameter estimate Ψ̂ is obtained, the posterior component memberships
\[
\tau_{ik}(\hat{\Psi}) = \frac{\pi_k(\mathbf{r}_i; \hat{\boldsymbol{\alpha}})\, f_k\big(y_i\mid \mathbf{r}_i, \mathbf{x}_i; \hat{\Psi}_k\big)}{\sum_{k'=1}^{K} \pi_{k'}(\mathbf{r}_i; \hat{\boldsymbol{\alpha}})\, f_{k'}\big(y_i\mid \mathbf{r}_i, \mathbf{x}_i; \hat{\Psi}_{k'}\big)} \tag{6.52}
\]
represent a fuzzy partition of the data. A hard partition can then be obtained from the posterior memberships by applying Bayes' optimal allocation rule, that is, by maximizing the posterior component memberships to assign each observation to a cluster: ẑ_i = arg max_{k=1,...,K} τ̂_ik(Ψ̂), where ẑ_i represents the estimated cluster label for the ith observation.
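A minimal sketch of this clustering step, given the fitted gating probabilities and the expert densities evaluated at each observation (the names are illustrative):

```python
import numpy as np

def map_partition(pi, dens):
    """Posterior memberships (6.52) and the Bayes-rule hard partition.
    pi:   (n, K) gating probabilities pi_k(r_i; alpha_hat)
    dens: (n, K) expert densities f_k(y_i | r_i, x_i; Psi_hat_k)"""
    num = pi * dens
    tau = num / num.sum(axis=1, keepdims=True)   # fuzzy partition, Eq. (6.52)
    z_hat = tau.argmax(axis=1)                   # hard cluster labels z_hat_i
    return tau, z_hat
```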

Model selection One of the issues in mixture model-based clustering is model selection. The problem of model selection for the NNMoE models presented here in their general forms is equivalent to the one of choosing the optimal number of experts K, the degree p of the polynomial regression and the degree q for the logistic regression. The optimal value of (K, p, q) can be computed by using model selection criteria such as AIC, BIC and ICL, which are used here. AIC and BIC are penalized observed-data log-likelihood criteria, which can be defined as functions to be maximized and are respectively given by:
\[
\mathrm{AIC}(K, p, q) = \log L(\hat{\Psi}) - \eta_{\Psi}, \qquad \mathrm{BIC}(K, p, q) = \log L(\hat{\Psi}) - \frac{\eta_{\Psi}\log(n)}{2}.
\]
The ICL criterion consists in a penalized complete-data log-likelihood and can be expressed as:
\[
\mathrm{ICL}(K, p, q) = \log L_c(\hat{\Psi}) - \frac{\eta_{\Psi}\log(n)}{2}.
\]
In the above, log L(Ψ̂) and log L_c(Ψ̂) are respectively the incomplete (observed) data log-likelihood and the complete-data log-likelihood, obtained at convergence of the E(C)M algorithm for the corresponding mixture of experts model, and η_Ψ is the number of free model parameters. The number of free parameters is given by η_Ψ = K(p + q + 3) − q − 1 for the NMoE model, η_Ψ = K(p + q + 4) − q − 1 for both the SNMoE and the TMoE models, and η_Ψ = K(p + q + 5) − q − 1 for the STMoE model.
However, note that in MoE it is common to model the mixing proportions as logistic transformations of linear functions of the covariates, that is, the covariate vector in (6.1) is given by r_i = (1, r_i)^T (corresponding to q = 2), r_i being a univariate covariate variable. This is also adopted in this work. Moreover, for the case of linear experts, that is, when the experts are linear regressors with parameter


vector β_k, for which the corresponding covariate vector x_i in (6.7) is given by x_i = (1, x_i)^T (corresponding to p = 2), x_i being a univariate covariate variable, the model selection reduces to choosing the number of experts K. Here I mainly consider this linear case, in order to allow comparisons with approaches developed for linear experts, but the equations are also given for the non-linear (polynomial) case and the corresponding code is implemented.
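As a small sketch of this model-selection step, the criteria can be computed from the log-likelihood values returned at convergence and the free-parameter counts given above. The AIC penalty η_Ψ (without the log(n)/2 factor) is the standard choice assumed here, and the function names are illustrative.

```python
import numpy as np

def n_free_parameters(K, p, q, model):
    """Number of free parameters eta_Psi, using the counts given in the text."""
    extra = {"NMoE": 3, "SNMoE": 4, "TMoE": 4, "STMoE": 5}[model]
    return K * (p + q + extra) - q - 1

def information_criteria(loglik, comp_loglik, eta, n):
    """AIC, BIC and ICL written as criteria to be maximized."""
    aic = loglik - eta
    bic = loglik - eta * np.log(n) / 2.0
    icl = comp_loglik - eta * np.log(n) / 2.0
    return aic, bic, icl
```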

6.6 Experiments

In [J-12][J-13][J-14] I evaluated the performance of the proposed EM algorithms for the three NNMoE models in terms of modeling, robustness to outliers and clustering, on both simulated and real data sets. The simulations showed that the three models provide estimates which converge to the true parameters. In addition, the estimated fitted mean curves, mixing proportions and obtained partitions are very close to their true counterparts in all the situations, including when the data are generated according to the model in question or according to the standard normal one. This supports the fact that the proposed algorithms perform well and that the proposed models are good generalizations of the normal mixture of experts (NMoE).

Robustness of the NNMoE I examined the robustness of the proposed models to outliers versus the standard NMoE. For that, I considered each of the four models for data generation and inference, where the data include, with probability c, a class of outliers, for c = 0%, 1%, 2%, 3%, 4%, 5%. I considered the same class of outliers as in Nguyen and McLachlan (2016), that is, the predictor x is generated uniformly over the interval (−1, 1) and the response y is set to the value −2; a sketch of this contamination scheme is given after this paragraph. As a criterion for evaluating the impact of the outliers on the quality of the results, I considered the MSE between the true regression mean function and the estimated one. When there are no outliers (c = 0%), the error of the TMoE is smaller than those of the other models in the four situations, that is, including the cases where the data are not generated according to it, which is somewhat surprising. This includes the case where the data are generated according to the NMoE model, for which the TMoE error is slightly smaller than that of the NMoE model. Then, it can be seen that, when there are outliers, the TMoE model outperforms the other models in almost all the situations, except the one in which the data are generated according to the STMoE model. When the data do not contain outliers and are generated from the STMoE, the STMoE indeed outperforms the NMoE and SNMoE models. When there are no outliers and the data are generated according to the TMoE or the STMoE, these two models may provide quasi-identical results. In the presence of outliers in data generated from the STMoE, the STMoE outperforms the NMoE and SNMoE models in all the situations, and outperforms the TMoE in the majority of situations, namely when the proportion of outliers exceeds 2%. Also, in all the situations with outliers, as expected, the TMoE and STMoE models always provide the best results. These two models are indeed much more robust to outliers than the normal and skew-normal ones, because their expert components follow a robust distribution, that is, the t distribution for the TMoE and the skew t distribution for the STMoE. The NMoE and SNMoE are sensitive to outliers. When there are outliers, the SNMoE behaves comparably to the NMoE, but the SNMoE remains better adapted to skewed data than the standard NMoE model. Moreover, as the number of outliers increases, the increase in the error of the NMoE and SNMoE models is more pronounced than that of the TMoE and STMoE models. The error for both the TMoE and STMoE may indeed slightly increase, remain stable or even decrease in some situations. This supports the expected robustness of the TMoE and STMoE models. Figure 6.1 shows an example of results obtained on a data set simulated according to the NMoE model and containing c = 5% of outliers. In this example, we clearly see that the NMoE is severely affected by the outliers and provides a rough fit, especially for the red component, whereas both the TMoE and the STMoE are clearly robust and provide a precise fit.
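A minimal sketch of the contamination scheme used in this experiment (uniform predictors over (−1, 1) and responses set to −2, each observation being replaced with probability c); the function name is illustrative.

```python
import numpy as np

def contaminate(x, y, c, seed=0):
    """Replace each observation, with probability c, by an outlier of the class
    used in the experiments: x drawn uniformly over (-1, 1) and y set to -2."""
    rng = np.random.default_rng(seed)
    x, y = x.copy(), y.copy()
    is_outlier = rng.random(len(y)) < c
    x[is_outlier] = rng.uniform(-1.0, 1.0, size=is_outlier.sum())
    y[is_outlier] = -2.0
    return x, y
```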

Application to two real-world data sets I considered an application to two real-world data sets: the tone perception data set and the temperature anomalies data set (see for example [J-12][J-13][J-14] for a more detailed description of the data).



Figure 6.1: Fitted MoE to a data set of n = 500 observations generated according to the NMoE model and including 5% of outliers (x; y = −2), with the NMoE fit (left), TMoE fit (middle) and STMoE fit (right).

Tone perception data set The first analyzed data set is the real tone perception data set1, which goes back to Cohen (1984). It was recently studied by Bai et al. (2012) and Song et al. (2014) by using robust regression mixture models based on, respectively, the t distribution and the Laplace distribution. To apply the proposed MoE models, we set the response y_i (i = 1, ..., 150) as the "stretch ratio" variable and the covariates x_i = r_i = (1, x_i)^T, where x_i is the "tuned" variable of the ith observation. For the original data I obtain a good fit with all the models; the NMoE and SNMoE fits are quasi-identical and differ very slightly from those of the TMoE and STMoE, which are very similar. The two regression lines may correspond to correct tuning and tuning to the first overtone, respectively, as analyzed in Bai et al. (2012). I also performed a model selection procedure on this data set to choose the best number of MoE components for a number of components between 1 and 5, using BIC, AIC and ICL. The NMoE model overestimates the number of components, and AIC performs poorly for all the models. BIC provides the correct number of components for the three proposed models. ICL also estimates the correct number of components for both the SNMoE and STMoE models, but hesitates between 2 (the correct number) and 3 components for the TMoE model. I also examined the sensitivity of the MoE models to outliers based on this real data set. For this, I adopted the same scenario used in Bai et al. (2012) and Song et al. (2014) (the last and most difficult scenario), by adding 10 identical pairs (0, 4) to the original data set as outliers in the y-direction, considered as high leverage outliers. In this situation, the normal and the skew-normal mixtures of experts provide almost identical fits and are sensitive to outliers. However, in both cases, compared to the normal regression mixture result in Bai et al. (2012), and the Laplace regression mixture and t regression mixture results in Song et al. (2014), the fitted NMoE and SNMoE models are less severely affected by the outliers. This may be attributed to the fact that the mixing proportions here depend on the predictors, which is not the case in these regression mixture models, namely those of Bai et al. (2012) and Song et al. (2014). However, as can be seen in Figure 6.2, the TMoE and the STMoE provide robust fits, which are quasi-identical to the fits obtained on the original data without outliers. Moreover, I notice that, as shown in Song et al. (2014), for this situation with outliers the t mixture of regressions fails: the fit is severely affected by the outliers. For the proposed TMoE and STMoE, in contrast, the ten high leverage outliers have no impact on the fitted experts.

Temperature anomalies data set This real-world data set2 relates to climate change analysis. The data consist of n = 135 yearly measurements of the global annual temperature anomalies (in degrees C) computed using data from land meteorological stations for the period 1882–2012. The response y_i (i = 1, ..., 135) is set as the temperature anomaly and the covariates x_i = r_i = (1, x_i)^T, where x_i is the year of the ith observation. These data have been analyzed earlier by Hansen et al. (1999, 2001) and recently by Nguyen and McLachlan (2016) by using the Laplace mixture of linear experts (LMoLE). The four models are successfully applied to the data set and provide very similar results, which are also similar to those found by Nguyen and McLachlan (2016).

1 Source: http://artax.karlin.mff.cuni.cz/r-help/library/fpc/html/tonedata.html
2 Source: Ruedy et al., http://cdiac.ornl.gov/ftp/trends/temp/hansen/gl_land.txt



Figure 6.2: Fitting MoLE to the tone data set with ten added outliers (0, 4). Left: NMoE fit, Middle: TMoE fit, Right: STMoE fit. The predictor x is the actual tone ratio and the response y is the perceived tone ratio.

Both the TMoE and STMoE fits provide estimated degrees of freedom greater than 17, so that the fitted t and skew t experts tend to approach normal ones. The regression coefficients are also similar to those found by Nguyen and McLachlan (2016) with the Laplace mixture of linear experts. I performed a model selection procedure on the temperature anomalies data set to choose the best number of MoE components from values between 1 and 5. Except for the result provided by AIC for the NMoE model, which overestimates the number of components, all the other results provide evidence for two components in the data.

6.7 Conclusion

In this chapter, I proposed new non-normal MoE models, which generalize the normal MoE. They are based on the skew-normal, t and skew t distributions and are, respectively, the SNMoE, TMoE and STMoE. The SNMoE is suggested for asymmetric data, the TMoE for data that may contain outliers and have heavy tails, and the STMoE for data that may simultaneously be asymmetric, heavy-tailed and noisy. I developed EM-type algorithms to infer each of the proposed models and described the use of the models in non-linear regression and prediction as well as in model-based clustering. The developed models were successfully applied to simulated and real data sets. The results obtained on simulated data confirm the good performance of the models in terms of density estimation, non-linear regression function approximation and clustering. In addition, the simulation results provide evidence of the robustness of the TMoE and STMoE models to outliers, compared to the normal alternatives. The proposed models were also successfully applied to two different real data sets, including a situation with outliers. Regarding model selection with information criteria, the results favor BIC and ICL over AIC, which may perform poorly on the analyzed data. The obtained results support the potential benefit of the proposed approaches for practical applications. In this chapter, I only considered the MoE in their standard (non-hierarchical) version. One interesting future direction is therefore to extend the proposed models to the hierarchical mixture of experts framework (Jordan and Jacobs, 1994). Furthermore, a natural extension of this work is to consider MoE for multiple regression on multivariate data rather than simple regression on univariate data.

Chapter 7

Conclusion and perspectives

7.1 Conclusion

The previous chapters presented my research during the last five years, as well as my ongoing research, on the problems of statistical learning of flexible models for complex data analysis. This involved research in statistics, in the related fields of classification, high-dimensional and functional data analysis, statistical signal processing, machine learning and pattern recognition, and in the field of statistical inference. The focus in the latter field has been on the methodology and applications of latent data models, particularly mixture models, and on maximum likelihood estimation via EM algorithms as well as maximum a posteriori estimation via Bayesian sampling, including in the Bayesian non-parametric paradigm. Particular attention was given to the statistical methodology and its computational aspects, which constitute a common theme of my research.

7.2 Perspectives

Each part of the manuscript ends with a discussion of the perspectives and extensions related to the work it describes. The perspectives I open here are therefore not meant (I hope) to restate those developments, but rather to propose directions that are less directly tied to my previous work. Beyond the perspectives already discussed in each chapter, I outline here some avenues I will pursue in the future; some have already effectively started, and others I intend to start in the near future.

7.2.1 Advanced mixtures for complex data (My ongoing CNRS research leave project)

My research on model-based cluster and discriminant analyses extends my interests to further investigating the subject, by including (Bayesian) model-based co-cluster and discriminant analyses as well as feature and model selection in unsupervised classification of high-dimensional data, including functional data. The developed BNP approach for parsimonious models might also be investigated for the recently developed parsimonious models based on a variance-correlation decomposition of the group covariance matrices (Biernacki and Lourme, 2014), which have more desirable properties compared to the flexible parsimonious GMMs based rather on an eigenvalue decomposition. These perspectives mainly relate to the CNRS research leave I was awarded this year, starting from the first of September, which I will spend in the Probability and Statistics team of the Paul Painlevé laboratory of mathematics (UMR CNRS 8524) in Lille, where I will work mainly with Pr. Christophe Biernacki. The description of the project is available at http://chamroukhi.univ-tln.fr/FChamroukhi-projet-delegation-CNRS.pdf. Possible applications relate namely to computational biology (gene expression data), which also involves feature selection and classification challenges, as well as text classification in the discrete case, for example.


7.2.2 LEarning from biG cOmplex FunctIonal daTa - LegoFit (2015 - an ANR proposal)

This perspective is actually ongoing and consists in the ANR (French research agency) proposal LEarning from biG cOmplex FunctIonal daTa - "LegoFit" that I initiated and submitted this year; I am the Principal Investigator of the project. LegoFit is an academic research project that aims at developing models and algorithms for transforming big data into knowledge. The considered data are massive functional data with complex hidden structure. The key tenet of the proposal is to develop an original probabilistic methodology that links statistical learning and functional data analysis (FDA) at large scale; the proposal is thus focused on the field of FDA. Two pilot applications are considered. The first one is in collaboration with the industrial leader AIRBUS and concerns large-scale time series data of aircraft condition monitoring. The second one concerns massive transportation data derived from vehicle sharing systems (bikes and cars) and is in collaboration with IFSTTAR, the national leader institute on transportation research. The overall scientific objective of this project is therefore to significantly improve automatic analysis and decision making from massive data available as (discretized) values of smooth functions, and to apply the developed methods in the framework of the two pilot applications. This mainly involves tackling the problems of functional regression, classification and clustering. In particular, we will focus on the unsupervised context, in which some information/data may be missing or hidden and therefore data completeness is of great interest. A special focus will be given to non-linear dynamical functions that may be subject to multiple changes in regime. We propose to address these issues from a probabilistic perspective through specific latent data models with a sound statistical framework. In some circumstances, a priori knowledge about the data, including its structure, has to be taken into account, as it is likely to improve the models' accuracy. In LegoFit this will be formulated within a Bayesian learning framework. A particular focus will be given to non-parametric Bayesian approaches for functional data, to provide more flexible and generic probabilistic models adapted to big data. The scalability of the developed algorithms will also be central to the project. The main scientific originality of our proposal therefore covers the development of new statistical learning and data analysis techniques for big data with complex functional structure, and scaling them up. The main questions involved in the proposal form a scientific breakthrough for the partners of the project. We have structured our project around the following tasks for the analysis of big functional data with complex hidden structure:
• Task 1: Model-based clustering and discrimination
• Task 2: Model-based co-clustering
• Task 3: Model-based (co-)clustering under topographic considerations
• Task 4: Bayesian non-parametric clustering
• Task 5: Computational scalability of the proposed algorithms
• Task 6: Two pilot applications for the validation of the proposed methods: AIRBUS data and IFSTTAR data
The consortium is composed of four research units, LSIS (project coordinator, PI Faicel Chamroukhi), LIPN, IFSTTAR-GRETTIA and LIPADE, together with AIRBUS Research and Technology.
It gathers experts in the area of statistical learning, data analysis, computer science and signal and image processing, one industrial leader in aircraft condition monitoring applications, and the national leader institute on transportation research.

7.2.3 Non-normal mixture modeling

The framework of non-normal mixture modeling has been receiving increasing attention in recent years; in particular, the skew-normal and skew t mixture models are emerging as promising extensions of the traditional normal and t mixture models (Lin et al., 2007b,a; Lin, 2010; Pyne et al., 2009; Lee and McLachlan, 2013b,a, 2014, 2015). This is what led me to develop mixtures of experts with more flexible parametric expert components that can better accommodate data exhibiting non-normal features, including asymmetry, heavy tails and the presence of outliers. In the future, I intend to pursue this direction by further investigating the non-normal mixture modeling framework for potentially functional data and in a more flexible hierarchical setting, such as hierarchical mixtures of experts, as well as possibly within a BNP framework to provide a non-parametric approach for model inference and selection.

100 7.2 Perspectives

7.2.4 Feature selection in model-based clustering

Until now, in model-based clustering I have mainly focused on classifying individuals. To deal with high-dimensional problems, I either tackled the problem with parsimonious models, through the re-parametrization of the cluster-specific covariance matrices, or directly performed the classification in the input data space via functional data models. There is another, relatively recent, framework for dealing with the unsupervised classification of high-dimensional data in model-based clustering, namely feature selection in clustering, which in general consists in classifying individuals while identifying and keeping only the variables that best describe them. In recent years this problem has attracted considerable interest in the community (Law et al., 2004; Raftery and Dean, 2006; Zhou et al., 2009; Maugis et al., 2009b,a; Witten and Tibshirani, 2010; Celeux et al., 2011). I intend to investigate it, as it opens interesting challenges both in modeling and in the computational aspects related to inference.

7.2.5 Bayesian latent variable models for sparse representations

My research in Bayesian inference for unsupervised data classification, with the focus on dealing with high-dimensional data, extended my interests to Bayesian inference for unsupervised data representation, particularly Bayesian learning of sparse representations. Some work concerning this perspective is already ongoing and relates to a Master internship position I proposed a few months ago, which has not yet been filled. The problem of finding sparse representations of a "signal", given a dictionary of possibly overcomplete basis vectors, is an important task in several scientific domains, including signal processing and computer vision, and for many application areas such as signal and image compression/restoration, object recognition, etc. (Olshausen and Field, 1996, 1997, 2004; Dobigeon and Tourneret, 2010). Several methods have been proposed for sparse coding, for example the ones based on l1-norm regularized regression known as the LASSO (Tibshirani, 1996), also often referred to as Basis Pursuit (BP) (Chen et al., 1999b), FOCUS (Gorodnitsky and Bhaskar, 1997), as well as explicitly formulated Bayesian methods for finding sparse representations (Wipf, 2006), namely l1-norm Bayesian sparse representations (Lin, 2008) and (Dobigeon and Tourneret, 2010) where, roughly, the sparse codes are modeled as Bernoulli-Gaussian processes, or related Bayesian pursuit algorithms (Herzet and Drémeau, 2014; Drémeau and Herzet, 2011). The Bayesian inference framework for finding sparse representations offers a principled and general framework for sparse coding since, in many cases, the cost functions of deterministic sparse coding approaches are particular cases of maximum a posteriori (MAP) criteria of corresponding Bayesian models. Bayesian algorithms for sparse coding therefore allow prior knowledge to be taken into account explicitly, and in a principled way, through a formulated probabilistic model that encourages sparsity, and they namely include latent data models (e.g. see Wipf (2006)). The statistical inference for the resulting Bayesian regularized models is tackled by using dedicated statistical inference tools, that is, MCMC as for example in Dobigeon and Tourneret (2010), as well as, from a more frequentist-like point of view, Variational Bayes EM as in Drémeau and Herzet (2011). One of the fast and efficient approaches developed for sparse representations is the Predictive Sparse Decomposition (PSD) (Kavukcuoglu et al., 2008; Kavukcuoglu, 2011), which jointly learns a dictionary and approximates the sparse representations by a predictive function (rather than computing exact sparse representations). A first avenue towards efficient Bayesian sparse representations might be to formulate the PSD in a Bayesian framework, which leads to a Bayesian Predictive Sparse Decomposition (BPSD). Then, the BPSD might be reformulated as a latent variable model by integrating a mixture prior over the codes in order to control the sparsity in a probabilistic way. This may lead to a MAP criterion which may be solved by an EM-type algorithm. I would also be interested in applying these models to image and/or sound representation for recognition.

7.2.6 Unsupervised learning of feature hierarchies: Deep learning

My research in the field of classification of multivariate data is on deriving flexible models that provide accurate partitions of unlabeled data or make predictions for future data based on labeled data. In both cases, the input data are assumed to be (hopefully) well-constructed features, so that the main focus is on the classification. However, focusing on representations that best describe the data is without doubt beneficial to (easily achieving) the classification task. Good representations indeed eliminate irrelevant variabilities of the input data, while preserving the information that is useful for the final task (e.g. recognition or prediction). This generates challenges regarding learning representations, and one

family of methods is, for example, the one stated before, that is, sparse coding techniques, whose purpose is to produce sparse representations in an unsupervised way. One recent, successful and very popular technique is the one based on hierarchical neural networks, known as deep learning, which aims to produce deep feature hierarchies as proposed by Hinton and Salakhutdinov (2006); Hinton et al. (2006); Bengio and LeCun (2007); Ranzato et al. (2008); Bengio (2009). Deep architectures for producing high-level representations consist, roughly, in stacking unsupervised neural network modules on top of each other, so that the input of each module is the output of the one at the level just below. This gives them the ability to capture high-level dependencies in the data, and the learned representations provide very accurate classification results when using standard classification techniques such as a support vector classifier or even a logistic regression. However, one main question, at least from a probabilistic point of view, is why they work. Deep networks have been developed as biologically plausible models for approximating how humans recognize objects, and hence already have at least biological foundations to explain how they work. Very recently, Patel et al. (2015) introduced a probabilistic theory of deep learning that seems to answer the question from a probabilistic point of view. The power of such models for representations, with both biological and probabilistic foundations, greatly motivates me to investigate deep learning in the future. I namely want to develop a platform, KOGITATE - KnOwledGe, learnIng and arTificiAl InTElligence (http://cogiter.univ-tln.fr/), dedicated to learning latent data models and deep feature hierarchies for artificial intelligence problems.

Chapter 8

Personal bibliography

My citations can be found in my Google Scholar profile: https://scholar.google.com/citations?user=AOxrOs0AAAAJ&hl=fr

8.1 Monograph and editorials

[B-1] F. Chamroukhi. Probabilistic Learning From Longitudinal Data: Background, Novel theoretical mod- els, Classifiers and Algorithms. Lap Lambert Academic Publishing, 2011a. ISBN 978-3844311372

[B-2] H. Glotin, F. Chamroukhi, and T. Maillot. Representations and Decisions in Cognitive Vision. Proceedings of the International Summer School ERMITES 2012, 2012. ISBN 979-10-90821-00-2. pdf

[B-3] F. Chamroukhi and H. Glotin. Unsupervised learning from big bioacoustic data (uLearnBio). Pro- ceedings of the uLearnBio workshop of the International Conference on Machine Learning (ICML), 2014. ISBN 979-10-90821-06-4. home page

8.2 Journal papers

8.2.1 Publications (9)

[J-1] F. Chamroukhi, A. Sam´e, G. Govaert, and P. Aknin. Time series modeling by a regression approach based on a latent process. Neural Networks, 22(5-6):593–602, 2009d. URL http://chamroukhi.univ-tln.fr/papers/Chamroukhi_Neural_Networks_2009.pdf

[J-2] F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. A hidden process regression model for func- tional data description. Application to curve discrimination. Neurocomputing, 73(7-9):1210–1221, 2010. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_neucomp_2010.pdf

[J-3] F. Chamroukhi, A. Sam´e, G. Govaert, and P. Aknin. Mod`ele`aprocessus latent et algorithme EM pour la r´egressionnon lin´eaire. Revue des Nouvelles Technologies de l’Information (RNTI), S1: 15–32, Jan 2011c. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_same_govaert_ aknin_rnti.pdf

[J-4] A. Sam´e,F. Chamroukhi, G´erardGovaert, and P. Aknin. Model-based clustering and segmentation of time series with changes in regime. Advances in Data Analysis and Classification, 5:301–321, 2011. URL http://chamroukhi.univ-tln.fr/papers/adac-2011.pdf

[J-5] F. Chamroukhi, H. Glotin, and A. Sam´e. Model-based functional mixture discriminant analysis with hidden process regression for curve classification. Neurocomputing, 112:153–163, 2013a. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_et_al_neucomp2013a.pdf


[J-6] F. Chamroukhi, D. Trabelsi, S. Mohammed, L. Oukhellou, and Y. Amirat. Joint segmentation of multivariate time series with hidden process regression for human activity recognition. Neurocomput- ing, 120:633–644, November 2013b. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_ et_al_neucomp2013b.pdf

[J-7] D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. An unsupervised ap- proach for automatic activity recognition based on Hidden Markov Model Regression. IEEE Trans- actions on Automation Science and Engineering, 3(10):829–335, 2013. URL http://arxiv.org/ pdf/1312.6965.pdf

[J-8] F. Chamroukhi. Unsupervised learning of regression mixture models with unknown number of com- ponents. Journal of Statistical Computation and Simulation, 2015e. doi: 10.1080/00949655.2015. 1109096. URL http://chamroukhi.univ-tln.fr/papers/Chamroukhi-JSCS-2015.pdf. Pub- lished online: 05 Nov 2015

[J-9] F. Chamroukhi. Piecewise regression mixture for simultaneous curve clustering and optimal segmen- tation. Journal of Classification - Springer, 2015d. URL http://arxiv.org/pdf/1312.6974v2. pdf. Accepted

8.2.2 Submitted papers (6)

[J-10] F. Chamroukhi, M. Bartcus, and H. Glotin. Dirichlet Process Parsimonious Gaussian Mixture for clustering. arXiv:1501.03347, 2015. URL http://arxiv.org/pdf/1501.03347.pdf. Submitted

[J-11] F. Chamroukhi. Bayesian mixtures of spatial spline regressions. arXiv:1508.00635, 2015a. URL http://arxiv.org/pdf/1508.00635.pdf

[J-12] F. Chamroukhi. Non-Normal Mixtures of Experts. arXiv:1506.06707, 2015c. URL http://arxiv.org/pdf/1506.06707.pdf. 61 pages

[J-13] F. Chamroukhi. Robust mixture of experts modeling using the t-distribution. 2015g. URL http://chamroukhi.univ-tln.fr/papers/TMoE.pdf. Submitted

[J-14] F. Chamroukhi. Robust mixture of experts modeling using the skew-t distribution. 2015f. URL http://chamroukhi.univ-tln.fr/papers/STMoE.pdf. submitted

[J-15] F. Attal, M. Dedabrishvili, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. Physical human activity recognition using wearable sensors. Sensors, 2015. URL http://chamroukhi. univ-tln.fr/papers/Sensors-2015.pdf. submitted

8.2.3 Papers in preparation (2)

[J-16] F. Chamroukhi. Mixture of hidden Markov model regressions for functional data clustering and segmentation. Neural Networks, 2015b. In preparation

[J-17] F. Chamroukhi et al. Bayesian non-parametric models for unsupervised decomposition of whale songs. Journal of Acoustical Society of America, 2015. In preparation

8.3 International conference papers

[C-1] M. Bartcus, F. Chamroukhi, and H. Glotin. Hierarchical Dirichlet Process Hidden Markov Model for Unsupervised Bioacoustic Analysis. In The International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, July 2015

[C-2] F. Chamroukhi, M. Bartcus, and H. Glotin. Bayesian non-parametric parsimonious Gaussian mix- ture for clustering. In Proceedings of 22nd International Conference on Pattern Recognition (ICPR), Stockholm, August 2014a


[C-3] M. Bartcus and F. Chamroukhi. Hierarchical Dirichlet Process Hidden Markov Model for unsuper- vised learning from bioacoustic data. In Proceedings of the uLearnBio workshop of the International Conference on Machine Learning (ICML), Beijing, June 2014 [C-4] F. Chamroukhi, Marius Bartcus, and Herve Glotin. Bayesian non-parametric parsimonious clus- tering. In Proceedings of 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges, Belgium, April 2014b [C-5] F. Chamroukhi. Robust EM algorithm for model-based curve clustering. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 1–8, Dallas, Texas, August 2013 [C-6] Marius Bartcus, Faicel Chamroukhi, and Herv´eGlotin. Unsupervised whale song decomposition with Bayesian non-parametric Gaussian mixture. In Proceedings of the NIPS4B workshop, Neural Information Processing Systems (NIPS), pages 205–211, Nevada, USA, 2013 [C-7] F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. Classification automatique de donn´eestem- porelles en classes ordonn´ees. In Actes des 44 `emeJourn´eesde Statistique, Bruxelles, Belgique, Mai 2012c [C-8] F. Chamroukhi and H. Glotin. Mixture model-based functional discriminant analysis for curve classification. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 1–8, Brisbane, Australia, June 2012a [C-9] F. Chamroukhi, H. Glotin, and C. Rabouy. Functional Mixture Discriminant Analysis with hidden process regression for curve classification. In Proceedings of XXth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pages 281–286, Bruges, Belgium, April 2012a [C-10] D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. Supervised and un- supervised classification approaches for human activity recognition using body-mounted sensors. In Proceedings of the XXth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pages 417–422, Bruges, Belgium, April 2012 [C-11] F. Chamroukhi, A. Sam´e,P. Aknin, and G. Govaert. Model-based clustering with Hidden Markov Model regression for time series with regime changes. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 2814–2821, Jul-Aug 2011a [C-12] D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. Activity Recognition Using Hidden Markov Models. In Proceedings of the IEEE/RSJ International Conference on In- telligent Robots and Systems, september 2011 [C-13] F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. A dynamic probabilistic modeling of railway switches operating states. In Proccedings of the 9th World Congress on Railway Research (WCRR), Lille, May 2011b [C-14] F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. A regression model with a hidden logistic pro- cess for feature extraction from time series. In International Joint Conference on Neural Networks (IJCNN), pages 489–496, Atlanta, GA, June 2009c [C-15] R. Onanena, F. Chamroukhi, L. Oukhellou, D. Candusso, P. Aknin, and D. Hissel. Supervised learning of a regression model based on latent process. Application to the estimation of fuel cell lifetime. In Proceeding of the Eighth IEEE International Conference on Machine Learning and Applications (IEEE ICMLA), pages 632–637, Miami, Flordia, USA, 2009. IEEE Computer Society [C-16] F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. 
A regression model with a hidden logistic pro- cess for signal parameterization. Proceedings of XVIIth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pages 503–508, 2009b [C-17] F. Chamroukhi, A. Sam´e,and P. Aknin. A probabilistic approach for the classification of rail- way switch operating states. In The VIth International Conference on Condition Monitoring and Machinery Failure Prevention Technologies, Dublin, UK, 2009a. IEEE


[C-18] F. Chamroukhi, A. Sam´e,and P. Aknin. Switch mechanism diagnosis using a pattern recogni- tion approach. In The 4th IET International Conference on Railway Condition Monitoring RCM (IEEE), pages 1–4, Derby, UK, June 2008b. IEEE

8.4 Invited talks in international conferences

[C-1] F. Chamroukhi. Robust non-normal mixtures of experts. ERCIM 2015 : The 8th International Con- ference of the European Research Consortium for Informatics and Mathematics on Computational and Methodological Statistics, December 2015h. London, UK

[C-2] F. Chamroukhi. Model-based cluster and discriminant analysis for functional data. ERCIM 2014 : The 7th International Conference of the European Research Consortium for Informatics and Mathematics on Computational and Methodological Statistics, December 2014a. Pisa, Italy

[C-3] F. Chamroukhi. Mixture models for cluster analysis: from model-based inference to Bayesian non- parametrics. uLearnBio workshop of the International Conference on Machine Learning (ICML), June 2014b

[C-4] F. Chamroukhi. Learning probabilistic latent process models from temporal data. VIIth Inter- national Summer School ERMITES 2012 on Representations and Decisions in Cognitive Vision, september 2012d

8.5 Francophone conferences

[C-1] F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. Classification automatique de donn´eestem- porelles en classes ordonn´ees. In Actes des 44 `emeJourn´eesde Statistique, Bruxelles, Belgique, Mai 2012c

[C-2] F. Chamroukhi, A. Sam´e,and P. Aknin. R´egression`avariable latente pour la mod´elisationde sig- naux de manœuvres d’aiguillage. In In J. Marais and editors M. Berbineau, editors, Actes INRETS : Communiquer, naviguer, surveiller. Innovations pour des transports plus sˆurs,plus efficaces et plus attractifs, volume 112, pages 57–64, 2008a

[C-3] A. Sam´e,F. Chamroukhi, and P. Aknin. D´etections´equentielle de d´efautssur des signaux de manœuvres d’aiguillage. In Workshop Surveillance, Sˆuret´eet S´ecurit´edes Grands Syst`emes3SGS’08, Troyes, France, June 2008

[C-4] A. Sam´e,F. Chamroukhi, and G. Govaert. Algorithme EM et mod`ele`aprocessus latent pour la r´egressionnon lin´eaire.In Actes des 41`emesJourn´eesde Statistique de la SFDS, Bordeaux, 2009

[C-5] M. Bartcus, F. Chamroukhi, and H. Glotin. Clustering Bay´esienParcimonieux Non-Param´etrique. In Extraction et Gestion des Connaissances (EGC), Atelier CluCo : Clustering et Co-clustering, pages 3–13, Rennes, France, Jan 2014

8.6 Theses

[Th-1] F. Chamroukhi. Hidden process regression for curve modeling, classification and tracking. Ph.D. thesis, Universit´ede Technologie de Compi`egne,Compi`egne,France, 2010a

[Th-2] F. Chamroukhi. Reconnaissance de formes pour le diagnostic et le suivi de point de fonctionnement. Master of Engineering thesis, Paris 6 University, Paris, France, 2007

8.7 Award and distinctions

[A-1] F. Chamroukhi. Titulaire de la PEDR Prime de Recherche et d’Encadrement Doctoral (ex. PES Prime d’Excellence Scientifique) 2014-2017


[A-2] F. Chamroukhi. Mod`eleprobabiliste `aprocessus latent pour la description, la segmentation et la classification de signaux unidimensionnels. Best Poster Award of the Winter School “Statistical Learning and Data Mining”, Feb 2010b

8.8 Invited and contributed seminars

[S-1] F. Chamroukhi. Statistical learning of latent data models for complex data analysis. S´eminairedu Laboratoire de Math´ematiquesPaul Painlev´e,UMR CNRS 8524, Universit´eLille 1, 04 Nov 2015i [S-2] F. Chamroukhi. Apprentissage de mod`elesprobabilistes `aprocessus latent `apartir de donn´ees temporelles. Seminaire GFD-LIPADE Universit´eParis 5, july 2012e [S-3] F. Chamroukhi. Apprentissage de mod`elesg´en´eratifs`apartir de donn´eestemporelles. S´eminaire du Laboratoire LSIS, Jan 2012a. Toulon [S-4] F. Chamroukhi. Apprentissage de mod`elesg´en´eratifs`apartir de donn´eestemporelles. S´eminaire du Laboratoire LSIS, Feb 2012c. Toulon [S-5] F. Chamroukhi. Apprentissage de mod`elesprobabilistes g´en´eratifs: Approches Bay´esiennes Parci- monieuses pour des donn´eesfonctionnelles et des donn´eesmultidimensionnelles. S´eminairedu Lab- oratoire LSIS, Jan 2012b. Toulon [S-6] F. Chamroukhi. Diagnostic par analyse de donn´eeslongitudinales. TISIC, S´eminairen 13, “Analyse de Donn´eesLongitudinales”, March 2011b [S-7] F. Chamroukhi. Mod`eleprobabiliste `abase de processus latent pour la description et la classification de signaux. Application au diagnostic d’un syst`emeferroviaire. The Computer Science lab of Paris 13 University (LIPN), April 2010c [S-8] F. Chamroukhi. Mod´elisationprobabiliste `abase de processus latent de signaux monodimensionnels. Utilisation dans le diagnostic d’un composant de l’infrastructure ferroviaire: l’aiguillage. Business Intelligence lab, T´el´ecom-ParisTech, October 2009a

[S-9] F. Chamroukhi. Diagnostic par suivi de point de fonctionnement. Journ´eedes doctorants Heudiasyc, July 2009b. Compi`egne [S-10] F. Chamroukhi. Mod`ele`aprocessus latent pour la param´etrisationet l’apprentissage de signaux ´evolutifs. Application au diagnostic d’aiguillages. S´eminaireDIAG, Inrets-LTN, April 2009c. Marne- La-Vall´ee [S-11] F. Chamroukhi. Diagnostic par suivi de point de fonctionnement. Journ´eedes doctorants Heudiasyc, July 2008a. Compi`egne [S-12] F. Chamroukhi. R´egression`avariable latente pour la mod´elisation des signaux de manœvre d’aiguillages. Journ´eedes doctorants SPI, INRETS, July 2008b. Lille


Bibliography

C. Abraham, P. A. Cornillon, E. Matzner-Lober, and N. Molinari. Unsupervised Curve Clustering using B-Splines. Scandinavian Journal of Statistics, 30(3):581–595, 2003. H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716–723, 1974. D. J. Aldous. Exchangeability and Related Topics. In Ecole´ d’Et´eSt´ Flour 1983, pages 1–198. Springer- Verlag, 1985. Lecture Notes in Math. 1117. J. Alon, S. Sclaroff, G. Kollios, and V. Pavlovic. Discovering Clusters in Motion Time-Series Data. In Proceedings of the 2003 IEEE computer society conference on Computer vision and pattern recognition (CVPR), pages 375–381, Los Alamitos, CA, USA, 2003. K. Altun, B. Barshan, and O. Tuncel. Comparative study on classifying human activities with miniature inertial and magnetic sensors. Pattern Recognition, 43:3605–3620, October 2010. A. Antoniadis, J. Berruyer, and R. Carmona. Rgression non linaire et applications. Economica, 1992. A. Antoniadis, X. Brossat, J. Cugliari, and J.-M. Poggi. Functional Clustering using Wavelets. Interna- tional Journal of Wavelets, Multiresolution and Information Processing, 11(1), 2013. Charles E. Antoniak. Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems. The Annals of Statistics, 2(6):1152–1174, 1974. F. Attal, M. Dedabrishvili, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. Physical human activity recognition using wearable sensors. Sensors, 2015. URL http://chamroukhi.univ-tln.fr/ papers/Sensors-2015.pdf. submitted. A. Azzalini. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, pages 171–178, 1985. A. Azzalini. Further results on a class of distributions which includes the normal ones. Scandinavian Journal of Statistics, pages 199–208, 1986. A. Azzalini and A. Capitanio. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution. Journal of the Royal Statistical Society, Series B, 65:367–389, 2003. Xiuqin Bai, Weixin Yao, and John E. Boyer. Robust fitting of mixture regression models. Computational Statistics & Data Analysis, 56(7):2347 – 2359, 2012. Jeffrey D. Banfield and Adrian E. Raftery. Model-Based Gaussian and Non-Gaussian Clustering. Bio- metrics, 49(3):803–821, 1993. M. Bartcus. Bayesian non-parametric parsimonious mixtures for model-based clustering. Ph.D. thesis, Universit´ede Toulon, Laboratoire des Sciences de l’Information et des Syst`emes(LSIS), October 2015. M. Bartcus and F. Chamroukhi. Hierarchical Dirichlet Process Hidden Markov Model for unsupervised learning from bioacoustic data. In Proceedings of the uLearnBio workshop of the International Confer- ence on Machine Learning (ICML), Beijing, June 2014. M. Bartcus, F. Chamroukhi, and H. Glotin. Clustering Bay´esienParcimonieux Non-Param´etrique. In Extraction et Gestion des Connaissances (EGC), Atelier CluCo : Clustering et Co-clustering, pages 3–13, Rennes, France, Jan 2014. M. Bartcus, F. Chamroukhi, and H. Glotin. Hierarchical Dirichlet Process Hidden Markov Model for Un- supervised Bioacoustic Analysis. In The International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, July 2015. Marius Bartcus, Faicel Chamroukhi, and Herv´eGlotin. Unsupervised whale song decomposition with Bayesian non-parametric Gaussian mixture. In Proceedings of the NIPS4B workshop, Neural Informa- tion Processing Systems (NIPS), pages 205–211, Nevada, USA, 2013. M. Basseville and I. V. Nikiforov. 
Detection of Abrupt Changes: Theory and Application. Englewood Cliffs, NJ: Prentice-Hall, 1993.


S. Basu and S. Chib. Marginal Likelihood and Bayes Factors for Dirichlet Process Mixture Models. Journal of the American Statistical Association, 98:224–235, 2003. L.E. Baum, T. Petrie, G. Soules, and N. Weiss. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Annals of Mathematical Statistics, 41:164–171, 1970. R. Bellman. On the approximation of curves by line segments using dynamic programming. Communi- cations of the Association for Computing Machinery (CACM), 4(6):284, 1961. Yoshua Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1): 1–127, 2009. doi: 10.1561/2200000006. Also published as a book. Now Publishers, 2009. Yoshua Bengio and Yann LeCun. Scaling Learning Algorithms towards AI. In L´eonBottou, Olivier Chapelle, D. DeCoste, and J. Weston, editors, Large Scale Kernel Machines. MIT Press, 2007. H. Bensmail and Jacqueline J. Meulman. Model-based Clustering with Noise: Bayesian Inference and Estimation. Journal of Classification, 20(1):049–076, 2003. H. Bensmail, G. Celeux, A. E. Raftery, and C. P. Robert. Inference in model-based cluster analysis. Statistics and Computing, 7(1):1–10, 1997. C. Biernacki, G. Celeux, and G Govaert. Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(7):719–725, 2000. C. Biernacki, G. Celeux, and G. Govaert. Choosing Starting Values for the EM Algorithm for Getting the Highest Likelihood in Multivariate Gaussian Mixture Models. Computational Statistics and Data Analysis, 41:561–575, 2003. Christophe Biernacki and Alexandre Lourme. Stable and visualizable Gaussian parsimonious clustering models. Statistics and Computing, 24(6):953–969, 2014. C. Bishop and M. Svens´en. Bayesian hierarchical mixtures of experts. In In Uncertainty in Artificial Intelligence, 2003. D. Blackwell and J. MacQueen. Ferguson Distributions Via Polya Urn Schemes. The Annals of Statistics, 1:353–355, 1973. David M. Blei and Michael I. Jordan. Variational Inference for Dirichlet Process Mixtures. Bayesian Analysis, 1(1):121–144, 2006. Charles Bouveyron and Julien Jacques. Model-based clustering of time series in group-specific functional subspaces. Adv. Data Analysis and Classification, 5(4):281–300, 2011. V. L. Brailovsky and Y. Kempner. Application of piecewise regression to detecting internal structure of signal. Pattern recognition, 25(11):1361–1370, 1992. L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification And Regression Trees. Wadsworth, New York, 1984. Richard P. Brent. Algorithms for minimization without derivatives. Prentice-Hall series in automatic computation. Englewood Cliffs, N.J. Prentice-Hall, 1973. ISBN 0-13-022335-2. Bradley P. Carlin and Siddhartha Chib. Bayesian Model Choice via Markov Chain Monte Carlo Methods. Journal of the Royal Statistical Society. Series B, 57(3):473–484, 1995. G. Celeux. Bayesian inference for mixture: the label switching problem. Technical report, INRIA Rhone- Alpes, 1999. G. Celeux and J. Diebolt. The SEM algorithm a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem. Computational Statistics Quarterly, 2(1):73–82, 1985. G. Celeux and G. Govaert. A classification EM algorithm for clustering and two stochastic versions. Computational Statistics and Data Analysis, 14:315–332, 1992. G. Celeux and G. Govaert. Gaussian Parsimonious Clustering Models. Pattern Recognition, 28(5):781– 793, 1995. 
G. Celeux, M. Hurn, and C. P. Robert. Computational and Inferential Difficulties with Mixture Posterior Distributions. Journal of the American Statistical Association, 95(451):957–970, 2000. G. Celeux, O. Martin, and C. Lavergne. Mixture of linear mixed models for clustering gene expression profiles from repeated microarray experiments. Statistical Modelling, 5:1–25, 2005. Gilles Celeux, Marie-Laure Martin-Magniette, Cathy Maugis, and Adrian E. Raftery. Letter to the editor: ”A framework for feature selection in clustering”. Journal of the American Statistical Association, 106: 383, 2011. doi: 10.1198/jasa.2011.tm10681. F. Chamroukhi. Titulaire de la PEDR Prime de Recherche et d’Encadrement Doctoral (ex. PES Prime d’Excellence Scientifique) 2014-2017.
F. Chamroukhi. Reconnaissance de formes pour le diagnostic et le suivi de point de fonctionnement. Master of engineering thesis, Paris 6 University, Paris, France, 2007.
F. Chamroukhi. Diagnostic par suivi de point de fonctionnement. Journée des doctorants Heudiasyc, July 2008a. Compiègne.
F. Chamroukhi. Régression à variable latente pour la modélisation des signaux de manœuvre d'aiguillages. Journée des doctorants SPI, INRETS, July 2008b. Lille.
F. Chamroukhi. Modélisation probabiliste à base de processus latent de signaux monodimensionnels. Utilisation dans le diagnostic d'un composant de l'infrastructure ferroviaire: l'aiguillage. Business Intelligence lab, Télécom-ParisTech, October 2009a.
F. Chamroukhi. Diagnostic par suivi de point de fonctionnement. Journée des doctorants Heudiasyc, July 2009b. Compiègne.
F. Chamroukhi. Modèle à processus latent pour la paramétrisation et l'apprentissage de signaux évolutifs. Application au diagnostic d'aiguillages. Séminaire DIAG, Inrets-LTN, April 2009c. Marne-La-Vallée.
F. Chamroukhi. Hidden process regression for curve modeling, classification and tracking. Ph.D. thesis, Université de Technologie de Compiègne, Compiègne, France, 2010a.
F. Chamroukhi. Modèle probabiliste à processus latent pour la description, la segmentation et la classification de signaux unidimensionnels. Best Poster Award of the Winter School "Statistical Learning and Data Mining", Feb 2010b.
F. Chamroukhi. Modèle probabiliste à base de processus latent pour la description et la classification de signaux. Application au diagnostic d'un système ferroviaire. The Computer Science lab of Paris 13 University (LIPN), April 2010c.
F. Chamroukhi. Probabilistic Learning From Longitudinal Data: Background, Novel theoretical models, Classifiers and Algorithms. Lap Lambert Academic Publishing, 2011a. ISBN 978-3844311372.
F. Chamroukhi. Diagnostic par analyse de données longitudinales. TISIC, Séminaire n. 13, "Analyse de Données Longitudinales", March 2011b.
F. Chamroukhi. Apprentissage de modèles génératifs à partir de données temporelles. Séminaire du Laboratoire LSIS, Jan 2012a. Toulon.
F. Chamroukhi. Apprentissage de modèles probabilistes génératifs: Approches Bayésiennes Parcimonieuses pour des données fonctionnelles et des données multidimensionnelles. Séminaire du Laboratoire LSIS, Jan 2012b. Toulon.
F. Chamroukhi. Apprentissage de modèles génératifs à partir de données temporelles. Séminaire du Laboratoire LSIS, Feb 2012c. Toulon.
F. Chamroukhi. Learning probabilistic latent process models from temporal data. VIIth International Summer School ERMITES 2012 on Representations and Decisions in Cognitive Vision, September 2012d.
F. Chamroukhi. Apprentissage de modèles probabilistes à processus latent à partir de données temporelles. Séminaire GFD-LIPADE, Université Paris 5, July 2012e.
F. Chamroukhi. Robust EM algorithm for model-based curve clustering. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 1–8, Dallas, Texas, August 2013.
F. Chamroukhi. Model-based cluster and discriminant analysis for functional data. ERCIM 2014: The 7th International Conference of the European Research Consortium for Informatics and Mathematics on Computational and Methodological Statistics, December 2014a. Pisa, Italy.
F. Chamroukhi. Mixture models for cluster analysis: from model-based inference to Bayesian non-parametrics. uLearnBio workshop of the International Conference on Machine Learning (ICML), June 2014b.
F. Chamroukhi. Bayesian mixtures of spatial spline regressions. arXiv:1508.00635, 2015a. URL http://arxiv.org/pdf/1508.00635.pdf.
F. Chamroukhi. Mixture of hidden Markov model regressions for functional data clustering and segmentation. Neural Networks, 2015b. In preparation.
F. Chamroukhi. Non-Normal Mixtures of Experts. arXiv:1506.06707, 2015c. URL http://arxiv.org/pdf/1506.06707.pdf. 61 pages.
F. Chamroukhi. Piecewise regression mixture for simultaneous curve clustering and optimal segmentation. Journal of Classification - Springer, 2015d. URL http://arxiv.org/pdf/1312.6974v2.pdf. Accepted.
F. Chamroukhi. Unsupervised learning of regression mixture models with unknown number of components. Journal of Statistical Computation and Simulation, 2015e. doi: 10.1080/00949655.2015.1109096.
URL http://chamroukhi.univ-tln.fr/papers/Chamroukhi-JSCS-2015.pdf. Published online: 05 Nov 2015. F. Chamroukhi. Robust mixture of experts modeling using the skew-t distribution. 2015f. URL http: //chamroukhi.univ-tln.fr/papers/STMoE.pdf. submitted. F. Chamroukhi. Robust mixture of experts modeling using the t-distribution. 2015g. URL http: //chamroukhi.univ-tln.fr/papers/TMoE.pdf. submitted. F. Chamroukhi. Robust non-normal mixtures of experts. ERCIM 2015 : The 8th International Confer- ence of the European Research Consortium for Informatics and Mathematics on Computational and Methodological Statistics, December 2015h. London, UK. F. Chamroukhi. Statistical learning of latent data models for complex data analysis. S´eminairedu Laboratoire de Math´ematiquesPaul Painlev´e,UMR CNRS 8524, Universit´eLille 1, 04 Nov 2015i. F. Chamroukhi and H. Glotin. Mixture model-based functional discriminant analysis for curve classifica- tion. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 1–8, Brisbane, Australia, June 2012a. F. Chamroukhi and H. Glotin. Mixture model-based functional discriminant analysis for curve clas- sification. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), June 2012b. F. Chamroukhi and H. Glotin. Unsupervised learning from big bioacoustic data (uLearnBio). Proceedings of the uLearnBio workshop of the International Conference on Machine Learning (ICML), 2014. ISBN 979-10-90821-06-4. home page. F. Chamroukhi, A. Sam´e,and P. Aknin. R´egression`avariable latente pour la mod´elisationde signaux de manœuvres d’aiguillage. In In J. Marais and editors M. Berbineau, editors, Actes INRETS : Communiquer, naviguer, surveiller. Innovations pour des transports plus sˆurs,plus efficaces et plus attractifs, volume 112, pages 57–64, 2008a. F. Chamroukhi, A. Sam´e,and P. Aknin. Switch mechanism diagnosis using a pattern recognition ap- proach. In The 4th IET International Conference on Railway Condition Monitoring RCM (IEEE), pages 1–4, Derby, UK, June 2008b. IEEE. F. Chamroukhi, A. Sam´e,and P. Aknin. A probabilistic approach for the classification of railway switch operating states. In The VIth International Conference on Condition Monitoring and Machinery Failure Prevention Technologies, Dublin, UK, 2009a. IEEE. F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. A regression model with a hidden logistic process for signal parameterization. Proceedings of XVIIth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pages 503–508, 2009b. F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. A regression model with a hidden logistic process for feature extraction from time series. In International Joint Conference on Neural Networks (IJCNN), pages 489–496, Atlanta, GA, June 2009c. F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. Time series modeling by a regression approach based on a latent process. Neural Networks, 22(5-6):593–602, 2009d. URL http://chamroukhi.univ-tln. fr/papers/Chamroukhi_Neural_Networks_2009.pdf. F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. A hidden process regression model for functional data description. Application to curve discrimination. Neurocomputing, 73(7-9):1210–1221, 2010. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_neucomp_2010.pdf. F. Chamroukhi, A. Sam´e,P. Aknin, and G. Govaert. Model-based clustering with Hidden Markov Model regression for time series with regime changes. 
In Proceedings of the International Joint Conference on Neural Networks (IJCNN), IEEE, pages 2814–2821, Jul-Aug 2011a.
F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin. A dynamic probabilistic modeling of railway switches operating states. In Proceedings of the 9th World Congress on Railway Research (WCRR), Lille, May 2011b.
F. Chamroukhi, A. Samé, G. Govaert, and P. Aknin. Modèle à processus latent et algorithme EM pour la régression non linéaire. Revue des Nouvelles Technologies de l'Information (RNTI), S1:15–32, Jan 2011c. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_same_govaert_aknin_rnti.pdf.
F. Chamroukhi, H. Glotin, and C. Rabouy. Functional Mixture Discriminant Analysis with hidden process regression for curve classification. In Proceedings of XXth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pages 281–286, Bruges, Belgium, April 2012a.
F. Chamroukhi, H. Glotin, and C. Rabouy. Functional Mixture Discriminant Analysis with hidden process regression for curve classification. In Proceedings of XXth European Symposium on Artificial Neural Networks ESANN, pages 281–286, Bruges, Belgium, April 2012b. F. Chamroukhi, A. Sam´e,G. Govaert, and P. Aknin. Classification automatique de donn´eestemporelles en classes ordonn´ees.In Actes des 44 `emeJourn´eesde Statistique, Bruxelles, Belgique, Mai 2012c. F. Chamroukhi, H. Glotin, and A. Sam´e. Model-based functional mixture discriminant analysis with hidden process regression for curve classification. Neurocomputing, 112:153–163, 2013a. URL http: //chamroukhi.univ-tln.fr/papers/chamroukhi_et_al_neucomp2013a.pdf. F. Chamroukhi, D. Trabelsi, S. Mohammed, L. Oukhellou, and Y. Amirat. Joint segmentation of mul- tivariate time series with hidden process regression for human activity recognition. Neurocomputing, 120:633–644, November 2013b. URL http://chamroukhi.univ-tln.fr/papers/chamroukhi_et_al_ neucomp2013b.pdf. F. Chamroukhi, M. Bartcus, and H. Glotin. Bayesian non-parametric parsimonious Gaussian mixture for clustering. In Proceedings of 22nd International Conference on Pattern Recognition (ICPR), Stockholm, August 2014a. F. Chamroukhi, Marius Bartcus, and Herve Glotin. Bayesian non-parametric parsimonious clustering. In Proceedings of 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges, Belgium, April 2014b. F. Chamroukhi, M. Bartcus, and H. Glotin. Dirichlet Process Parsimonious Gaussian Mixture for clus- tering. arXiv:1501.03347, 2015. URL http://arxiv.org/pdf/1501.03347.pdf. Submitted. K. Chen, L. Xu, and H. Chi. Improved learning algorithms for mixture of experts in multiclass classifi- cation. Neural Networks, 12(9):1229–1252, 1999a. S. Chen, D. Donoho, and M. Saunders. Atomic decomposition by basis pursuit,. SIAM Journal on Scientific Computing, 1999b. Raymond J. Cho, Michael J. Campbell, Elizabeth A. Winzeler, Lars Steinmetz, Andrew Conway, Lisa Wodicka, Tyra G. Wolfsberg, Andrei E. Gabrielian, David Landsman, David J. Lockhart, and Ronald W. Davis. A Genome-Wide Transcriptional Analysis of the Mitotic Cell Cycle. Molecular Cell, 2(1):65–73, 1998. Elizabeth A. Cohen. Some effects of inharmonic partials on interval perception. Music Perception, 1, 1984. Sophie Dabo-Niang, Fr´ed´ericFerraty, and Philippe Vieu. On the using of modal curves for radar wave- forms classification. Computational Statistics & Data Analysis, 51(10):4878 – 4890, 2007. C. Deboor. A Practical Guide to Splines. Springer-Verlag, 1978. A. Delaigle, P. Hall, and N. Bathia. Componentwise classification and clustering of functional data. Biometrika, 99(2):299–313, 2012. A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of The Royal Statistical Society, B, 39(1):1–38, 1977. Wayne DeSarbo and William Cron. A maximum likelihood methodology for clusterwise linear regression. Journal of Classification, 5(2):249–282, 1988. E. Devijver. Model-based clustering for high-dimensional data. Application to functional data. Technical report, D´epartement de Math´ematiques,Universit´eParis-Sud, 2014. F.X. Diebold, J.-H. Lee, and G. Weinbach. Regime Switching with Time-Varying Transition Probabilities. Nonstationary Time Series Analysis and Cointegration. (Advanced Texts in Econometrics), pages 283– 302, 1994. Jean Diebolt and C. P. Robert. 
Estimation of Finite Mixture Distributions through Bayesian Sampling. Journal of the Royal Statistical Society, Series B, 56(2):363–375, 1994.
Nicolas Dobigeon and Jean-Yves Tourneret. Bayesian orthogonal component analysis for sparse representation. IEEE Transactions on Signal Processing, 58(5):2675–2685, 2010.
Nicolas Dobigeon, Jean-Yves Tourneret, and Jeffrey D. Scargle. Joint segmentation of multivariate astronomical time series: Bayesian sampling with a hierarchical model. IEEE Transactions on Signal Processing, 55(2):414–423, 2007.
A. Drémeau and C. Herzet. Soft Bayesian pursuit algorithm for sparse representations. In IEEE International Workshop on Statistical Signal Processing (SSP'11), 2011.
Eamonn Keogh, Selina Chu, David Hart, and Michael Pazzani. Segmenting Time Series: A Survey and Novel Approach. In Data Mining in Time Series Databases, pages 1–22. World Scientific, 1993.
Michael D. Escobar. Estimating Normal Means with a Dirichlet Process Prior. Journal of the American Statistical Association, 89(425):268–277, 1994. Michael D. Escobar and Mike West. Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association, 90(430):577–588, 1994. F. Chamroukhi et al. Bayesian non-parametric models for unsupervised decomposition of whale songs. Journal of Acoustical Society of America, 2015. In preparation. Susana Faria and Gilda Soromenho. Fitting mixtures of linear regressions. Journal of Statistical Com- putation and Simulation, 80(2):201–225, February 2010. P. Fearnhead and Z. Liu. Online Inference for Multiple Changepoint Problems. Journal of the Royal Statistical Society, Series B, 69:589–605, 2007. Paul Fearnhead. Exact and efficient Bayesian inference for multiple changepoint problems. Statistics and Computing, 16:203–213, 2006. Thomas S. Ferguson. A Bayesian Analysis of Some Nonparametric Problems. The Annals of Statistics, 1(2):209–230, 1973. F. Ferraty and P. Vieu. The functional nonparametric model and application to spectrometric data. Computational Statistics, 17(4):545–564, 2002. F. Ferraty and P. Vieu. Curves discrimination: a nonparametric functional approach. Computational Statistics & Data Analysis, 44(1-2):161–173, 2003. Fr´ed´ericFerraty and Philippe Vieu. Nonparametric functional data analysis : theory and practice. Springer series in statistics, 2006. ISBN 0-387-30369-3. M´arioA. T. Figueiredo and Anil K. Jain. Unsupervised Learning of Finite Mixture Models. IEEE Transactions Pattern Analysis and Machine Intelligence, 24:381–396, 2000. C. Fraley and A. E. Raftery. Model-Based Clustering, Discriminant Analysis, and Density Estimation. Journal of the American Statistical Association, 97:611–631, 2002. C. Fraley and A. E. Raftery. Bayesian Regularization for Normal Mixture Estimation and Model-Based Clustering. Journal of Classification, 24(2):155–181, 2007. Chris Fraley and Adrian E. Raftery. Bayesian Regularization for Normal Mixture Estimation and Model- Based Clustering. Technical Report 486, Departament of Statistics, Box 354322, University of Wash- ington Seattle, WA 98195-4322 USA, August 2005. M. Fridman. Hidden Markov Model Regression. Technical report, Institute of mathematics, University of Minnesota, 1993. S. Fr¨uhwirth-Schnatter. Finite Mixture and Markov Switching Models (Springer Series in Statistics). Springer Verlag, New York, 2006. S. Fr¨uhwirth-Schnatter and S. Pyne. Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions. Biostatistics, 11(2):317–336, 2010. S. J. Gaffney. Probabilistic Curve-Aligned Clustering and Prediction with Regression Mixture Models. PhD thesis, Department of Computer Science, University of California, Irvine, 2004. S. J. Gaffney and P. Smyth. Joint probabilistic curve clustering and alignment. In In Advances in Neural Information Processing Systems (NIPS), 2004. Scott Gaffney and Padhraic Smyth. Trajectory Clustering with Mixtures of Regression Models. In Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 63–72. ACM Press, 1999. A. E. Gelfand and D. K. Dey. Bayesian Model Choice: Asymptotics and Exact Calculations. Journal of the Royal Statistical Society. Series B, 56(3):501–514, 1994. Alan E. Gelfand and Adrian F. M. Smith. Sampling-Based Approaches to Calculating Marginal Densities. 
Journal of the American Statistical Association, 85(410):398–409, June 1990. Andrew Gelman, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin. Bayesian Data Analysis. Chapman and Hall/CRC, 2003. Stuart Geman and Donald Geman. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restora- tion of Images. IEEE Trans. Pattern Anal. Mach. Intell., 6(6):721–741, November 1984. C. Geyer. Markov Chain Monte Carlo maximum likelihood. In Proceedings of the 23rd Symposium on the Interface, pages 156–163, 1991. M. Giacofci, S. Lambert-Lacroix, G. Marot, and F. Picard. Wavelet-Based Clustering for Mixed-Effects Functional Models in High Dimension. , 69(1):31–40, 2013. W. R. Gilks, S. Richardson, and D. J. Spiegelhalter. Markov Chain Monte Carlo in Practice. Chapman and Hall, London, 1996.
H. Glotin, F. Chamroukhi, and T. Maillot. Representations and Decisions in Cognitive Vision. Proceed- ings of the International Summer School ERMITES 2012, 2012. ISBN 979-10-90821-00-2. pdf. I. Gorodnitsky and D. R. Bhaskar. Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm. IEEE Transactions on Signal Processing, 45(3):600–616, 1997. G´erardGovaert and Mohamed Nadif. Co-Clustering. Computer engineering series. Wiley-ISTE, Novem- ber 2013. 256 pages. P. Green. Iteratively Reweighted Least Squares for Maximum Likelihood Estimation, and some robust and resistant alternatives. Journal of The Royal Statistical Society, B, 46(2):149–192, 1984. Peter J. Green. Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Deter- mination. Biometrika, 82:711–732, 1995. J. Gui and H. Li. Mixture functional discriminant analysis for gene function classification based on time course gene expression data. In Proc. Joint Stat. Meeting (Biometric Section), 2003. J. Hansen, R. Ruedy, J. Glascoe, and M. Sato. GISS analysis of surface temperature change. Journal of Geophysical Research, 104:30997–31022, 1999. J. Hansen, R. Ruedy, Sato M., M. Imhoff, W. Lawrence, D. Easterling, T. Peterson, and T. Karl. A closer look at United States and global surface temperature change. Journal of Geophysical Research, 106:23947–23963, 2001. T. Hastie and R. Tibshirani. Discriminant Analysis by Gaussian Mixtures. Journal of the Royal Statistical Society, B, 58:155–176, 1996. T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, Second Edition: Data Mining, Inference, and Prediction. Springer Series in Statistics. Springer, second edition edition, Jan- uary 2010. Trevor Hastie, Andreas Buja, and Robert Tibshirani. Penalized Discriminant Analysis. Annals of Statis- tics, 23:73–102, 1995. G. H´ebrail,B. Hugueney, Y. Lechevallier, and F. Rossi. Exploratory analysis of functional data via clustering and optimal segmentation. Neurocomputing, 73(7-9):1125–1141, March 2010. C. Herzet and A. Dr´emeau.Bayesian pursuit algorithms. In CoRR abs/1401.7538, 2014. G.E. Hinton and R.R. Salakhutdinov. Reducing the Dimensionality of Data with Neural Networks. Science, 313(5786):504–507, July 2006. Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. A Fast Learning Algorithm for Deep Belief Nets. Neural Computation, 18(7):1527–1554, 2006. N. Hjort, C. Holmes, P. Muller, and S. G. Waller. Bayesian Non Parametrics: Principles and practice. Cambrige University Press, 2010. J. P. Hughes, P. Guttorp, and S. P. Charles. A non-homogeneous hidden Markov model for precipitation occurrence. Applied Statistics, 48:15–30, 1999. B. Hugueney, G. H´ebrail,Y. Lechevallier, and F. Rossi. Simultaneous Clustering and Segmentation for Functional Data. In Proceedings of XVIIth European Symposium on Artificial Neural Networks (ESANN), pages 281–286, Bruges, Belgium, April 2009. DR Hunter and DS Young. Semiparametric Mixtures of Regressions. Journal of Nonparametric Statistics, 24(1):19–38, 2012. Salvatore Ingrassia, Simona Minotti, and Giorgio Vittadini. Local Statistical Modeling via a Cluster- Weighted Approach with Elliptical Distributions. Journal of Classification, 29(3):363–401, 2012. H. Ishwaren and M. Zarepour. Exact and approximate representations for the sum Dirichlet process. Canadian Journal of Statistics, 30:269–283, 2002. R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton. Adaptive Mixtures of Local Experts. Neural Computation, 3(1):79–87, 1991. 
Julien Jacques and Cristian Preda. Model-based clustering for multivariate functional data. Computa- tional Statistics & Data Analysis, 71:92–106, 2014. G. M. James and T. J. Hastie. Functional Linear Discriminant Analysis for Irregularly Sampled Curves. Journal of the Royal Statistical Society Series B, 63:533–550, 2001. G. M. James and C. Sugar. Clustering for Sparsely Sampled Functional Data. Journal of the American Statistical Association, 98(462), 2003. H. Jeffreys. Theory of Probability. Oxford, third edition, 1961. Wenxin Jiang and Martin A. Tanner. On the Identifiability of Mixtures-of-Experts. Neural Networks, 12:197–220, 1999. P. N. Jones and G. J. McLachlan. FITTING FINITE MIXTURE MODELS IN A REGRESSION CON-
TEXT. Australian Journal of Statistics, 34(2):233–240, June 1992. M. I. Jordan and R. A. Jacobs. Hierarchical Mixtures of Experts and the EM algorithm. Neural Com- putation, 6:181–214, 1994. M. I. Jordan and L. Xu. Convergence results for the EM approach to mixtures of experts architectures. Neural Networks, 8(9):1409–1431, 1995. Robert E. Kass and Adrian E. Raftery. Bayes Factors. Journal of the American Statistical Association, 90(430):773–795, 1995. Koray Kavukcuoglu. Learning Feature Hierarchies for Object Recognition. PhD thesis, Department of Computer Science, New York University, 2011. Koray Kavukcuoglu, Marc’Aurelio Ranzato, and Yann LeCun. Fast Inference in Sparse Coding Algo- rithms with Applications to Object Recognition. Technical Report CBLL-TR-2008-12-01, Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 2008. J.T. Kent, D.E. Tyler, and Y Vardi. A curious likelihood identity for the multivariate t-distribution. Communications in Statistics - Simulation and Computation, 23:441–453, 1994. Charles Kooperberg and Charles J. Stone. A study of logspline density estimation. Computational Statistics & Data Analysis, 12(3):327–347, 1991. N. M. Laird and J. H. Ware. Random-effects models for longitudinal data. Biometrics, 38(4):963–974, December 1982. M. H. C. Law, M. A. T. Figueiredo, and A. K. Jain. Simultaneous feature selection and clustering using mixture models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(9):1154–1166, September 2004. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recogni- tion. Proceedings of the IEEE, 86(11):2278–2324, Nov 1998. ISSN 0018-9219. doi: 10.1109/5.726791. Sharon X. Lee and Geoffrey J. McLachlan. On mixtures of skew normal and skew t-distributions. Advances in Data Analysis and Classification, 7(3):241–266, 2013a. Sharon X. Lee and Geoffrey J. McLachlan. Model-based clustering and classification with non-normal mixture distributions. Statistical Methods and Applications, 22(4):427–454, 2013b. Sharon X. Lee and Geoffrey J. McLachlan. Finite mixtures of multivariate skew t-distributions: some recent and new results. Statistics and Computing, 24(2):181–202, 2014. Sharon X. Lee and Geoffrey J. McLachlan. Finite mixtures of canonical fundamental skew t-distributions. Statistics and Computing (To appear), 2015. doi: 10.1007/s11222-015-9545-x. P. Lenk and W. DeSarbo. Bayesian inference for finite mixtures of generalized linear models with random effects. Psychometrika, 65(1):93–119, 2000. Steven M. Lewis and Adrian E. Raftery. Estimating Bayes Factors via Posterior Simulation with the Laplace-Metropolis Estimator. Journal of the American Statistical Association, 92:648–655, 1994. J. F-S. Lin and D. Kuli´c. Automatic Human Motion Segmentation and Identification using Feature Guided HMM for Physical Rehabilitation Exercises. In In Robotics for Neurology and Rehabilitation, Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, California, 2011. Tsung I. Lin. Robust mixture modeling using multivariate skew t distributions. Statistics and Computing, 20(3):343–356, 2010. Tsung I. Lin, Jack C. Lee, and Wan J. Hsieh. Robust mixture modeling using the skew t distribution. Statistics and Computing, 17(2):81–92, 2007a. Tsung I. Lin, Jack C. Lee, and Shu Y Yen. Finite mixture modelling using the skew normal distribution. Statistica Sinica, 17:909–927, 2007b. Yuanquing Lin. 
l1-Norm sparse Bayesian learning: Theory and applications. PhD thesis, University of Pennsylvania, 2008. Chuanhai Liu and Donald B. Rubin. ML estimation of the t distribution using EM and its extensions, ECM and ECME. Statistica Sinica, 5:19–39, 1995. X. Liu and M.C.K. Yang. Simultaneous curve registration and clustering for functional data. Computa- tional Statistics and Data Analysis, 53(4):1361–1376, 2009. N. Malfait and J. O. Ramsay. The historical functional linear model. The Canadian Journal of Statistics, 31(2), 2003. Jean-Michel Marin, Kerrie L. Mengersen, and Christian Robert. Bayesian modelling and inference on mixtures of distributions. In D. Dey and C.R. Rao, editors, Handbook of Statistics: Volume 25. Elsevier, 2005.
C. Maugis, G. Celeux, and M.-L. Martin-Magniette. Variable selection in model-based clustering: A general variable role modeling. Computational Statistics & Data Analysis, 53(11):3872–3882, 2009a.
Cathy Maugis, Gilles Celeux, and Marie-Laure Martin-Magniette. Variable Selection for Clustering with Gaussian Mixture Models. Biometrics, 65(3):701–709, 2009b.
V. E. McGee and W. T. Carleton. Piecewise regression. Journal of the American Statistical Association, 65:1109–1124, 1970.
G. J. McLachlan. On Bootstrapping the Likelihood Ratio Test Statistic for the Number of Components in a Normal Mixture. Journal of the Royal Statistical Society. Series C (Applied Statistics), 36(3):318–324, 1978.
G. J. McLachlan. The classification and mixture maximum likelihood approaches to cluster analysis. In P.R. Krishnaiah and L. Kanal, editors, Handbook of Statistics, Vol. 2, pages 199–208. Amsterdam: North-Holland, 1982.
G. J. McLachlan and T. Krishnan. The EM algorithm and extensions. New York: Wiley, second edition, 2008.
G. J. McLachlan and D. Peel. Finite mixture models. New York: Wiley, 2000.
Geoffrey J. McLachlan and David Peel. Robust Cluster Analysis Via Mixtures Of Multivariate t-Distributions. In Lecture Notes in Computer Science, pages 658–666. Springer-Verlag, 1998.
G.J. McLachlan and K.E. Basford. Mixture Models: Inference and Applications to Clustering. Marcel Dekker, New York, 1988.
X. L. Meng and D. B. Rubin. Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika, 80(2):267–278, 1993.
R. M. Neal. Probabilistic Inference Using Markov Chain Monte Carlo Methods. Technical Report CRG-TR-93-1, Dept. of Computer Science, University of Toronto, 1993.
Radford M. Neal. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9(2):249–265, 2000.
S. K. Ng, G. J. McLachlan, K. Wang, L. Ben-Tovim Jones, and S.-W. Ng. A Mixture model with random-effects components for clustering correlated gene-expression profiles. Bioinformatics, 22(14):1745–1752, 2006.
Shu-Kay Ng and Geoffrey J. McLachlan. Using the EM algorithm to train neural networks: misconceptions and a new algorithm for multiclass classification. IEEE Transactions on Neural Networks, 15(3):738–749, 2004.
S.K. Ng and G.J. McLachlan. Mixture models for clustering multilevel growth trajectories. Computational Statistics & Data Analysis, 71(0):43–51, 2014.
Hien D. Nguyen and Geoffrey J. McLachlan. Laplace mixture of linear experts. Computational Statistics & Data Analysis, 93:177–191, 2016. doi: http://dx.doi.org/10.1016/j.csda.2014.10.016.
Hien D. Nguyen, Geoffrey J. McLachlan, and Ian A. Wood. Mixtures of spatial spline regressions for clustering and classification. Computational Statistics and Data Analysis, 2014. doi: http://dx.doi.org/10.1016/j.csda.2014.01.011.
B. A. Olshausen and D. J. Field. Emergence of simple cell receptive field properties by learning a sparse code for natural images. Nature, 381:607–609, 1996.
B. A. Olshausen and D. J. Field. Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Research, 1997.
B. A. Olshausen and D. J. Field. Sparse coding of sensory inputs. Current Opinion in Neurobiology, 2004.
R. Onanena, F. Chamroukhi, L. Oukhellou, D. Candusso, P. Aknin, and D. Hissel. Supervised learning of a regression model based on latent process. Application to the estimation of fuel cell lifetime. In Proceedings of the Eighth IEEE International Conference on Machine Learning and Applications (IEEE ICMLA), pages 632–637, Miami, Florida, USA, 2009. IEEE Computer Society.
D. Ormoneit and V. Tresp. Averaging, maximum penalized likelihood and Bayesian estimation for improving Gaussian mixture probability density estimates. IEEE Transactions on Neural Networks, 9(4):639–650, 1998.
Federica Pace, Frederic Benard, Herve Glotin, Olivier Adam, and Paul White. Subunit definition and analysis for humpback whale call classification. Applied Acoustics, 71(11):1107–1112, 2010.
Ankit B. Patel, Tan Nguyen, and Richard G. Baraniuk. A Probabilistic Theory of Deep Learning. Technical Report No. 2015-1, Rice University Electrical and Computer Engineering Dept., April 2015. URL http://arxiv.org/abs/1504.00641v1.
D. Peel and G. J. Mclachlan. Robust mixture modelling using the t distribution. Statistics and Computing, 10:339–348, 2000. J. Pitman. Exchangeable and partially exchangeable random partitions. Probab. Theory Related Fields, 102(2):145–158, 1995. ISSN 0178-8051. J. Pitman. Combinatorial stochastic processes. Technical Report 621, Dept. of Statistics. UC, Berkeley, 2002. S Pyne, X Hu, K Wang, E Rossin, TI Lin, LM Maier, C Baecher-Allan, GJ McLachlan, P Tamayo, DA Hafler, PL De Jager, and JP Mesirow. Automated high-dimensional flow cytometric data analysis. Proceedings of the National Academy of Sciences USA, 106(21):8519–8524, 2009. R. E. Quandt. A new approach to estimating switching regressions. Journal of the American Statistical Association, 67(338):306–310, 1972. R. E. Quandt and J. B. Ramsey. Esimating Mixtures of Normal Distributions and Switching Regressions. Journal of the American Statistical Association, 73(364):730–738, 1978. L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989. L. R. Rabiner and B. H. Juang. An introduction to hidden Markov models. IEEE ASSP Magazine, 1986. Adrian E. Raftery and Nema Dean. Variable Selection for Model-Based Clustering. Journal of the American Statistical Association, 101:168–178, 2006. Adrian E. Raftery and Steven Lewis. How Many Iterations in the Gibbs Sampler? In In Bayesian Statistics 4, pages 763–773. Oxford University Press, 1992. J. O. Ramsay and B. W. Silverman. Applied Functional Data Analysis: Methods and Case Studies. Springer Series in Statistics. Springer, 2002. J. O. Ramsay and B. W. Silverman. Functional Data Analysis. Springer Series in Statistics. Springer, June 2005. J.O. Ramsay, T.O. Ramsay, and L.M. Sangalli. Spatial functional data analysis. In F. Ferraty, editor, Recent Advances in Functional Data Analysis and Related Topics, pages 269–275. Springer, 2011. W.M. Rand. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336):846–850, 1971. M. A. Ranzato, Y. Boureau, and Y. L. Cun. Sparse Feature Learning for Deep Belief Networks. In J.C. Platt, D. Koller, Y. Singer, and S.T. Roweis, editors, Advances in Neural Information Processing Systems (NIPS) 20, pages 1185–1192, 2008. C. Rasmussen. The Infinite Gaussian Mixture Model. Advances in neuronal Information Processing Systems, 10:554–560, 2000. Carl Edward Rasmussen and Zoubin Ghahramani. Infinite Mixtures of Gaussian Process Experts. In In Advances in Neural Information Processing Systems 14, pages 881–888. MIT Press, 2001. N. Ravi, N. Dandekar, P. Mysore, and M. Littman. Activity Recognition from Accelerometer Data. Proceedings of the National Conference on Artificial Intelligence, MIT Press, pages 1541–1546, 2005. Chandan K. Reddy, Hsiao-Dong Chiang, and Bala Rajaratnam. TRUST-TECH-Based Expectation Maximization for Learning Finite Mixture Models. IEEE Transactions Pattern Analysis and Machine Intelligence, 30(7):1146–1157, 2008. ISSN 0162-8828. Sylvia Richardson and Peter J. Green. On Bayesian Analysis of Mixtures with an Unknown Number of Components. Journal of the Royal Statistical Society, 59(4):731–792, 1997. C. Robert and G. Casella. A Short History of Markov Chain Monte Carlo: Subjective Recollections from Incomplete Data. Statistical Science, 26(1):102–115, Feb 2011. Christian P. Robert. The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation. Springer-Verlag, second edition, 2007. 
Christian P. Robert and George Casella. Monte Carlo Statistical Methods (Springer Texts in Statistics). Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1999. ISBN 1441919392.
R. Ruedy, M. Sato, and K. Lo. NASA GISS Surface Temperature (GISTEMP) Analysis. DOI: 10.3334/CDIAC/cli.001. Center for Climate Systems Research, NASA Goddard Institute for Space Studies, 2880 Broadway, New York, NY 10025, USA.
D. Ruppert, M.P. Wand, and R.J. Carroll. Semiparametric Regression. Cambridge University Press, 2003.
A. Samé, F. Chamroukhi, and P. Aknin. Détection séquentielle de défauts sur des signaux de manœuvres d'aiguillage. In Workshop Surveillance, Sûreté et Sécurité des Grands Systèmes 3SGS'08, Troyes, France, June 2008.
A. Samé, F. Chamroukhi, and G. Govaert. Algorithme EM et modèle à processus latent pour la régression
non lin´eaire.In Actes des 41`emesJourn´eesde Statistique de la SFDS, Bordeaux, 2009. A. Sam´e,F. Chamroukhi, G´erardGovaert, and P. Aknin. Model-based clustering and segmentation of time series with changes in regime. Advances in Data Analysis and Classification, 5:301–321, 2011. URL http://chamroukhi.univ-tln.fr/papers/adac-2011.pdf. J. Gershman Samuel and David M. Blei. A tutorial on Bayesian non-parametric model. Journal of Mathematical Psychology, 56:1–12, 2012. L.M. Sangalli, J.O. Ramsay, and T.O. Ramsay. Spatial spline regression models. Journal of the Royal Statistical Society (Series B), 75:681–703, 2013. G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461–464, 1978. A. J. Scott and M. J. Symons. Clustering methods based on likelihood ratio criteria. Biometrics, 27: 387–397, 1971. J. Sethuraman. A constructive definition of Dirichlet priors. Statistica Sinica, 4:639–650, 1994. J. Q. Shi and T. Choi. Gaussian Process Regression Analysis for Functional Data. Chapman & Hall/CRC Press, 2011. J. Q. Shi and B. Wang. Curve prediction and clustering with mixtures of Gaussian process functional regression models. Statistics and Computing, 18(3):267–283, 2008. J. Q. Shi, R. Murray-Smith, and D. M. Titterington. Hierarchical Gaussian process mixtures for regres- sion. Statistics and Computing, 15(1):31–41, 2005. P. Smyth. Clustering Sequences with Hidden Markov Models. In Advances in Neural Information Pro- cessing Systems 9, NIPS, pages 648–654, 1996. Hichem Snoussi and Ali Mohammad-Djafari. Penalized maximum likelihood for multivariate Gaussian mixture. pages 36–46, august 2001. Hichem Snoussi and Ali Mohammad-Djafari. Degeneracy and Likelihood Penalization in Multivariate Gaussian Mixture Models. Technical report, University of Technology of Troyes ISTIT/M2S, 2005. Weixing Song, Weixin Yao, and Yanru Xing. Robust mixture regression model fitting by Laplace distri- bution. Computational Statistics & Data Analysis, 71(0):128 – 137, 2014. C. Spearman. General intelligence, objectively determined and measured. American Journal of psychol- ogy, 15:201–293, 1904. M. Stephens. Bayesian Methods for Mixtures of Normal Distributions. PhD thesis, University of Oxford, 1997. M. Stephens. Bayesian Analysis of Mixture Models with an Unknown Number of Components – an alternative to reversible jump methods. Annals of Statistics, 28(1):40–74, 2000a. Matthew Stephens. Dealing with label switching in mixture models. Journal of the Royal Statistical Society, Series B, 62:795–809, 2000b. H. Stone. Approximation of curves by line segments. Mathematics of Computation, 15(73):40–47, 1961. Martin A. Tanner and Wing Hung Wong. The Calculation of Posterior Distributions by Data Augmen- tation. Journal of the American Statistical Association, 82(398):528–550, 1987. Robert Tibshirani. Regression Shrinkage and Selection Via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1):267–288, 1996. D. Titterington, A. Smith, and U. Makov. Statistical Analysis of Finite Mixture Distributions. John Wiley & Sons, 1985. D. Trabelsi. Contribution `ala reconnaissance non-intrusive d’activit´eshumaines. Ph.D. thesis, Universit´e Paris-Est Cr´eteil,Laboratoire Images, Signaux et Syst`emesIntelligents (LiSSi), June 2013. D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. Activity Recognition Using Hidden Markov Models. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, september 2011. D. Trabelsi, S. Mohammed, F. Chamroukhi, L. 
Oukhellou, and Y. Amirat. Supervised and unsupervised classification approaches for human activity recognition using body-mounted sensors. In Proceedings of the XXth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), pages 417–422, Bruges, Belgium, April 2012.
D. Trabelsi, S. Mohammed, F. Chamroukhi, L. Oukhellou, and Y. Amirat. An unsupervised approach for automatic activity recognition based on Hidden Markov Model Regression. IEEE Transactions on Automation Science and Engineering, 3(10):829–835, 2013. URL http://arxiv.org/pdf/1312.6965.pdf.
Richard D. De Veaux. Mixtures of linear regressions. Computational Statistics and Data Analysis, 8(3):227–245, 1989.
Geert Verbeke and Emmanuel Lesaffre. A Linear Mixed-Effects Model With Heterogeneity in the Random-Effects Population. Journal of the American Statistical Association, 91(433):217–221, 1996. K. Viele and B. Tong. Modeling with Mixtures of Linear Regressions. Statistics and Computing, 12: 315–330, 2002. A. J. Viterbi. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2):260–269, 1967. S. R. Waterhouse. Classification and regression using Mixtures of Experts. PhD thesis, Department of Engineering, Cambridge University, 1997. Steve Waterhouse, David Mackay, and Tony Robinson. Bayesian Methods for Mixtures of Experts. In NIPS, pages 351–357. MIT Press, 1996. Y. Wei. Robust mixture regression models using t-distribution. Technical report, Master Report, De- partment of Statistics, Kansas State University, 2012. David Paul Wipf. Bayesian Methods for Finding Sparse Representations. PhD thesis, University of California, San Diego, 2006. Daniela M. Witten and Robert Tibshirani. A Framework for Feature Selection in Clustering. Journal of the American Statistical Association, 105(490):713–726, 2010. F. Wood and M. J. Black. A nonparametric Bayesian alternative to spike sorting. Journal of Neuroscience Methods, 173(1):1–12, 2008. Yimin Xiong and Dit-Yan Yeung. Time series clustering with ARMA mixtures. Pattern Recognition, 37 (8):1675–1689, 2004. W Xu and D Hedeker. A random-effects mixture model for classifying treatment response in longitudinal clinical trials. Journal of Biopharmaceutical Statistics, 11(4):253–73, 2001. C. C. Yang and Y. L. Hsu. A Review of Accelerometry-Based Wearable Motion Detectors for Physical Activity Monitoring. Sensors, 10:7772–7788, 2010. Miin-Shen Yang, Chien-Yo Lai, and Chih-Ying Lin. A robust EM clustering algorithm for Gaussian mixture models. Pattern Recognition, 45(11):3950–3961, November 2012. ISSN 0031-3203. Ka Yee Yeung, Chris Fraley, A. Murua, Adrian E. Raftery, and Walter L. Ruzzo. Model-based clustering and data transformations for gene expression data. Bioinformatics, 17(10):977–987, 2001. DS Young and DR Hunter. Mixtures of Regressions with Predictor-Dependent Mixing Proportions. Computational Statistics and Data Analysis, 55(10):2253–2266, 2010. Seniha Esen Yuksel, Joseph N. Wilson, and Paul D. Gader. Twenty Years of Mixture of Experts. IEEE Trans. Neural Netw. Learning Syst., 23(8):1177–1193, 2012. C. B. Zeller, V. H. Lachos, and C.R. Cabral. Robust Mixture Regression Modelling based on Scale Mixtures of Skew-Normal Distributions. Test (revision invited), 2015. Hui Zhou, Wei Pan, and Xiaotong Shen. Penalized model-based clustering with unconstrained covariance matrices. Electronic Journal of Statistics [electronic only], 3:1473–1496, electronic only, 2009. ISSN 1935-7524.