Action Model Learning for Socio-Communicative Human Robot Interaction Ankuj Arora
Total Page:16
File Type:pdf, Size:1020Kb
Action Model Learning for Socio-Communicative Human Robot Interaction Ankuj Arora To cite this version: Ankuj Arora. Action Model Learning for Socio-Communicative Human Robot Interaction. Artifi- cial Intelligence [cs.AI]. Université Grenoble Alpes, 2017. English. NNT : 2017GREAM081. tel- 01876157 HAL Id: tel-01876157 https://tel.archives-ouvertes.fr/tel-01876157 Submitted on 18 Sep 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THÈSE Pour obtenir le grade de DOCTEUR DE LA COMMUNAUTÉ UNIVERSITÉ GRENOBLE ALPES Spécialité : Mathématiques et Informatique Arrêté ministériel : 25 mai 2016 Présentée par Ankuj ARORA Thèse dirigée par Sylvie PESTY, Professeur, UGA préparée au sein du Laboratoire Laboratoire d'Informatique de Grenoble dans l'École Doctorale Mathématiques, Sciences et technologies de l'information, Informatique Apprentissage du modèle d'action pour une interaction socio-communicative des hommes-robots Action Model Learning for Socio- Communicative Human Robot Interaction Thèse soutenue publiquement le 8 décembre 2017, devant le jury composé de : Madame SYLVIE PESTY PROFESSEUR, UNIVERSITE GRENOBLE ALPES, Directeur de thèse Monsieur CEDRIC BUCHE MAITRE DE CONFERENCES, ECOLE NATIONALE D'INGENIEURS DE BREST, Rapporteur Monsieur ALEXANDRE PAUCHET MAITRE DE CONFERENCES, INSA ROUEN, Rapporteur Monsieur DOMINIQUE DUHAUT PROFESSEUR, UNIVERSITE BRETAGNE-SUD, Président Monsieur SAMIR AKNINE PROFESSEUR, UNIVERSITE LYON 1, Examinateur Madame SOPHIE DUPUY-CHESSA PROFESSEUR, UNIVERSITE GRENOBLE ALPES, Examinateur Monsieur DAMIEN PELLIER MAITRE DE CONFERENCES, UNIVERSITE GRENOBLE ALPES, Examinateur Monsieur HUMBERT FIORINO MAITRE DE CONFERENCES, UNIVERSITE GRENOBLE ALPES, Examinateur iii University of Grenoble-Alpes Abstract Faculty Name Informatics Laboratory of Grenboble Doctor of Philosophy Action Model Learning for Socio-Communicative Human Robot Interaction by Ankuj Arora Driven with the objective of rendering robots as socio-communicative, there has been a heightened interest towards researching techniques to endow robots with social skills and “commonsense” to render them acceptable. This “commonsense” is not so common, as even a standard dialogue exchange integrates behavioral subtleties that are difficult to codify. In such a scenario, learning the behavioral model of the robot is a promising approach. This thesis tries to solve the problem of learning robot behavioral models in the Automated Plan- ning and Scheduling (APS) paradigm of AI. During the course of this thesis, we introduce various symbolic and deep learning systems, by the names SPMSAT and PDeepLearn re- spectively, which facilitate the learning of action models, and extend the scope of these new techniques to learn robot behavioral models. The long term objective is to empower robots to communicate autonomously with humans without the need of “wizard” intervention. Résumé en Français Conduite dans le but de rendre les robots comme socio-communicatifs, il y a eu un intérêt accru pour les techniques de recherche pour doter les robots de compé- tences sociales et de sens commun pour les rendre acceptables. Ce «sens commun» n’est pas si commun, car même un échange de dialogue standard intègre des subtilités comporte- mentales difficiles à codifier. Dans un tel scénario, l’apprentissage du modèle comporte- mental du robot est une approche prometteuse. Cette thèse tente de résoudre le problème de l’apprentissage des modèles comportementaux du robot dans le paradigme de planifica- tion automatisée et d’ordonnancement (APS) de l’IA. Au cours de cette thèse, nous intro- duisons divers systèmes d’apprentissage symbolique et approfondi, par les noms SPMSAT et PDeepLearn, respectivement, qui facilitent l’apprentissage des modèles d’action et étendent la portée de ces nouvelles techniques pour apprendre les modèles de comportement du robot. L’objectif à long terme d’habiliter les robots à communiquer de manière autonome avec les humains sans avoir besoin d’une intervention "wizard". v Acknowledgements Firstly, I would like to thank my supervisor Prof. Sylvie Pesty for granting me the op- portunity to work on this topic. Further, I am grateful to the MAGMA team, especially Dr. Julie Dugdale, for their continued support during the course of my thesis. I extend my grat- itude towards all the members of the ANR-SOMBRERO project as well. I would also like to thank Prof. James Crowley, Prof. Gaelle Calvary and Prof. Denis Trystram for all the love, guidance and support they have shown me during the course of my studies in Grenoble, France. Last but not the least, I would like to thank the love of my life Dr. Elisa Vitiello, mother Prabha and sister Geetika for standing by my side through all the logical and illogical deci- sions I have taken throughout my life. Especially to my mother, who has always been my pillar of strength through thick and thin. vii Contents Abstract iii Acknowledgementsv 1 Introduction1 1.1 Motivation....................................1 1.2 Problem Statement...............................3 1.3 Contribution of this Study...........................4 1.4 Thesis Outline..................................7 2 Automated Planning- Theory, Language and Applications 11 2.1 Introduction................................... 11 2.2 AP Formulation and Representation...................... 11 2.3 Languages in Automated Planning....................... 13 2.4 Applications of AP............................... 17 2.5 Automated Planning in HRI.......................... 18 2.6 Conclusion................................... 20 3 Learning Systems in Automated Planning 21 3.1 Introduction................................... 21 3.2 Machine Learning (ML) in AP......................... 21 3.3 Characteristics of Learning Systems...................... 21 3.3.1 Representation Mechanism....................... 22 3.3.2 Inputs To The Learning System.................... 22 Model.................................. 22 Background Knowledge........................ 22 Traces.................................. 23 3.3.3 System Outputs............................. 23 Action Granularity........................... 23 State Observability and Action Effects................. 23 3.3.4 Learnt Model Granularity....................... 24 3.4 Type of Learning................................ 24 3.4.1 Learning Techniques based on availability of background knowledge 24 Inductive Learning........................... 25 Analytical Learning.......................... 27 3.4.2 Genetic and Evolutionary algorithm based approaches........ 27 3.4.3 Reinforcement Learning........................ 27 Relational Reinforcement Learning.................. 28 3.4.4 Uncertainty-based techniques..................... 29 Markov Logic Networks........................ 29 Noisy Trace Treatment Approaches.................. 29 3.4.5 Surprise Based Learning (SBL).................... 30 3.4.6 Transfer Learning............................ 31 viii 3.4.7 Deep Learning............................. 32 3.4.8 MAX-SAT based approaches...................... 33 3.4.9 Supervised Learning Based Approaches................ 34 3.5 Algorithmic “Cheat Sheet”........................... 40 3.6 Our Approach vis-a-vis the Literature..................... 40 3.7 Conclusion................................... 41 4 SAT-based Techniques: SRMLearn 43 4.1 Introduction................................... 43 4.2 The SRMLearn Algorithm........................... 43 4.2.1 Generation of Traces.......................... 44 4.2.2 Annotation and Generalization..................... 45 4.2.3 Constraint Generation......................... 46 Intra-operator constraints........................ 46 Soft Constraints............................. 46 4.2.4 Constraint Resolution and Model Reconstruction........... 50 4.3 Evaluation.................................... 50 4.4 Future Perspectives: Learning temporal models................ 52 4.4.1 Temporal intra-operator and inter-operator constraints........ 55 4.5 Conclusions................................... 57 5 Connectionist Techniques: The PDeepLearn System 59 5.1 Introduction................................... 59 5.2 The PDeepLearn Algorithm.......................... 60 5.2.1 Overview and Definitions....................... 60 5.2.2 Candidate model Generation...................... 62 5.2.3 Semantic Constraint Adherence.................... 63 5.2.4 Sequence Pattern Mining........................ 64 5.2.5 Action Pair Constraint Adherence................... 64 5.2.6 LSTM-based Classification....................... 65 LSTM Background........................... 65 Data Encoding for Labelling of Action Sequences........... 66 Training and Validation Phase..................... 67 5.3 Evaluation.................................... 68 5.3.1 All Candidate Model Generation.................... 68 5.3.2 Sequence Pattern Mining........................ 69 5.3.3 Candidate Model Elimination..................... 69 5.3.4 LSTM Based Speculated Ideal Model Identification.......... 69 5.4 Conclusion................................... 71 6 Towards application of learning