Response Selection for End-To-End Retrieval-Based Dialogue Systems Basma El Amel Boussaha
Total Page:16
File Type:pdf, Size:1020Kb
Response Selection for End-to-End Retrieval-Based Dialogue Systems Basma El Amel Boussaha To cite this version: Basma El Amel Boussaha. Response Selection for End-to-End Retrieval-Based Dialogue Systems. Computation and Language [cs.CL]. Université de Nantes (UN), 2019. English. tel-02926608 HAL Id: tel-02926608 https://hal.archives-ouvertes.fr/tel-02926608 Submitted on 31 Aug 2020 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THÈSE DE DOCTORAT DE L’UNIVERSITE DE NANTES COMUE UNIVERSITE BRETAGNE LOIRE Ecole Doctorale N°601 Mathèmatique et Sciences et Technologies de l’Information et de la Communication Spécialité : Informatique Par Basma El Amel BOUSSAHA Response Selection for End-to-End Retrieval-Based Dialogue Systems Thèse présentée et soutenue à NANTES le 23 Octobre 2019 Unité de recherche : Laboratoire des Sciences du Numérique de Nantes (LS2N) Rapporteurs avant soutenance : Frédéric BÉCHET Professeur des universités, Aix Marseille Université, LIS Sophie ROSSET Directrice de recherche, Université Paris-Sud, LIMSI Composition du jury : Président : Yannick ESTÈVE Professeur des universités, Université d’Avignon, LIA Examinateurs : Frédéric BÉCHET Professeur des universités, Aix Marseille Université, LIS Sophie ROSSET Directrice de recherche, Université Paris-Sud, LIMSI Yannick ESTÈVE Professeur des universités, Université d’Avignon, LIA Dir. de thèse : Emmanuel MORIN Professeur des universités, Université de Nantes, LS2N Co-enc. de thèse : Christine JACQUIN Maître de Conférences, Université de Nantes, LS2N Nicolas HERNANDEZ Maître de Conférences, Université de Nantes, LS2N “ It is said that to explain is to explain away. This maxim is nowhere so well fulfilled as in the area of computer programming, especially in what is called heuristic programming and artificial intelli- gence. For in those realms machines are made to behave in wondrous ways, often sufficient to daz- zle even the most experienced observer. But once a particular program is unmasked, once its inner workings are explained in language sufficiently plain to induce understanding, its magic crumbles away; it stands revealed as a mere collection of procedures, each quite comprehensible. The ob- server says to himself "I could have written that". With that thought he moves the program in ques- tion from the shelf marked "intelligent", to that re- served for curios fit to be discussed only with peo- ple less enlightened than he.” — Weizenbaum(1966) 3 Acknowledgment I would like to express my gratitude and thanks to all the people who accompanied me throughout this thesis. I thank my supervisor Emmanuel MORIN and my co-supervisors Christine JACQUIN and Nicolas HERNANDEZ for their valuable and hard work during these three years to make this PhD succeed. I also thank them for their integrity, their support, their availability and their generosity to share their knowledge. Thank you to each of them for their time, effort and involvement which allowed me to finish my PhD on time. My sincere thanks go to Frédéric BÉCHET, Sophie ROSSET and Yannick ESTÈVE for taking part of my thesis committee and for providing valuable comments. I thank Sophie again and Hoël LE CAPITAINE for taking part of my thesis supervision committee and for their valuable remarks and questions during our annual meetings. I would also like to thank all the TALN team members who have accompanied me from near or far and shared with me this journey. Thanks to everyone I have met in the lab for the endless but always constructive debates. A special thank to Florian BOUDIN for all our interesting discussions around lunch and coffee breaks. I especially thank my family for their endless love, support and encouragement through- out the course of this research. I am also grateful to my friends for their presence, their help and their emotional support. Finally, I thank the one who believed in me, supported me and stood by my side through the best and the worst, my dearest husband Karim. 5 Contents 1 Introduction 13 1.1 Context.................................... 13 1.2 Problem and motivations........................... 17 1.3 Contributions................................. 18 1.4 Outline.................................... 19 2 Dialogue systems 21 2.1 Introduction.................................. 21 2.2 Dialogue structures and dynamics...................... 22 2.3 Categories of dialogue systems........................ 24 2.3.1 Generative dialogue systems..................... 24 2.3.2 Retrieval-based dialogue systems.................. 27 2.3.3 Hybrid dialogue systems....................... 33 2.3.4 Synthesis............................... 35 2.4 Incorporating external knowledge...................... 35 2.5 Machine learning foundation......................... 37 2.5.1 Encoder-decoder........................... 37 2.5.2 Attention mechanism......................... 38 2.6 Evaluation metrics.............................. 41 2.6.1 Generative metrics.......................... 41 2.6.2 Retrieval metrics........................... 43 2.6.3 Discussion.............................. 44 2.7 Conclusion.................................. 46 3 Resources 49 3.1 Introduction.................................. 49 3.2 Datasets.................................... 50 3.2.1 Negative sampling based datasets.................. 51 3.2.2 Human-labeled datasets....................... 56 3.3 Conclusion.................................. 61 4 Single-level context response matching system 63 4.1 Introduction.................................. 63 4.2 Approach................................... 64 4.2.1 Sequence encoding.......................... 65 7 8 Contents 4.2.2 Sequence-level similarity....................... 66 4.2.3 Response score............................ 66 4.3 Experimental setup.............................. 67 4.3.1 Baseline systems........................... 67 4.3.2 System parameters.......................... 68 4.4 Results analysis................................ 68 4.4.1 Prediction analysis.......................... 69 4.4.2 Qualitative and quantitative analysis................. 70 4.5 Incorporation of the attention mechanism.................. 72 4.6 Conclusion.................................. 74 5 Multi-level context response matching system 77 5.1 Introduction.................................. 77 5.2 Approach................................... 78 5.2.1 Word-level similarity......................... 79 5.2.2 Response score............................ 80 5.3 Experimental setup.............................. 80 5.3.1 Baseline systems........................... 80 5.3.2 System parameters.......................... 81 5.4 Results and analysis.............................. 81 5.4.1 Results................................ 82 5.4.2 Error analysis............................. 83 5.4.3 Visualization............................. 85 5.4.4 Model ablation............................ 86 5.5 Dialog System Technology Challenge (DSTC7)............... 86 5.5.1 Task description........................... 87 5.5.2 Sentence selection track....................... 87 5.5.3 System description.......................... 88 5.5.4 Experimental setup.......................... 89 5.5.5 Results and analysis......................... 91 5.6 Conclusion.................................. 96 6 Conclusion and perspectives 99 6.1 Conclusion.................................. 99 6.2 Perspectives.................................. 101 List of publications 105 Publications in international conferences...................... 105 Publications in national conferences........................ 106 Publications in international journals........................ 107 Preprints....................................... 107 Additional publications............................... 108 Bibliography 109 List of Tables 2.1 Characteristics of retrieval-based and generative dialogue systems...... 34 2.2 An example of zero BLEU score for an acceptable generated response in multi-turn dialogue system (Ghazarian et al., 2019).............. 45 3.1 Characteristics of different versions of the Ubuntu Dialogue Corpus..... 52 3.2 Characteristics of the Advising corpus..................... 53 3.3 Characteristics of the MSDialog corpus.................... 55 3.4 Characteristics of the E-commerce Dialogue Corpus............. 55 3.5 Characteristics of the Douban Corpus..................... 58 3.6 Characteristics of the AliMe Corpus (Yang et al., 2018)............ 59 3.7 Statistics on the datasets. C, R and cand. denote context, response and candidate respectively............................. 60 4.1 Evaluation results on the UDC V1 and Douban Corpus using retrieval metrics. 68 4.2 An extracted example from the test set where our system successfully re- trieved the ground truth response (written in bold).............. 69 4.3 An extracted example from the test set on which our system failed in re- trieving the ground truth response (written in bold).............. 70 4.4 Examples of agreement and disagreement between our system and the base- line system Lowe et al.(2015b). Scores in bold are the highest scores ob- tained