Influencing an Artificial Conversational Entity by Information Fusion
Total Page:16
File Type:pdf, Size:1020Kb
University of West Bohemia Department of Computer Science and Engineering Univerzitn´ı8 30614 Plzeˇn Czech Republic Influencing an artificial conversational entity by information fusion Technical report Ing. Jarom´ırSalamon Technical report No. DCSE/TR-2020-04 June 2020 Distribution: public Technical report No. DCSE/TR-2020-04 June 2020 Influencing an artificial conversational entity by information fusion Technical report Ing. Jarom´ırSalamon Abstract The dissertation thesis proposes a new method of using an artificial conversational enti(later also dialogue system or chatbot) influenced by information or information fusion. This new method could potentially serve various purposes of use like fitness and well-being support, mental health support, mental illness treatment, and a like. To propose such a new method, one has to investigate two main topics. Whether it is possible to influence the dialogue system by information (fusion) and what kind of data needs to be collected and prepared for such influence. The dialogue system uses textual data from conversation to determine the context of human interaction and decide about next response. It presents correct behavior without external influence with data, but with the influence, the dialogue system needs to react without hiccups in the conversation adequately. The data which could be used for the dialogue system influencing can be a combina- tion of qualitative measure (from text extracted sentiment, from voice determined tone, from face revealed emotion), and measured quantitative values (wearable mea- sured heartbeat, on-camera correctly performed exercise, based on EEG found focus on activity). All the research found in the relation of those two topics is described in next more than 200 pages. It is supported with more than 500 references from former ones up to the most recent, including elementary solutions up to state of the art. Copies of this report are available on http://www.kiv.zcu.cz/en/research/publications/ or by surface mail on request sent to the following address: University of West Bohemia Department of Computer Science and Engineering Univerzitn´ı8 30614 Plzeˇn Czech Republic Copyright © 2020 University of West Bohemia, Czech Republic Acknowledgements First of all, I would like to express my endless gratitude to my doctoral study supervisor, Dr. Roman Mouˇcek.He is supporting all my ideas and he is patient with me all the time. Moreover, he is willing to read and review all the text several times over and over. Thanks should be also given to several of my students who either worked on the topics related to my research or on the independent works specified by me. They brought me a comprehensive perspective about close research topics or use and exploration of the data which were collected during my research. Specifically, I would like to thank to: • Tom´aˇs Simandlˇ to give me an insight into the precision of the wearable devices measuring Heart Rate(HR) when I was leading and adjusting the way of his work on his bachelor thesis [1]. • Luk´aˇsLihl and the team of fellow students who implemented the extraction of Twitter and Fitbit data via API as their semester assignment. • Radek Juppa who worked on the proof of concept idea related to data streaming into a cloud storage during his work on the semester assignment. • Milan Kuda who processed the data collected during the Quasi-experiment(QX) and used them as an input to machine learning methods. I was leading and supervising his diploma thesis [2] on the topic of extracting sentiment from theHR. • Stˇep´anˇ Sevˇc´ıkandˇ Milan Tuˇslwho were challenged with the topic of influencing a chatbot with external data during their semester assignment. • Jakub Frank, Martin Ryba, and Petr Vintr who worked on the semester assignment dealing with ANT+ communication and translation ofHR data into a reading form. • Martin Ryba who is designing and implementing the idea of arm rehabilitation when video motion detection is used. • Ahmad Aldin Yusmar, the internship student from Universiti Teknologi PETRONAS, Malaysia, who continually challenged my dialogue system influencing ideas. Last but not least, I would like to express big thanks to my family, especially my wife Iva who supported me during the whole time when I was writing this thesis and also to our two children (Maya and Filip) who kept me busy and let me relax from working. iii List of Tables 2.1 Loebner Prize Summary............................ 12 2.2 Alexa Prize Challenge(APC) Summary 2017................. 13 2.3 Alexa Prize Challenge(APC) Summary 2018................. 14 2.4 Dialogue System Technology Challenges................... 16 2.5 The Conversational Intelligence Challenge.................. 16 2.6 ConvAI Summary 2017............................. 17 2.7 ConvAI Summary 2018............................. 17 5.1 Sentiment andHR in matrix representation................. 47 5.2 Stress dichotomy in matrix representation.................. 47 6.1 Slots, domain and intent parsing example for finding the flight [160].... 61 6.2 Slots, domain and intent parsing example for order first class flight [162]. 61 8.1 Examples of hard and soft targets [320].................... 94 8.2 BERT vs. other models comparison [323]................... 95 8.3 Slot filling and delexicalization example for finding the flight corresponding to Table 6.1................................... 108 iv List of Figures 1.1 Introduction to the idea of dialogue system influencing...........2 2.1 Dialogue Systems and Chatbots classification by Jurafsky et al.......8 2.2 A Survey on Dialogue Systems classification by Chen et al..........9 2.3 Cognitive-behavioral therapy diagram..................... 21 3.1 Representative information elements according to generating source (header row) and classification between hard and soft information.......... 27 4.1 Influencing data................................. 32 5.1 Data Fusion................................... 39 5.2 Sentiment data transformations........................ 41 5.3 Zero-order Hold Sentiment Reconstruction.................. 42 5.4 First-order Hold Sentiment Reconstruction.................. 42 5.5 Sentiment interpolation by splitting the interval............... 43 5.6 Sentiment interpolation using the moving window.............. 43 5.7 Heart Rate data transformations....................... 45 5.8 Six basic emotions in dimensional space.................... 48 5.9 Stress dichotomy in dimensional space.................... 48 6.1 Dialogue System - introduction........................ 51 6.2 Dialogue system pipeline architecture..................... 53 6.3 Vertically divided pipeline architecture.................... 54 6.4 Horizontally divided pipeline architecture................... 54 6.5 Principal components of a spoken dialogue system in a pipeline architecture 55 6.6 End-to-End dialogue system architecture................... 55 6.7 Horizontally divided End-to-End architecture................ 56 6.8 Retrieval Model Schema............................ 56 6.9 Generative Model Schema........................... 57 6.10 Dialogue system attributes........................... 59 6.11 Dialogue systems complexity.......................... 59 6.12 Dialogue Management Elements........................ 62 6.13 Dialogue Policy - Reward............................ 64 6.14 Illustration of trade-offs between using rule-based (template-based) vs. neural (corpus-based) text generation systems [194]............. 66 6.15 On the Rorschach's inkblot (in the middle) Norman (left) sees only death and the Standard AI (right) a nice and expected outcome.......... 71 6.16 From nice start up to racism including offensive content during less then twenty-four hours................................. 72 v LIST OF FIGURES vi 7.1 Corpora usage.................................. 73 8.1 Dialogue System - models........................... 81 8.2 Artificial Neuron Schema............................ 83 8.3 Shallow and deep learning ANN........................ 83 8.4 The repeating module in a standard RNN contains a single layer...... 84 8.5 The repeating module in an LSTM network contains four interacting layers 85 8.6 The repeating module in an GRU contains three interacting layers..... 85 8.7 Modern Word Embedding Timeline...................... 87 8.8 Modern Sentence Embedding Timeline.................... 88 8.9 Sequence to Sequence Model.......................... 89 8.10 Transformer Evolution............................. 90 8.11 Parameter counts of several recently released pre-trained language models. 93 8.12 Joint training and distillation approach to learn compact student models. 95 8.13 Dialogue systems classification......................... 96 8.14 Single-turn Response Selection Dialogue System............... 99 8.15 Self-feeding chatbot scheme.......................... 105 8.16 The Hybrid Code Network(HCN) model overview.............. 107 8.17 Overall structure of extended HCN...................... 107 8.18 Dialogue System Pipeline............................ 108 8.19 Review of NLU approaches classification................... 109 8.20 Review of DST approaches classification extended withDP approaches classification from other state of the art papers (§2.3)............ 110 8.21 Review of NLG approaches classification................... 112 9.1 Idea of dialogue system influencing technique................. 119 9.2 Dialogue system influencing logic....................... 120 9.3 Dialogue system standard influencing approach............... 121 9.4 Dialogue system extended influencing approach............... 121 9.5 Dialogue