Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation

Prakhar Gupta♣  Yulia Tsvetkov♠  Jeffrey P. Bigham♣♦
♣Language Technologies Institute, Carnegie Mellon University
♠Paul G. Allen School of Computer Science & Engineering, University of Washington
♦Human-Computer Interaction Institute, Carnegie Mellon University
[email protected], [email protected], [email protected]

Abstract

Open-domain neural dialogue models have achieved high performance in response ranking and evaluation tasks. These tasks are formulated as a binary classification of responses given in a dialogue context, and models generally learn to make predictions based on context-response content similarity. However, over-reliance on content similarity makes the models less sensitive to the presence of inconsistencies, incorrect time expressions, and other factors important for response appropriateness and coherence. We propose approaches for automatically creating adversarial negative training data to help ranking and evaluation models learn features beyond content similarity.

… to a provided context, consisting of past dialogue turns. Dialogue ranking (Zhou et al., 2018; Wu et al., 2019) and evaluation models (Tao et al., 2018; Yi et al., 2019; Sato et al., 2020), in turn, are deployed to select and score candidate responses according to coherence and appropriateness.

Ranking and evaluation models are generally trained using true positive responses and randomly selected negative responses, which raises two issues. First, random negative candidates often have low content similarity with the context, and thus models learn to associate response coherence and appropriateness with content similarity (Yuan et al., 2019; Whang et al., 2021; Sai et al., 2020). In real systems, generated response candidates tend
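To make the standard training setup concrete, the sketch below illustrates how binary-classification pairs for a response ranker are commonly constructed with randomly sampled negatives, the practice the paper argues against. This is a minimal illustration under assumed data structures, not the authors' code; the `build_ranking_examples` helper and the toy dialogues are hypothetical.

```python
import random

def build_ranking_examples(dialogues, num_negatives=1, seed=0):
    """Build binary-classification pairs for response ranking.

    The gold next response is a positive (label 1); responses sampled
    from other dialogues serve as random negatives (label 0).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    all_responses = [d["response"] for d in dialogues]
    examples = []
    for d in dialogues:
        context, response = d["context"], d["response"]
        # Positive pair: the response that actually follows the context.
        examples.append({"context": context, "response": response, "label": 1})
        # Random negatives: drawn from unrelated dialogues. These typically
        # share little content with the context, which is why models trained
        # this way tend to equate appropriateness with content similarity.
        for _ in range(num_negatives):
            neg = rng.choice(all_responses)
            while neg == response:
                neg = rng.choice(all_responses)
            examples.append({"context": context, "response": neg, "label": 0})
    return examples

# Toy usage with made-up dialogues.
dialogues = [
    {"context": "A: How was your trip to Paris? B: It was great!",
     "response": "Did you visit the Louvre?"},
    {"context": "A: I just adopted a puppy.",
     "response": "What breed is it?"},
    {"context": "A: My flight got delayed again.",
     "response": "That's frustrating, which airline?"},
]
for ex in build_ranking_examples(dialogues, num_negatives=1):
    print(ex["label"], "|", ex["response"])
```

Because the negatives here come from unrelated conversations, a classifier can separate them from positives using lexical overlap alone; the paper's adversarial negatives are designed to remove that shortcut.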