Multiresolution Recurrent Neural Networks: an Application to Dialogue Response Generation
Iulian Vlad Serban∗◦ (University of Montreal, 2920 chemin de la Tour, Montréal, QC, Canada)
Tim Klinger (IBM Research, T. J. Watson Research Center, Yorktown Heights, NY, USA)
Gerald Tesauro (IBM Research, T. J. Watson Research Center, Yorktown Heights, NY, USA)
Kartik Talamadupula (IBM Research, T. J. Watson Research Center, Yorktown Heights, NY, USA)
Bowen Zhou (IBM Research, T. J. Watson Research Center, Yorktown Heights, NY, USA)
Yoshua Bengio†◦ (University of Montreal, 2920 chemin de la Tour, Montréal, QC, Canada)
Aaron Courville◦ (University of Montreal, 2920 chemin de la Tour, Montréal, QC, Canada)

arXiv:1606.00776v2 [cs.CL] 14 Jun 2016

Abstract

We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens. There are many ways to estimate or learn the high-level coarse tokens, but we argue that a simple extraction procedure is sufficient to capture a wealth of high-level discourse semantics. Such a procedure allows training the multiresolution recurrent neural network by maximizing the exact joint log-likelihood over both sequences. In contrast to the standard log-likelihood objective with respect to natural language tokens (word perplexity), optimizing the joint log-likelihood biases the model towards modeling high-level abstractions. We apply the proposed model to the task of dialogue response generation in two challenging domains: the Ubuntu technical support domain, and Twitter conversations. On Ubuntu, the model outperforms competing approaches by a substantial margin, achieving state-of-the-art results according to both automatic evaluation metrics and a human evaluation study.
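The joint objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the keyword set, the function names, and the per-token probabilities below are all hypothetical, and the sketch assumes only that the coarse sequence is obtained deterministically from the words and that the joint log-likelihood decomposes by the chain rule as log P(z, w) = log P(z) + log P(w | z), where z is the coarse sequence and w the word sequence.

```python
import math

# Hypothetical stand-in for the extraction procedure: keep only tokens that
# appear in an assumed keyword set (the abstract does not specify the rule).
KEYWORDS = {"install", "driver", "kernel", "reboot"}


def extract_coarse(tokens):
    """Deterministically map a word sequence to a coarse token sequence."""
    return [t for t in tokens if t in KEYWORDS]


def sequence_log_prob(step_probs):
    """Log-probability of a sequence given its per-token probabilities."""
    return sum(math.log(p) for p in step_probs)


def joint_log_likelihood(coarse_probs, word_probs):
    """Chain rule: log P(z, w) = log P(z) + log P(w | z)."""
    return sequence_log_prob(coarse_probs) + sequence_log_prob(word_probs)


words = "please install the driver and reboot".split()
coarse = extract_coarse(words)

# Made-up per-token probabilities standing in for model outputs.
coarse_probs = [0.5, 0.25, 0.5]      # one per coarse token
word_probs = [0.5] * len(words)      # one per word, conditioned on coarse
jll = joint_log_likelihood(coarse_probs, word_probs)
```

Because the coarse sequence is extracted rather than latent, both terms can be computed exactly, so no marginalization or variational bound is needed. Maximizing the sum, instead of the word term alone, is what pushes the model toward getting the high-level tokens right.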