Information Seeking in the Spirit of Learning: A Dataset for Conversational Curiosity

Pedro Rodriguez∗
Computer Science, University of Maryland
[email protected]

Paul Crook
Facebook
[email protected]

Seungwhan Moon
Facebook
[email protected]

Zhiguang Wang
Facebook
[email protected]

Abstract

Open-ended human learning and information-seeking are increasingly mediated by digital assistants. However, such systems often ignore the user’s pre-existing knowledge. Assuming a correlation between engagement and user responses such as “liking” messages or asking followup questions, we design a Wizard-of-Oz dialog task that tests the hypothesis that engagement increases when users are presented with facts related to what they know. Through crowd-sourcing of this experiment, we collect and release 14K dialogs (181K utterances) where users and assistants converse about geographic topics like geopolitical entities and locations. This dataset is annotated with pre-existing user knowledge, message-level dialog acts, grounding to Wikipedia, and user reactions to messages.

U: <assistant wake-word>, tell me about Tahiti.
A: It’s the largest island in French Polynesia, near the center of the Pacific
U: What is its history with France?

Figure 1: An example of information-seeking dialog that the Curiosity dataset aims to support. Assistants should answer user questions and convey information that inspires meaningful followup questions.

Theories of human learning, such as Vygotsky’s zone of proximal development, propose that learning novel information should be rooted in pre-existing knowledge and skills of the learner (Chaiklin, 2003). Considering this, a good policy may give general information about Tahiti; a better pol-