Neural Approaches to Conversational Information Retrieval (The Information Retrieval Series)
Recent Advances in Conversational Information Retrieval (CIR): A Review of Neural Approaches
Jianfeng Gao, Chenyan Xiong, Paul Bennett (Microsoft Research)
SIGIR 2020 tutorial, July 26, 2020 (Xi'an, China)

Outline
• Part 1: Introduction
  • A short definition of CIR
  • Task-oriented dialog and Web search
  • Research tasks of CIR
• Part 2: Conversational question answering (QA) methods
• Part 3: Conversational search methods
• Part 4: Overview of public and commercial systems

Who should attend this tutorial?
• Whoever wants to understand and develop modern CIR systems that
  • can interact with users for information seeking via multi-turn dialogs
  • can answer questions
  • can help users search / look up information
  • can help users with learning and investigation tasks
  • …
• This tutorial focuses on neural approaches
  • Hybrid approaches that combine classical AI methods and deep learning methods are widely used to build real-world systems

A short definition of CIR
• A Conversational Information Retrieval (CIR) system is an information retrieval (IR) system with a conversational interface, which allows users to interact with the system to seek information via multi-turn conversations in natural language
• CIR is a task-oriented dialog system (aka a task-oriented bot): it completes tasks (e.g., information seeking) via multi-turn conversations in natural language

Classical task-oriented dialog system architecture
[Figure: the classical pipeline. A user utterance such as "Find me a restaurant serving Chinese food" passes through language understanding (words → meaning; e.g., intent: get_restaurant, food: chinese) and dialog state tracking (meaning → state); a dialog policy (action selection) consults a Web or database backend (APIs) and chooses the next action (e.g., intent: ask_slot, slot: area), which language generation turns into a response such as "Which area do you prefer?". The state tracker and policy form the Dialog Manager (DM). A minimal code sketch of this pipeline follows the example session below.]

User-System interaction in task-oriented dialog
• User: knows (roughly) what she needs, but not what is available
• System: knows what is available, but not what the user needs
• Dialog: a two-way process of getting to know each other to make a deal

A (multi-domain) dialog session
1. H: I am looking for a museum in the center of town. Do you have any suggestion?
2. M: There are 6 museums in the centre of town. How about the Broughton House Gallery at 98 King Street?
   → Belief state: {'attraction': {'type': 'museum', 'area': 'centre'}}; DB state: Attraction
3. H: Nice, can you tell me what is the phone number and address?
4. M: The phone number for the Broughton House Gallery is 01223314960 and they are located at 98 King Street.
5. H: OK thanks, can you recommend a restaurant in the same area serving Indian food?
6. M: Curry Garden serves Indian food and is located in the centre. Would you like me to make a reservation?
7. H: Sure, please book a table for two people on Thursday at 18:00.
8. M: OK, I have a table for 2 at 18:00 on Thursday. Your reference number is 1234567890. Is there anything else I can help you with?
   → Belief state: {'restaurant': {'food': 'indian', 'area': 'centre'}, 'booking': {'day': 'Thursday', 'people': '2', 'time': '18:00'}, 'attraction': {'type': 'museum', 'area': 'centre'}}; DB state: Restaurant
9. H: That's it, bye.
10. M: Have a great time and please contact us if you need any further assistance. Goodbye.
[Peng+20]
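To ground the pipeline in code, here is a minimal sketch of the four classical modules (language understanding, state tracking, policy, generation) on the restaurant example. All rules, names, and the toy database are illustrative stand-ins, not the tutorial's implementation.

```python
# A minimal sketch of the classical task-oriented dialog pipeline:
# language understanding -> dialog state tracking -> policy -> generation.
# Everything below (rules, names, toy DB) is illustrative.

RESTAURANT_DB = [
    {"name": "golden wok", "food": "chinese", "area": "centre"},
    {"name": "curry garden", "food": "indian", "area": "centre"},
]
REQUIRED_SLOTS = {"get_restaurant": ["food", "area"]}


def understand(utterance: str) -> tuple[str, dict]:
    """Toy language understanding: words -> (intent, slot values)."""
    text = utterance.lower()
    slots = {}
    for food in ("chinese", "indian"):
        if food in text:
            slots["food"] = food
    for area in ("centre", "north"):
        if area in text:
            slots["area"] = area
    return "get_restaurant", slots


def track_state(state: dict, intent: str, slots: dict) -> dict:
    """Dialog state tracking: fold the new turn into the belief state."""
    state.setdefault(intent, {}).update(slots)
    return state


def select_action(state: dict, intent: str) -> dict:
    """Dialog policy: ask for a missing slot, else answer from the DB."""
    filled = state.get(intent, {})
    missing = [s for s in REQUIRED_SLOTS[intent] if s not in filled]
    if missing:
        return {"intent": "ask_slot", "slot": missing[0]}
    hits = [r for r in RESTAURANT_DB
            if all(r[k] == v for k, v in filled.items())]
    return {"intent": "inform", "results": hits}


def generate(action: dict) -> str:
    """Language generation: system action -> words."""
    if action["intent"] == "ask_slot":
        return f"Which {action['slot']} do you prefer?"
    names = ", ".join(r["name"] for r in action["results"]) or "no match"
    return f"I found: {names}."


# One turn of the running example:
state = {}
intent, slots = understand("Find me a restaurant serving Chinese food")
state = track_state(state, intent, slots)
print(generate(select_action(state, intent)))  # -> Which area do you prefer?
```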
User-system interaction in Web search
• User: knows (roughly) what she needs, but not what is available
• System: knows what is available, but not what the user needs
• Generally viewed as a one-way information seeking process
• The user plays a proactive role, iteratively
  • issuing a query,
  • inspecting search results,
  • reformulating the query
• The system plays a passive role, making search more effective by
  • autocompleting the query
  • organizing search results (SERP)
  • suggesting related queries

The system should interact with users more actively
• How people search — information seeking
  • Information lookup: short search sessions
  • Exploratory search: a dynamic, iterative "sense-making" process in which users learn as they search and adjust their information needs as they see search results
• Effective information seeking requires interaction between the user and a system that explicitly models the interaction by
  • tracking the belief state (user intent)
  • asking clarification questions
  • providing recommendations
  • using natural language as input/output
[Hearst+11; Collins-Thompson+17; Bates 89]

A long definition of CIR — the RRIMS properties
• User Revealment: help users express their information needs
  • e.g., query suggestion, autocompletion
• System Revealment: reveal to users what is available, and what the system can or cannot do
  • e.g., recommendation, SERP
• Mixed Initiative: both system and user can take the initiative (two-way conversation)
  • e.g., asking clarification questions
• Memory: users can reference past statements
  • state tracking
• Set Retrieval: the system can reason about the utility of sets of complementary items
  • task-oriented, contextual search or QA
[Radlinski & Craswell 17]

CIR research tasks (task-oriented dialog modules)
• What we will cover in this tutorial:
  • conversational query understanding (LU, belief state tracking)
  • conversational document ranking (database state tracking)
  • learning to ask clarification questions (action selection via dialog policy, LG)
  • conversational leading suggestions (action selection via dialog policy, LG)
  • search result presentation (response generation, LG)
• Early work on CIR [Croft's keynote at SIGIR-19]
• We start with conversational QA, which is a sub-task of CIR

Outline
• Part 1: Introduction
• Part 2: Conversational QA methods
  • Conversational QA over knowledge bases
  • Conversational QA over texts
• Part 3: Conversational search methods
• Part 4: Case study of commercial systems

Conversational QA over Knowledge Bases
• Knowledge bases and QA
• C-KBQA system architecture
  • Semantic parser
  • Dialog manager
  • Response generation
• KBQA w/o semantic parser
• Open benchmarks

Knowledge bases
• Relational databases
  • entity-centric knowledge bases
  • Q: what super-hero from Earth appeared first?
• Knowledge graphs (KGs)
  • properties of billions of entities, and the relations among them
  • stored as (relation, subject, object) tuples
  • e.g., Freebase, FB Entity Graph, MS Satori, Google KG
  • Q: what is Obama's citizenship?
• KGs work with paths, while DBs work with sets (a toy tuple-lookup sketch follows the next slide)
[Iyyer+18; Gao+19]

Question-answer pairs
• Simple questions
  • can be answered from a single tuple
  • ask for the object, the subject, or the relation
• Complex questions
  • require reasoning over one or more tuples
  • logical / quantitative / comparative
• Sequential QA pairs
  • a sequence of related QA pairs
  • ellipsis, coreference, clarifications, etc.
[Saha+18]
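To make "simple question = single-tuple lookup" concrete, here is a toy sketch over (relation, subject, object) tuples. The tiny KG and the function name are illustrative, not from the tutorial.

```python
# Toy sketch: answering a simple question from (relation, subject, object)
# tuples in a knowledge graph. The KG contents are illustrative.

KG = [
    ("citizenship", "Barack Obama", "United States"),
    ("spouse", "Barack Obama", "Michelle Obama"),
]

def lookup(relation: str, subject: str) -> list[str]:
    """A simple question maps to a single-tuple (one-hop path) lookup."""
    return [obj for (rel, subj, obj) in KG
            if rel == relation and subj == subject]

# "What is Obama's citizenship?" -> relation + subject; the answer is the object.
print(lookup("citizenship", "Barack Obama"))  # ['United States']
```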
C-KBQA system architecture
[Figure: C-KBQA architecture. User turns such as "Find me the Bill Murray's movie." and "When was it released?" go to the semantic parser, which produces a logic form, e.g., Select Movie Where {direct = Bill Murray}, used to query the KB; the dialog state tracker and the dialog policy (action selection) form the dialog manager, and the response generator produces the reply.]
• Semantic parser
  • maps the input (plus dialog context) to a semantic representation (logic form)
  • queries the KB
• Dialog manager
  • maintains/updates the dialog state and history (e.g., QA pairs, DB state) via the dialog state tracker
  • selects the next system action (e.g., ask a clarification question, answer) via the dialog policy
• Response generator
  • converts the system action into a natural language response
• KB search
[Gao+19]

Dynamic Neural Semantic Parser (DynSP)
• Given a question (and the dialog history) and a table
  • Q: "which superheroes came from Earth and first appeared after 2009?"
• Generate a semantic parse (SQL-like query) consisting of
  • a select statement (the answer column)
  • zero or more conditions, each containing
    • a condition column
    • an operator (=, >, <, argmax, etc.) and its arguments
• Example
  • Q: Select Character Where {Home World = "Earth"} & {First Appeared > "2009"}
  • A: {Dragonwing, Harmonia}
• (Toy sketches of the parse, the state scoring, the reward, and the parameter update follow at the end of this part.)
[Iyyer+18; Andreas+16; Yih+15]

Model formulation
• Parsing as a state-action search problem
  • a state S is a complete or partial parse (an action sequence)
  • an action A is an operation that extends a parse
  • parsing searches for an end state with the highest score
• Example: "which superheroes came from Earth and first appeared after 2009?"
  • (A1) Select-column: Character
  • (A2) Cond-column: Home World
  • (A3) Op-Equal: "Earth"
  • (A2) Cond-column: First Appeared
  • (A5) Op-GT: "2009"
[Figure: the action types with the number of action instances in each type (numbers/datetimes are the mentions discovered in the question), and the possible action transitions based on their types; shaded circles are end states.]
[Iyyer+18; Andreas+16; Yih+15]

How to score a state (parse)?
• Beam search finds the highest-scored parse (end state)
  • V_θ(S_t) = V_θ(S_{t−1}) + π_θ(S_{t−1}, A_t), with V_θ(S_0) = 0
• Policy function π_θ(S, A)
  • scores an action given the current state
  • parameterized using different neural networks, one per action type
  • e.g., a Select-column action is scored by the semantic similarity between the question words W_q and the column-name words W_c (as embedding vectors):
    (1/|W_c|) Σ_{w_c ∈ W_c} max_{w_q ∈ W_q} w_q^T w_c
[Iyyer+18; Andreas+16; Yih+15]

Model learning
• State value function: V_θ(S_t) = Σ_{i=1}^{t} π_θ(S_{i−1}, A_i)
  • an end-to-end trainable, question-specific neural network model
• Weakly supervised learning setting
  • question-answer pairs are available
  • the correct parse for each question is not
• Issue of delayed (sparse) reward
  • the reward is only available after we obtain a complete parse and its answer
• Approximate (dense) reward
  • check the overlap between the answers A(S) of a partial parse S and the gold answers A*:
    R(S) = |A(S) ∩ A*| / |A(S) ∪ A*|
[Iyyer+18; Andreas+16; Yih+15]

Parameter updates
• Make the state value function V_θ behave similarly to the reward R
• For every state S and its (approximated) reference state S*, define the loss
  ℒ(S) = (V_θ(S) − V_θ(S*)) − (R(S) − R(S*))
• Improve learning efficiency by finding the most violated state Ŝ
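First, the parse itself. A minimal sketch of DynSP's running example as an action sequence rendered into the SQL-like form above; the action names and data structures are illustrative, not the paper's code.

```python
# Toy sketch: a DynSP-style parse as an action sequence (illustrative names).

parse = [
    ("select_column", "Character"),
    ("cond_column",   "Home World"),
    ("op_equal",      "Earth"),
    ("cond_column",   "First Appeared"),
    ("op_gt",         "2009"),
]

def render(actions) -> str:
    """Render an action sequence as the SQL-like form used above."""
    select = next(arg for act, arg in actions if act == "select_column")
    conds, col = [], None
    for act, arg in actions:
        if act == "cond_column":
            col = arg                       # remember the condition column
        elif act == "op_equal":
            conds.append(f'{{{col} = "{arg}"}}')
        elif act == "op_gt":
            conds.append(f'{{{col} > "{arg}"}}')
    return f"Select {select} Where " + " & ".join(conds)

print(render(parse))
# Select Character Where {Home World = "Earth"} & {First Appeared > "2009"}
```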
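Next, state scoring. The sketch below implements the value recursion V_θ(S_t) = V_θ(S_{t−1}) + π_θ(S_{t−1}, A_t) and the Select-column similarity from the slide; the random embeddings are illustrative stand-ins for DynSP's learned, per-action-type networks.

```python
import numpy as np

# Toy sketch of the DynSP state score. pi for a Select-column action is the
# average over column-name words of their best match (dot product) against
# the question words, as on the slide:
#   (1/|W_c|) * sum_{w_c in W_c} max_{w_q in W_q} w_q^T w_c
# Random embeddings below stand in for learned ones.

EMB: dict[str, np.ndarray] = {}

def emb(word: str) -> np.ndarray:
    if word not in EMB:
        EMB[word] = np.random.default_rng(abs(hash(word)) % 2**32).normal(size=8)
    return EMB[word]

def select_column_score(question_words: list[str], column_name: str) -> float:
    col_words = column_name.lower().split()
    return float(np.mean([
        max(emb(wq) @ emb(wc) for wq in question_words) for wc in col_words
    ]))

def state_value(prev_value: float, action_score: float) -> float:
    """V(S_t) = V(S_{t-1}) + pi(S_{t-1}, A_t), with V(S_0) = 0."""
    return prev_value + action_score

q = "which superheroes came from earth".split()
v1 = state_value(0.0, select_column_score(q, "Character"))
```

During beam search, only the top-k partial parses by V_θ are kept at each step, so the per-action scores directly steer the search.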
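The approximate dense reward is the Jaccard overlap between the partial parse's answers and the gold answers, and transcribes directly:

```python
def approx_reward(pred_answers: set, gold_answers: set) -> float:
    """R(S) = |A(S) ∩ A*| / |A(S) ∪ A*|, as defined on the slide."""
    union = pred_answers | gold_answers
    return len(pred_answers & gold_answers) / len(union) if union else 0.0

print(approx_reward({"Dragonwing", "Harmonia"}, {"Dragonwing"}))  # 0.5
```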
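Finally, the parameter update. A sketch of selecting the most violated state Ŝ under the slide's loss; hinging the loss at zero is an assumption about how the margin is enforced (one common instantiation), not something stated on the truncated slide.

```python
# Sketch of the update around the most violated state (illustrative).
# Loss from the slide: L(S) = (V(S) - V(S*)) - (R(S) - R(S*)); training
# picks S_hat = argmax_S L(S) among states found by beam search.

def loss(v_s: float, v_ref: float, r_s: float, r_ref: float) -> float:
    # Hinge at zero: assumed, so states already ranked correctly incur no loss.
    return max(0.0, (v_s - v_ref) - (r_s - r_ref))

def most_violated(candidates: list[tuple[float, float]],
                  reference: tuple[float, float]) -> tuple[float, float]:
    """candidates/reference are (value, reward) pairs from beam search."""
    v_ref, r_ref = reference
    return max(candidates, key=lambda vr: loss(vr[0], v_ref, vr[1], r_ref))
```

Minimizing this loss over Ŝ pushes the value difference V_θ(S) − V_θ(S*) toward the reward difference R(S) − R(S*), which is exactly the "make V_θ behave similarly to R" objective on the slide.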