Sparc: Cross-Domain Semantic Parsing in Context

SParC: Cross-Domain Semantic Parsing in Context Tao Yuy Rui Zhangy Michihiro Yasunagay Yi Chern Tany Xi Victoria Lin{ Suyi Liy Heyang Ery Irene Liy Bo Pangy Tao Cheny Emily Jiy Shreya Dixity David Proctory Sungrok Shimy Jonathan Krafty Vincent Zhangy Caiming Xiong{ Richard Socher{ Dragomir Radevy yDepartment of Computer Science, Yale University {Salesforce Research ftao.yu, r.zhang, michihiro.yasunaga, [email protected] fxilin, cxiong, [email protected] Abstract D1 : Database about student dormitory containing 5 tables. We present SParC, a dataset for cross-domain C1 : Find the first and last names of the students who are living S Par C in the dorms that have a TV Lounge as an amenity. emantic sing in ontext. It consists of 4,298 coherent question sequences (12k+ indi- Q : How many dorms have a TV Lounge? 1 vidual questions annotated with SQL queries), S : SELECT COUNT(*) FROM dorm AS T1 JOIN has_amenity 1 AS T2 ON T1.dormid = T2.dormid JOIN dorm_amenity obtained from controlled user interactions AS T3 ON T2.amenid = T3.amenid WHERE with 200 complex databases over 138 do- T3.amenity_name = 'TV Lounge' Q : What is the total capacity of these dorms? mains. We provide an in-depth analysis of 2 S : SELECT SUM(T1.student_capacity) FROM dorm AS T1 SParC and show that it introduces new chal- 2 JOIN has_amenity AS T2 ON T1.dormid = T2.dormid lenges compared to existing datasets. SParC JOIN dorm_amenity AS T3 ON T2.amenid = T3.amenid WHERE T3.amenity_name = 'TV Lounge' (1) demonstrates complex contextual depen- Q : How many students are living there? dencies, (2) has greater semantic diversity, and 3 S : SELECT COUNT(*) FROM student AS T1 JOIN lives_in (3) requires generalization to new domains 3 AS T2 ON T1.stuid = T2.stuid WHERE T2.dormid IN due to its cross-domain nature and the un- (SELECT T3.dormid FROM has_amenity AS T3 JOIN dorm_amenity AS T4 ON T3.amenid = T4.amenid WHERE seen databases at test time. We experiment T4.amenity_name = 'TV Lounge') with two state-of-the-art text-to-SQL mod- Q : Please show their first and last names. 4 els adapted to the context-dependent, cross- S : SELECT T1.fname, T1.lname FROM student AS T1 JOIN 4 lives_in AS T2 ON T1.stuid = T2.stuid WHERE domain setup. The best model obtains an T2.dormid IN (SELECT T3.dormid FROM has_amenity exact set match accuracy of 20.2% over all AS T3 JOIN dorm_amenity AS T4 ON T3.amenid = T4.amenid WHERE T4.amenity_name = 'TV Lounge') questions and less than 10% over all interaction sequences, indicating that the cross- -------------------------------------- D : Database about shipping company containing 13 tables domain setting and the contextual phenomena 2 C : Find the names of the first 5 customers. of the dataset present significant challenges 2 for future research. The dataset, baselines, Q1 : What is the customer id of the most recent customer? and leaderboard are released at https:// S : SELECT customer_id FROM customers ORDER BY 1 date_became_customer DESC LIMIT 1 yale-lily.github.io/sparc. Q2 : What is their name? S : SELECT customer_name FROM customers ORDER BY 1 Introduction 2 date_became_customer DESC LIMIT 1 Q : How about for the first 5 customers? Querying a relational database is often challeng- 3 S : SELECT customer_name FROM customers ORDER BY 3 ing and a natural language interface has long been date_became_customer LIMIT 5 regarded by many as the most powerful database interface (Popescu et al., 2003; Bertomeu et al., Figure 1: Two question sequences from the SParC 2006; Li and Jagadish, 2014). The problem of dataset. Questions (Qi) in each sequence query a mapping a natural language utterance into exe- database (Di), obtaining information sufficient to com- cutable SQL queries (text-to-SQL) has attracted plete the interaction goal (C ). Each question is anno- i increasing attention from the semantic parsing S tated with a corresponding SQL query ( i). SQL token community by virtue of a continuous effort of sequences from the interaction context are underlined. dataset creation (Zelle and Mooney, 1996; Iyyer et al., 2017; Zhong et al., 2017; Finegan-Dollak While most of these work focus on precisely et al., 2018; Yu et al., 2018a) and the modeling mapping stand-alone utterances to SQL queries, innovation that follows it (Xu et al., 2017; Wang generating SQL queries in a context-dependent et al., 2018; Yu et al., 2018b; Shi et al., 2018). scenario (Miller et al., 1996; Zettlemoyer and Collins, 2009; Suhr et al., 2018) has been studied tain information that answers the Spider question. less often. The most prominent context-dependent At the same time, the students are encouraged to text-to-SQL benchmark is ATIS1, which is set in come up with related questions which do not di- the flight-booking domain and contains only one rectly contribute to the Spider question so as to database (Hemphill et al., 1990; Dahl et al., 1994). increase data diversity. The questions were subse- In a real-world setting, users tend to ask a se- quently translated to complex SQL queries by the quence of thematically related questions to learn same student. Similar to Spider, the SQL Queries about a particular topic or to achieve a complex in SParC cover complex syntactic structures and goal. Previous studies have shown that by al- most common SQL keywords. lowing questions to be constructed sequentially, We split the dataset such that a database appears users can explore the data in a more flexible man- in only one of the train, development and test sets. ner, which reduces their cognitive burden (Hale, We provide detailed data analysis to show the rich- 2006; Levy, 2008; Frank, 2013; Iyyer et al., 2017) ness of SParC in terms of semantics, contextual and increases their involvement when interacting phenomena and thematic relations (x 4). We also with the system. The phrasing of such questions experiment with two competitive baseline models depends heavily on the interaction history (Kato to assess the difficulty of SParC (x 5). The best et al., 2004; Chai and Jin, 2004; Bertomeu et al., model achieves only 20.2% exact set matching ac- 2006). The users may explicitly refer to or omit curacy3 on all questions, and demonstrates a de- previously mentioned entities and constraints, and crease in exact set matching accuracy from 38.6% may introduce refinements, additions or substitu- for questions in turn 1 to 1.1% for questions in tions to what has already been said (Figure1). turns 4 and higher (x 6). This suggests that there is This requires a practical text-to-SQL system to ef- plenty of room for advancement in modeling and fectively process context information to synthesize learning on the SParC dataset. the correct SQL logic. To enable modeling advances in context- 2 Related Work dependent semantic parsing, we introduce SParC Context-independent semantic parsing Early (cross-domain Semantic Parsing in Context), studies in semantic parsing (Zettlemoyer and an expert-labeled dataset which contains 4,298 Collins, 2005; Artzi and Zettlemoyer, 2013; Be- coherent question sequences (12k+ questions rant and Liang, 2014; Li and Jagadish, 2014; Pa- paired with SQL queries) querying 200 complex supat and Liang, 2015; Dong and Lapata, 2016; databases in 138 different domains. The dataset is Iyer et al., 2017) were based on small and single- 2 built on top of Spider , the largest cross-domain domain datasets such as ATIS (Hemphill et al., context-independent text-to-SQL dataset available 1990; Dahl et al., 1994) and GeoQuery (Zelle and in the field (Yu et al., 2018c). The large num- Mooney, 1996). Recently, an increasing number ber of domains provide rich contextual phenom- of neural approaches (Zhong et al., 2017; Xu et al., ena and thematic relations between the questions, 2017; Yu et al., 2018a; Dong and Lapata, 2018; Yu which general-purpose natural language interfaces et al., 2018b) have started to use large and cross- to databases have to address. In addition, it en- domain text-to-SQL datasets such as WikiSQL ables us to test the generalization of the trained (Zhong et al., 2017) and Spider (Yu et al., 2018c). systems to unseen databases and domains. Most of them focus on converting stand-alone nat- We asked 15 college students with SQL expe- ural language questions to executable queries. Ta- rience to come up with question sequences over ble1 compares SParC with other semantic parsing the Spider databases (x 3). Questions in the orig- datasets. inal Spider dataset were used as guidance to the students for constructing meaningful interactions: Context-dependent semantic parsing with each sequence is based on a question in Spider and SQL labels Only a few datasets have been the student has to ask inter-related questions to ob- constructed for the purpose of mapping context- dependent questions to structured queries. 1A subset of ATIS is also frequently used in context- independent semantic parsing research (Zettlemoyer and 3Exact string match ignores ordering discrepancies of Collins, 2007; Dong and Lapata, 2016). SQL components whose order does not matter. Exact set 2The data is available at https://yale-lily. matching is able to consider ordering issues in SQL evalu- github.io/spider. ation. See more evaluation details in section 6.1. Dataset Context Resource Annotation Cross-domain SParC X database SQL X ATIS (Hemphill et al., 1990; Dahl et al., 1994) X database SQL 7 Spider (Yu et al., 2018c) 7 database SQL X WikiSQL (Zhong et al., 2017) 7 table SQL X GeoQuery (Zelle and Mooney, 1996) 7 database SQL 7 SequentialQA (Iyyer et al., 2017) X table denotation X SCONE (Long et al., 2016) X environment denotation 7 Table 1: Comparison of SParC with existing semantic parsing datasets.

Sparc: Cross-Domain Semantic Parsing in Context

Semantic Parsing for Single-Relation Question Answering

Semantic Parsing with Semi-Supervised Sequential Autoencoders

SQL Generation from Natural Language: a Sequence-To- Sequence Model Powered by the Transformers Architecture and Association Rules

Semantic Parsing with Neural Hybrid Trees

Learning Logical Representations from Natural Languages with Weak Supervision and Back-Translation

A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

Neural Semantic Parsing Graham Neubig

Practical Semantic Parsing for Spoken Language Understanding

Arxiv:2009.14116V1 [Cs.CL] 29 Sep 2020

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

On the Use of Parsing for Named Entity Recognition

Neural Semantic Parsing for Syntax-Aware Code Generation