Logic2text: High-Fidelity Natural Language Generation from Logical Forms

Logic2Text: High-Fidelity Natural Language Generation from Logical Forms Zhiyu Chen1, Wenhu Chen1, Hanwen Zha1, Xiyou Zhou1, Yunkai Zhang1, Sairam Sundaresan2, and William Yang Wang1 1University of California, Santa Barbara 2Intel AI fzhiyuchen, wenhuchen, hwzha, xiyou, yunkai zhang, [email protected], [email protected] Abstract WikiBio (Lebret et al., 2016). In real-world cases for multi-row tables, it is often more desirable and Previous studies on Natural Language Gen- eration (NLG) from structured data have pri- plausible to provide descriptions involving higher- marily focused on surface-level descriptions level logical inference across data records. For of record sequences. However, for complex example, in Figure1, instead of plain restatements, structured data, e.g., multi-row tables, it is human readers would be more favorable to abstract often desirable for an NLG system to de- descriptions that can summarize or conclude infor- scribe interesting facts from logical inferences mation over the table records. To produce such across records. If only provided with the logical-level generations of high fidelity, it is not table, it is hard for existing models to pro- yet appropriate to provide only the table as the duce controllable and high-fidelity logical generations. In this work, we formulate high- input in a real-world NLG system, based on the fidelity NLG as generation from logical forms following reasons: in order to obtain controllable and faithful 1) Low Fidelity. Given only the table, it is generations. We present a new large-scale challenging for existing neural models to produce dataset, LOGIC2TEXT, with 10,753 descrip- such logically correct generations involving rea- tions involving common logic types paired soning and symbolic calculations, e.g., max, min, with the underlying logical forms. The counting, averaging, etc. logical forms show diversified graph struc- ture of free schema, which pose great chal- 2) Uncontrollable content selection. Given lenges on the model’s ability to understand a table, the space of logically entailed descrip- the semantics. We experiment on (1) Fully- tions is exponentially large, due to vast number supervised training with the full datasets, of combinations of different operations and argu- and (2) Few-shot setting, provided with hun- ments from the table, e.g., count, comparison, dreds of paired examples; We compare sev- superlative, etc. It is hard and uncontrol- eral popular generation models and analyze lable for neural models to decide a valid, favorable their performances. We hope our dataset can encourage research towards building an ad- choice of logical selections solely based on the ta- vanced NLG system capable of natural, faith- ble, due to the difficulty of imposing high-level ful, and human-like generation. The dataset semantic constraints in the compositional genera- and code is available at https://github. tion process. arXiv:2004.14579v2 [cs.CL] 24 Sep 2020 com/czyssrs/Logic2Text. To combat with the above problems, we argue that it is necessary to leverage intermediate mean- 1 Introduction ing representations to achieve faithful and control- Natural language generation (NLG) from struc- lable logical generations. To this end, we formulate tured data has been an important research problem the task of logical-level NLG as a logical form to in many applications. Recent data-driven methods text problem. Specifically, besides the table infor- have achieved good performances on various NLG mation, the generation module is provided with tasks (Liu et al., 2018; Freitag and Roy, 2018; Chen a logical form representing the semantics of the et al., 2019b). However most studies focus on sur- target text (see Figure1 for an example). By sepa- face descriptions of simple record sequences, for rating logical reasoning and language realization, example, attribute-value pairs of fixed or very lim- the correctness of the intermediate logical form is ited schema, like E2E (Novikova et al., 2017) and guaranteed, and the challenge for the realization module is fully shifted to semantic understanding. table caption: opec joined population area (km To facilitate research in this direction, we pro- country region opec (july 2012) square) pose a new dataset named LOGIC2TEXT, consist- algeria africa 1969 37367226 2381740 ing of 5.6k open-domain tables, 10.8k manually angola africa 2007 18056072 1246700 annotated (logical form, description) pairs. Our iraq middle east 1960 31129225 437072 dataset is of high quality in terms of (1) natural and libya africa 1962 5613380 1759540 interesting descriptions; (2) accurate logical forms nigeria africa 1971 170123740 923768 ... ... ... ... ... with 100% execution correctness. In our dataset, Surface-level NLG the coarse logic types are 7 common ones to de- Description: angola, from the region africa, joined opec in scribe multi-row tables: count, superlative, 2007, with an population of 18056072 in 2012. comparative, aggregation, majority, Description: algeria, from the region africa, joined opec in 1969, with an population of 37367226 in 2012. unique, and ordinal. We employ a Python- Logical-level NLG with logical forms ( our dataset ) like program to serve as our logical forms, which logical form: eq { count { filter_eq { all_rows ; region ; africa } } can be easily converted to other types of logi- ; 4 } = True eq cal forms. Figure1 shows two examples of our count 4 dataset. Compared with previous surface-level NLG datasets, one major distinction of our dataset filter_eq is the free schema of the logical forms, which can all_rows region africa be represented as diversified graph structures. The Description: In 2012 in opec, there were 4 member countries new dataset poses great challenges on the model’s from africa. logical form: and { eq { hop { argmax { all_rows ; joined opec } ability to understand the structural semantics in ; region } ; africa } ; eq { hop { argmax { all_rows ; joined opec graph representation. } ; country } ; angola } } = True and We employ an array of popular generation eq eq models as the baseline approaches. The experi- ments are conducted in (1) Fully-supervised set- africa hop hop angola ting. We train the models using the full dataset region argmax country to analyze their performances. (2) Few-shot set- all_rows joined opec ting. We simulate the low-resource scenario in Description: In 2012 in opec, angola, from africa, was the real-world use cases. Experimental results show latest country to join. that the logical forms are critical to acquiring Figure 1: Examples of surface-level NLG compared with high-fidelity generations. The pre-trained lan- NLG with logical forms of our dataset. Here are two examples guage model outperforms other baselines (pointer- with logic type count and superlative. The function generator, graph2seq, transformer, etc.), but still nodes are in blue, and the text nodes in grey. makes factual and logical errors. In summary, our contributions are the following: the ground truth logical forms. This can be one • We propose a new large-scale dataset, direct application of our dataset. In this work, we focus on NLG. LOGIC2TEXT, with descriptions of common logic types accompanied by the underlying 2 Related Work logical forms. The logical forms present diversified graph structures, which raises more NLG from structured data or knowledge has been challenges on semantic understandings. studied for many years. There are various applications, such as the automatic generations of weather • We surveyed several popular generation mod- reports (Liang et al., 2009), sport reports (Wiseman els as the baselines under fully-supervised and et al., 2017), clinical and health reports (DiMarco few-shot settings, as well as analyze their pros et al., 2007; Lee, 2018), response generation in and cons. task-oriented dialogue systems (Wen et al., 2015; Our dataset can also be used in the reverse way Budzianowski et al., 2018; Dusekˇ et al., 2019), etc. (text to logical form) to facilitate tasks related to Traditional methods typically employ a pipeline- semantic parsing. Chen et al.(2019a) propose the based approach including content selection, plan- task of fact verification against tables, however the ning and surface realization (Reiter and Dale, 1997; performance is greatly limited due to the lack of Gatt and Krahmer, 2018). Recent data-driven methods tend to conflate the pipeline modules DeepBank (Flickinger et al., 2012), etc. In con- into one end-to-end neural networks, such as (Liu trast, our work focus on the logical formulations et al., 2018; Wiseman et al., 2017, 2018; Gong executed on database style tables, and common et al., 2019). Most recently, large-scale pre-trained symbolic operations on tables, such as count, su- models (Radford et al., 2019; Song et al., 2019; perlative, comparison. As nowadays much of the Raffel et al., 2019) have achieved new state-of- production data is stored in table based DB, we the-arts on various generation tasks. Chen et believe such a dataset should help building systems al. (2019b) demonstrate that a simple pre-training with table based data. based method can achieve very reasonable performance on the WikiBio dataset (Lebret et al., 2016) 3 Dataset Construction under few-shot setting. More recent works begin to The table source of LOGIC2TEXT is from Wik- focus on fidelity preserving of the generation, such 1 as (Dhingra et al., 2019; Tian et al., 2019). Their iTables (Bhagavatula et al., 2013), a collection work obtains good performances on surface-level of open-domain tables crawled from Wikipedia. NLG. In contrast, our work focus on the fidelity of We follow (Chen et al., 2019a) to filter out over- logical-level generations. complicated tables and take a subset of tables with less than 20 rows and 10 columns. There are a few popular NLG datasets mostly In this dataset, we start from 7 types of most on surface-level generation. Such as Weath- commonly used logics (Chen et al., 2019a) to de- erGov (Liang et al., 2009), E2E (Novikova scribe multi-row tables: count, superlative, et al., 2017), WikiBio (Lebret et al., 2016), and comparative, aggregation, majority, ToTTo (Parikh et al., 2020).

Logic2text: High-Fidelity Natural Language Generation from Logical Forms

Thinkjs 2.0 Documentation

Compiler Error Messages Considered Unhelpful: the Landscape of Text-Based Programming Error Message Research

Mapping Topic Maps to Common Logic

Towards the Second Edition of ISO 24707 Common Logic

Fundamentals of C Programming

Interoperability As Desideratum, Problem, and Process

Using Common Logic

View, Laboratory Or Research Protocol

Semanticsemantic Technologytechnology

Preface Bruce Bargmeyer

A TOOL for AURALIZED DEBUGGING by YAN CHEN a Thesis

通信英语缩语手册通信英语缩语手册A&EM Alarm & Event