The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

Data-to-Text Generation with Content Selection and Planning

Ratish Puduppully, Li Dong, Mirella Lapata
Institute for Language, Cognition and Computation
School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB
[email protected], [email protected], [email protected]

Abstract

Recent advances in data-to-text generation have led to the use of large-scale datasets and neural network models which are trained end-to-end, without explicitly modeling what to say and in what order. In this work, we present a neural network architecture which incorporates content selection and planning without sacrificing end-to-end training. We decompose the generation task into two stages. Given a corpus of data records (paired with descriptive documents), we first generate a content plan highlighting which information should be mentioned and in which order, and then generate the document while taking the content plan into account. Automatic and human-based evaluation experiments show that our model¹ outperforms strong baselines, improving the state-of-the-art on the recently released ROTOWIRE dataset.

[Figure 1: Diagram of our approach. (Panels: record table → Content Selection Gate → Content Planning Network → Text Decoder → game summary excerpt.)]

1 Introduction

Data-to-text generation broadly refers to the task of automatically producing text from non-linguistic input (Reiter and Dale 2000; Gatt and Krahmer 2018). The input may be in various forms including databases of records, spreadsheets, expert system knowledge bases, simulations of physical systems, and so on. Table 1 shows an example in the form of a database containing statistics on NBA basketball games, and a corresponding game summary.

Traditional methods for data-to-text generation (Kukich 1983; McKeown 1992) implement a pipeline of modules including content planning (selecting specific content from some input and determining the structure of the output text), sentence planning (determining the structure and lexical content of each sentence) and surface realization (converting the sentence plan to a surface string). Recent neural generation systems (Lebret et al. 2016; Mei et al. 2016; Wiseman et al. 2017) do not explicitly model any of these stages; rather, they are trained in an end-to-end fashion using the very successful encoder-decoder architecture (Bahdanau et al. 2015) as their backbone.

Despite producing overall fluent text, neural systems have difficulty capturing long-term structure and generating documents more than a few sentences long. Wiseman et al. (2017) show that neural text generation techniques perform poorly at content selection; they struggle to maintain inter-sentential coherence, and more generally a reasonable ordering of the selected facts in the output text. Additional challenges include avoiding redundancy and being faithful to the input. Interestingly, comparisons against template-based methods show that neural techniques do not fare well on metrics of content selection recall and factual output generation (i.e., they often hallucinate statements which are not supported by facts in the database).

In this paper, we address these shortcomings by explicitly modeling content selection and planning within a neural data-to-text architecture. Our model learns a content plan from the input and conditions on the content plan in order to generate the output document (see Figure 1 for an illustration). An explicit content planning mechanism has at least three advantages for multi-sentence document generation: it represents a high-level organization of the document structure, allowing the decoder to concentrate on the easier tasks of sentence planning and surface realization; it makes the process of data-to-document generation more interpretable by generating an intermediate representation; and it reduces redundancy in the output, since it is less likely for the content plan to contain the same information in multiple places.

We train our model end-to-end using neural networks and evaluate its performance on ROTOWIRE (Wiseman et al. 2017), a recently released dataset which contains statistics of NBA basketball games paired with human-written summaries (see Table 1). Automatic and human evaluation shows that modeling content selection and planning improves generation considerably over competitive baselines.

Copyright (c) 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

¹Our code is publicly available at https://github.com/ratishsp/data2text-plan-py.

TEAM      WIN  LOSS  PTS  FG_PCT  RB  AST  ...
Pacers     4    6     99    42    40   17  ...
Celtics    5    4    105    44    47   22  ...

PLAYER         H/V  AST  RB  PTS  FG  CITY     ...
Jeff Teague     H    4    3   20   4  Indiana  ...
Miles Turner    H    1    8   17   6  Indiana  ...
Isaiah Thomas   V    5    0   23   4  Boston   ...
Kelly Olynyk    V    4    6   16   6  Boston   ...
Amir Johnson    V    3    9   14   4  Boston   ...
...

PTS: points, FT_PCT: free-throw percentage, RB: rebounds, AST: assists, H/V: home or visiting, FG: field goals, CITY: player team city.

The Boston Celtics defeated the host Indiana Pacers 105-99 at Bankers Life Fieldhouse on Saturday. In a battle between two injury-riddled teams, the Celtics were able to prevail with a much needed road victory. The key was shooting and defense, as the Celtics outshot the Pacers from the field, from three-point range and from the free-throw line. Boston also held Indiana to 42 percent from the field and 22 percent from long distance. The Celtics also won the rebounding and assisting differentials, while tying the Pacers in turnovers. There were 10 ties and 10 lead changes, as this game went down to the final seconds. Boston (5-4) has had to deal with a gluttony of injuries, but they had the fortunate task of playing a team just as injured here. Isaiah Thomas led the team in scoring, totaling 23 points and five assists on 4-of-13 shooting. He got most of those points by going 14-of-15 from the free-throw line. Kelly Olynyk got a rare start and finished second on the team with his 16 points, six rebounds and four assists.

Table 1: Example of data-records and document summary. Entities and values corresponding to the plan in Table 2 are boldfaced.

2 Related Work

The generation literature provides multiple examples of content selection components developed for various domains which are either hand-built (Kukich 1983; McKeown 1992; Reiter and Dale 1997; Duboue and McKeown 2003) or learned from data (Barzilay and Lapata 2005; Duboue and McKeown 2001; 2003; Liang et al. 2009; Angeli et al. 2010; Kim and Mooney 2010; Konstas and Lapata 2013). Likewise, creating summaries of sports games has been a topic of interest since the early beginnings of generation systems (Robin 1994; Tanaka-Ishii et al. 1998).

Earlier work on content planning has relied on generic planners (Dale 1988), on Rhetorical Structure Theory (Hovy 1993) and on schemas (McKeown et al. 1997). Such content planners are defined by analysing target texts and devising hand-crafted rules. Duboue and McKeown (2001) study ordering constraints for content plans and in follow-on work (Duboue and McKeown 2002) learn a content planner from an aligned corpus of inputs and human outputs. A few researchers (Mellish et al. 1998; Karamanis 2004) select content plans according to a ranking function.

More recent work focuses on end-to-end systems instead of individual components. However, most models make simplifying assumptions such as generation without any content selection or planning (Belz 2008; Wong and Mooney 2007) or content selection without planning (Konstas and Lapata 2012; Angeli et al. 2010; Kim and Mooney 2010). An exception is Konstas and Lapata (2013), who incorporate content plans represented as grammar rules operating on the document level. Their approach works reasonably well with weather forecasts, but does not scale easily to larger databases with richer vocabularies and longer text descriptions. The model relies on the EM algorithm (Dempster, Laird, and Rubin 1977) to learn the weights of the grammar rules, which can be very numerous even when tokens are aligned to database records as a preprocessing step.

Our work is closest to recent neural network models which learn generators from data and accompanying text resources. Most previous approaches generate from Wikipedia infoboxes, focusing either on single sentences (Lebret et al. 2016; Chisholm et al. 2017; Sha et al. 2017; Liu et al. 2017) or short texts (Perez-Beltrachini and Lapata 2018). Mei et al. (2016) use a neural encoder-decoder model to generate weather forecasts and soccer commentaries, while Wiseman et al. (2017) generate NBA game summaries (see Table 1). They introduce a new dataset for data-to-document generation which is sufficiently large for neural network training and adequately challenging for testing the capabilities of document-scale text generation (e.g., the average summary length is 330 words and the average number of input records is 628). Moreover, they propose various automatic evaluation measures for assessing the quality of system output. Our model follows on from Wiseman et al. (2017), addressing the challenges for data-to-text generation identified in their work. We are not aware of any previous neural network-based approaches which incorporate content selection and planning mechanisms and generate multi-sentence documents. Perez-Beltrachini and Lapata (2018) introduce a content selection component (based on multi-instance learning) without content planning, while Liu et al. (2017) propose a sentence planning mechanism which orders the contents of a Wikipedia infobox so as to generate a single sentence.

3 Problem Formulation

The input to our model is a table of records (see Table 1, left-hand side). Each record r_j has four features including its type (r_{j,1}; e.g., LOSS, CITY), entity (r_{j,2}; e.g., Pacers, Miles Turner), value (r_{j,3}; e.g., 11, Indiana), and whether a player is on the home or away team (r_{j,4}; see column H/V in Table 1), represented as {r_{j,k}} for k = 1..4. The output y is a document containing words y = y_1 ... y_|y| where |y| is the document length. Figure 2 shows the overall architecture of our model, which consists of two stages: (a) content selection and planning operates on the input records of a database and produces a content plan specifying which records are to be verbalized in the document and in which order (see Table 2); and (b) text generation produces the output text given the content plan as input; at each decoding step, the generation model attends over vector representations of the records in the content plan.

Let r = {r_j} (j = 1..|r|) denote a table of input records and y the output text. We model p(y|r) as the joint probability of text y and content plan z, given input r. We further decompose p(y, z|r) into p(z|r), a content selection and planning phase, and p(y|r, z), a text generation phase:

    p(y|r) = Σ_z p(y, z|r) = Σ_z p(z|r) p(y|r, z)

In the following we explain how the components p(z|r) and p(y|r, z) are estimated.
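To make the formulation concrete, here is a minimal Python sketch (our own illustration, not from the released code) of how an input record with its four features and a content plan over record indices can be represented:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    """One input record r_j with its four features r_{j,1..4}."""
    type: str       # r_{j,1}, e.g. "TEAM-PTS" or "PTS"
    entity: str     # r_{j,2}, e.g. "Celtics" or "Isaiah Thomas"
    value: str      # r_{j,3}, e.g. "105" or "Boston"
    home_away: str  # r_{j,4}, "H" or "V"

# A few records from the box score in Table 1:
table = [
    Record("TEAM-PTS", "Celtics", "105", "V"),
    Record("TEAM-PTS", "Pacers", "99", "H"),
    Record("PTS", "Isaiah Thomas", "23", "V"),
    Record("AST", "Isaiah Thomas", "5", "V"),
]

# A content plan z is a sequence of indices into `table`;
# the model factorizes p(y, z | r) = p(z | r) * p(y | r, z).
plan = [0, 1, 2, 3]
```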

[Figure 2: Generation model with content selection and planning; the content selection gate is illustrated in Figure 3. (Panels: a content selection and planning encoder-decoder over records r_1 ... r_|r|, encoded plan vectors e_1, e_2, ..., and a text decoder with generation and copy probabilities p_gen and p_copy.)]

[Figure 3: Content selection mechanism.]

Record Encoder

The input to our model is a table of unordered records, each represented as features {r_{j,k}} for k = 1..4. Following previous work (Yang et al. 2017; Wiseman et al. 2017), we embed features into vectors, and then use a multilayer perceptron to obtain a vector representation r_j for each record:

    r_j = ReLU(W_r [r_{j,1}; r_{j,2}; r_{j,3}; r_{j,4}] + b_r)

where [;] indicates vector concatenation, W_r ∈ R^{n×4n} and b_r ∈ R^n are parameters, and ReLU is the rectifier activation function.
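A minimal PyTorch sketch of this record encoder follows; the single shared embedding table and the concrete dimensions are our own assumptions, not necessarily the paper's exact setup:

```python
import torch
import torch.nn as nn

class RecordEncoder(nn.Module):
    """r_j = ReLU(W_r [r_j1; r_j2; r_j3; r_j4] + b_r)."""
    def __init__(self, vocab_size: int, n: int):
        super().__init__()
        # One shared embedding table for type/entity/value/H-V ids
        # (separate tables per feature would also be reasonable).
        self.embed = nn.Embedding(vocab_size, n)
        self.proj = nn.Linear(4 * n, n)  # W_r, b_r

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (num_records, 4) integer feature ids
        e = self.embed(feats)                 # (num_records, 4, n)
        concat = e.flatten(start_dim=1)       # (num_records, 4n)
        return torch.relu(self.proj(concat))  # (num_records, n)

enc = RecordEncoder(vocab_size=1000, n=600)
r = enc(torch.randint(0, 1000, (628, 4)))  # 628 records, as in ROTOWIRE
```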

Content Selection Gate

The context of a record can be useful in determining its importance vis-a-vis other records in the table. For example, if a player scores many points, it is likely that other meaningfully related records such as field goals, three-pointers, or rebounds will be mentioned in the output summary. To better capture such dependencies among records, we make use of the content selection gate mechanism shown in Figure 3. We first compute attention scores α_{j,k} over the input table and use them to obtain an attentional vector r_j^att for each record r_j:

    α_{j,k} ∝ exp(r_j^T W_a r_k)
    r_j^att = Σ_{k≠j} α_{j,k} r_k

where W_a ∈ R^{n×n} is a parameter matrix. We next apply the content selection gating mechanism to r_j, and obtain the new record representation r_j^cs via:

    g_j = sigmoid(W_g [r_j; r_j^att] + b_g)
    r_j^cs = g_j ⊙ r_j

where ⊙ denotes element-wise multiplication, and gate g_j ∈ [0, 1]^n controls the amount of information flowing from r_j. In other words, each element in r_j is weighed by the corresponding element of the content selection gate g_j.
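The gate can be sketched as follows, again in PyTorch; masking the self-attention diagonal reflects the k ≠ j sum above, and the exact wiring is our assumption:

```python
import torch
import torch.nn as nn

class ContentSelectionGate(nn.Module):
    """r_j^cs = g_j * r_j with g_j = sigmoid(W_g [r_j; r_j^att] + b_g)."""
    def __init__(self, n: int):
        super().__init__()
        self.w_a = nn.Linear(n, n, bias=False)  # W_a (bilinear attention)
        self.w_g = nn.Linear(2 * n, n)          # W_g, b_g

    def forward(self, r: torch.Tensor) -> torch.Tensor:
        # r: (num_records, n)
        scores = r @ self.w_a(r).t()                   # score(j, k) = r_j^T W_a r_k
        mask = torch.eye(r.size(0), dtype=torch.bool)  # exclude k = j
        alpha = torch.softmax(scores.masked_fill(mask, float("-inf")), dim=-1)
        r_att = alpha @ r                              # r_j^att = sum_{k != j} alpha_jk r_k
        g = torch.sigmoid(self.w_g(torch.cat([r, r_att], dim=-1)))
        return g * r                                   # element-wise gating
```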

Content Planning

In our generation task, the output text is long but follows a canonical structure. Game summaries typically begin by discussing which team won/lost, follow with various statistics involving individual players and their teams (e.g., who performed exceptionally well or under-performed), and finish with any upcoming games. We hypothesize that generation would benefit from an explicit plan specifying both what to say and in which order. Our model learns such content plans from training data. However, notice that ROTOWIRE (see Table 1) and most similar data-to-text datasets do not naturally contain content plans. Fortunately, we can obtain these relatively straightforwardly following an information extraction approach (which we explain in Section 4). Suffice it to say that plans are extracted by mapping the text in the summaries onto entities in the input table, their values, and types (i.e., relations). A plan is a sequence of pointers, with each entry pointing to an input record from {r_j} (j = 1..|r|). An excerpt of a plan is shown in Table 2. The order in the plan corresponds to the sequence in which entities appear in the game summary. Let z = z_1 ... z_|z| denote the content planning sequence. Each z_k points to an input record, i.e., z_k ∈ {r_j}. Given the input records, the probability p(z|r) is decomposed as:

    p(z|r) = Π_{k=1}^{|z|} p(z_k | z_{<k}, r)

where z_{<k} = z_1 ... z_{k-1}. The plan is decoded with a pointer network (Vinyals et al. 2015) which takes the content-selected record vectors as input. The first hidden state of the decoder is initialized by avg({r_j^cs}), i.e., the average of record vectors. At decoding step k, let h_k be the hidden state of the LSTM. We model p(z_k = r_j | z_{<k}, r) as the attention over input records:

    p(z_k = r_j | z_{<k}, r) ∝ exp(h_k^T W_c r_j^cs)

Value     Entity         Type          H/V
Boston    Celtics        TEAM-CITY     V
Celtics   Celtics        TEAM-NAME     V
105       Celtics        TEAM-PTS      V
Indiana   Pacers         TEAM-CITY     H
Pacers    Pacers         TEAM-NAME     H
99        Pacers         TEAM-PTS      H
42        Pacers         TEAM-FG_PCT   H
...
Isaiah    Isaiah Thomas  FIRST_NAME    V
Thomas    Isaiah Thomas  SECOND_NAME   V
...

Table 2: Content plan for the example in Table 1.
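A sketch of the plan decoder described above, with greedy decoding in place of the beam search the paper uses at inference time (the choice of initial LSTM input is our assumption):

```python
import torch
import torch.nn as nn

class ContentPlanner(nn.Module):
    """Pointer-style plan decoder: p(z_k = r_j | z_<k, r) ∝ exp(h_k^T W_c r_j^cs)."""
    def __init__(self, n: int):
        super().__init__()
        self.lstm = nn.LSTMCell(n, n)
        self.ptr = nn.Linear(n, n, bias=False)  # W_c

    def forward(self, r_cs: torch.Tensor, steps: int):
        # r_cs: (num_records, n); first hidden state = average of record vectors
        h = r_cs.mean(dim=0)
        c = torch.zeros_like(h)
        inp, plan = h, []  # feeding h as the first input is our assumption
        for _ in range(steps):
            h, c = self.lstm(inp.unsqueeze(0), (h.unsqueeze(0), c.unsqueeze(0)))
            h, c = h.squeeze(0), c.squeeze(0)
            logits = self.ptr(r_cs) @ h  # one score per input record
            j = int(logits.argmax())     # greedy; the paper uses beam search
            plan.append(j)
            inp = r_cs[j]                # feed the pointed record back in
        return plan
```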

Text Generation

The probability of output text y conditioned on content plan z and input table r is decomposed as p(y|r, z) = Π_{t=1}^{|y|} p(y_t | y_{<t}, z, r). We adopt an encoder-decoder architecture with attention: the content plan z is encoded into a sequence of vectors {e_k} over which the text decoder attends. Let d_t be the hidden state of the decoder at time step t; the probability of generating y_t from the output vocabulary is computed via:

    β_{t,k} ∝ exp(d_t^T W_b e_k)                          (1)
    q_t = Σ_k β_{t,k} e_k
    d_t^att = tanh(W_d [d_t; q_t])
    p_gen(y_t | y_{<t}, z, r) = softmax_{y_t}(W_y d_t^att + b_y)

where W_b, W_d, W_y and b_y are parameters. Our decoder is further equipped with a copy mechanism, allowing it to copy values directly from the records in the content plan. We introduce a binary variable u_t ∈ {0, 1} for each time step, indicating whether y_t is copied from a record value (with probability p_copy) or generated from the vocabulary (with probability p_gen); the output probability is

    p(y_t | y_{<t}, z, r) = Σ_{u_t ∈ {0,1}} p(y_t, u_t | y_{<t}, z, r)

where u_t is marginalized out. We experiment with two copy variants.

Joint Copy: The probability of copying from record values and generating from the vocabulary is globally normalized (Gu et al. 2016).

Conditional Copy: The switch u_t is predicted first and then used to select between p_copy and p_gen (Gulcehre et al. 2016).
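One decoding step of the text generator might look as follows; reusing the plan attention β as the copy distribution and the single-layer switch are our simplifications, not necessarily the paper's exact design:

```python
import torch
import torch.nn as nn

class DecoderStep(nn.Module):
    """One text-generation step: attention over the encoded plan {e_k},
    a vocabulary distribution p_gen, and a copy switch p(u_t = 1)."""
    def __init__(self, n: int, vocab_size: int):
        super().__init__()
        self.w_b = nn.Linear(n, n, bias=False)  # W_b in Eq. (1)
        self.w_d = nn.Linear(2 * n, n)          # W_d
        self.out = nn.Linear(n, vocab_size)     # W_y, b_y
        self.switch = nn.Linear(n, 1)           # copy/generate gate (an assumption)

    def forward(self, d_t: torch.Tensor, e: torch.Tensor):
        # d_t: (n,) decoder hidden state; e: (plan_len, n) plan encodings
        beta = torch.softmax(self.w_b(e) @ d_t, dim=0)       # Eq. (1)
        q_t = beta @ e                                       # context vector
        d_att = torch.tanh(self.w_d(torch.cat([d_t, q_t])))  # attentional state
        p_gen = torch.softmax(self.out(d_att), dim=-1)       # vocabulary dist.
        p_copy = beta  # copy dist. over plan records (our simplification)
        p_u = torch.sigmoid(self.switch(d_att))              # p(u_t = 1)
        # Conditional copy combines these as p_u * p_copy (mapped to tokens)
        # + (1 - p_u) * p_gen.
        return p_gen, p_copy, p_u
```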

Training and Inference

Our model is trained to maximize the log-likelihood of the gold³ content plan given table records r, and of the gold output text given the content plan and table records:

    max Σ_{(r,z,y)∈D} log p(z|r) + log p(y|r, z)

where D represents training examples (input records, plans, and game summaries). During inference, the output for input r is predicted by:

    ẑ = arg max_{z'} p(z'|r)
    ŷ = arg max_{y'} p(y'|r, ẑ)

where z' and y' represent content plan and output text candidates, respectively. For each stage, we utilize beam search to approximately obtain the best results.

³Strictly speaking, the content plan is not gold since it was not created by an expert but is the output of a fairly accurate IE system.
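Assuming a model object that exposes the two stages through (purely hypothetical) log-likelihood and search methods, the training objective and two-stage inference above reduce to:

```python
def two_stage_loss(model, r, z_gold, y_gold):
    # -log p(z|r) - log p(y|r, z): the two terms of the training objective.
    # `log_p_plan` / `log_p_text` are hypothetical method names.
    return -model.log_p_plan(z_gold, r) - model.log_p_text(y_gold, r, z_gold)

def two_stage_inference(model, r, beam_size=5):
    # Stage 1: search for the best plan; stage 2: best text given that plan.
    # `beam_search_plan` / `beam_search_text` are hypothetical method names.
    z_hat = model.beam_search_plan(r, beam_size)
    y_hat = model.beam_search_text(r, z_hat, beam_size)
    return z_hat, y_hat
```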

4 Experimental Setup

Data: We trained and evaluated our model on ROTOWIRE (Wiseman et al. 2017), a dataset of basketball game summaries paired with corresponding box- and line-score tables. The summaries are professionally written, relatively well structured and long (337 words on average). The number of record types is 39, the average number of records is 628, the vocabulary size is 11.3K words and the token count is 1.6M. The dataset is ideally suited for document-scale generation. We followed the data partitions introduced in Wiseman et al. (2017): we trained on 3,398 summaries, tested on 728, and used 727 for validation.

Content Plan Extraction: We extracted content plans from the ROTOWIRE game summaries following an information extraction (IE) approach. Specifically, we used the IE system introduced in Wiseman et al. (2017), which identifies candidate entity (i.e., player, team, and city) and value (i.e., number or string) pairs that appear in the text, and then predicts the type (aka relation) of each candidate pair. For instance, in the document in Table 1, the IE system might identify the pair "Jeff Teague, 20" and then predict that their relation is "PTS", extracting the record (Jeff Teague, 20, PTS). Wiseman et al. (2017) train an IE system on ROTOWIRE by determining word spans which could represent entities (i.e., by matching them against players, teams or cities in the database) and numbers. They then consider each entity-number pair in the same sentence, and if there is a record in the database with matching entities and values, the pair is assigned the corresponding record type or otherwise given the label "none" to indicate unrelated pairs.

We adopted their IE system architecture, which predicts relations by ensembling 3 convolutional models and 3 bidirectional LSTM models. We trained this system on the training portion of the ROTOWIRE corpus.⁴ On held-out data it achieved 94% accuracy, and recalled approximately 80% of the relations licensed by the records. Given the output of the IE system, a content plan simply consists of (entity, value, record type, H/V) tuples in their order of appearance in a game summary (the content plan for the summary in Table 1 is shown in Table 2). Player names are pre-processed to indicate the individual's first name and surname (see Isaiah and Thomas in Table 2); team records are also pre-processed to indicate the name of the team's city and the team itself (see Boston and Celtics in Table 2).

⁴A bug in the code of Wiseman et al. (2017) excluded number words from the output summary. We corrected the bug and this resulted in greater recall for the relations extracted from the summaries. See the supplementary material for more details.
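A minimal sketch of this plan-assembly step; the dictionary fields (including the text positions) are invented for illustration:

```python
def build_content_plan(ie_tuples):
    """Assemble a content plan from IE output: (entity, value, record type, H/V)
    tuples ordered by where the relation was found in the game summary.
    The 'position' field standing for the match offset is our own illustration."""
    ordered = sorted(ie_tuples, key=lambda t: t["position"])
    return [(t["entity"], t["value"], t["type"], t["home_away"]) for t in ordered]

# Invented positions for the opening of the summary in Table 1:
ie_tuples = [
    {"position": 45, "entity": "Pacers",  "value": "99",      "type": "TEAM-PTS",  "home_away": "H"},
    {"position": 4,  "entity": "Celtics", "value": "Boston",  "type": "TEAM-CITY", "home_away": "V"},
    {"position": 11, "entity": "Celtics", "value": "Celtics", "type": "TEAM-NAME", "home_away": "V"},
]
print(build_content_plan(ie_tuples))
# [('Celtics', 'Boston', 'TEAM-CITY', 'V'),
#  ('Celtics', 'Celtics', 'TEAM-NAME', 'V'),
#  ('Pacers', '99', 'TEAM-PTS', 'H')]
```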
Training Configuration: We validated model hyperparameters on the development set. We did not tune the dimensions of word embeddings and LSTM hidden layers; we used the same value of 600 reported in Wiseman et al. (2017). We used one-layer pointer networks during content planning, and two-layer LSTMs during text generation. Input feeding (Luong et al. 2015) was employed for the text decoder. We applied dropout (Zaremba et al. 2014) at a rate of 0.3. Models were trained for 25 epochs with the Adagrad optimizer (Duchi et al. 2011); the initial learning rate was 0.15, learning rate decay was selected from {0.5, 0.97}, and batch size was 5. For text decoding, we made use of BPTT (Mikolov et al. 2010) and set the truncation size to 100. We set the beam size to 5 during inference. All models are implemented in OpenNMT-py (Klein et al. 2017).

5 Results

Automatic Evaluation: We evaluated model output using the metrics defined in Wiseman et al. (2017). The idea is to employ a fairly accurate IE system (see the description in Section 4) on the gold and automatic summaries and compare whether the identified relations align or diverge. Let ŷ be the gold output, and y the system output. Content selection (CS) measures how well (in terms of precision and recall) the records extracted from y match those found in ŷ. Relation generation (RG) measures the factuality of the generation system as the proportion of records extracted from y which are also found in r (in terms of precision and number of unique relations). Content ordering (CO) measures how well the system orders the records it has chosen and is computed as the normalized Damerau-Levenshtein distance between the sequences of records extracted from y and ŷ. In addition to these metrics, we report BLEU (Papineni et al. 2002), with human-written game summaries as reference.
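For concreteness, the CO metric can be computed with the standard Damerau-Levenshtein recurrence over record sequences; the normalization shown is one common choice and may differ from the official evaluation scripts:

```python
def dld(a, b):
    """Damerau-Levenshtein distance (optimal string alignment variant)."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[m][n]

def content_ordering(pred_records, gold_records):
    """Normalized DLD between two record sequences (higher is better)."""
    dist = dld(pred_records, gold_records)
    return 1.0 - dist / max(len(pred_records), len(gold_records), 1)
```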

Our results on the development set are summarized in Table 3. We compare our Neural Content Planning model (NCP for short) against the two encoder-decoder (ED) models presented in Wiseman et al. (2017) with joint copy (JC) and conditional copy (CC), respectively. In addition to our own re-implementation of these models, we include the best scores reported in Wiseman et al. (2017), which were obtained with an encoder-decoder model enhanced with conditional copy (WS-2017). Table 3 also shows results when NCP uses oracle content plans (OR) as input. In addition, we report the performance of a template-based generator (Wiseman et al. 2017) which creates a document consisting of eight template sentences: an introductory sentence (who won/lost), six player-specific sentences (based on the six highest-scoring players in the game), and a conclusion sentence.

Model     RG #   RG P%   CS P%   CS R%   CO DLD%   BLEU
TEMPL     54.29  99.92   26.61   59.16   14.42      8.51
WS-2017   23.95  75.10   28.11   35.86   15.33     14.57
ED+JC     22.98  76.07   27.70   33.29   14.36     13.22
ED+CC     21.94  75.08   27.96   32.71   15.03     13.31
NCP+JC    33.37  87.40   32.20   48.56   17.98     14.92
NCP+CC    33.88  87.51   33.52   51.21   18.57     16.19
NCP+OR    21.59  89.21   88.52   85.84   78.51     24.11

Table 3: Automatic evaluation on ROTOWIRE development set using relation generation (RG) count (#) and precision (P%), content selection (CS) precision (P%) and recall (R%), content ordering (CO) in normalized Damerau-Levenshtein distance (DLD%), and BLEU.

Model     RG #   RG P%   CS P%   CS R%   CO DLD%   BLEU
ED+CC     21.94  75.08   27.96   32.71   15.03     13.31
CS+CC     24.93  80.55   28.63   35.23   15.12     13.52
CP+CC     33.73  84.85   29.57   44.72   15.84     14.45
NCP+CC    33.88  87.51   33.52   51.21   18.57     16.19
NCP       34.46    —     38.00   53.72   20.27       —

Table 4: Ablation results on ROTOWIRE development set using relation generation (RG) count (#) and precision (P%), content selection (CS) precision (P%) and recall (R%), content ordering (CO) in normalized Damerau-Levenshtein distance (DLD%), and BLEU.

Model     RG #   RG P%   CS P%   CS R%   CO DLD%   BLEU
TEMPL     54.23  99.94   26.99   58.16   14.92      8.46
WS-2017   23.72  74.80   29.49   36.18   15.42     14.19
NCP+JC    34.09  87.19   32.02   47.29   17.15     14.89
NCP+CC    34.28  87.47   34.18   51.22   18.58     16.50

Table 5: Automatic evaluation on ROTOWIRE test set using relation generation (RG) count (#) and precision (P%), content selection (CS) precision (P%) and recall (R%), content ordering (CO) in normalized Damerau-Levenshtein distance (DLD%), and BLEU.

As can be seen, NCP improves upon vanilla encoder-decoder models (ED+JC, ED+CC), irrespective of the copy mechanism being employed. In fact, NCP achieves comparable scores with either joint or conditional copy mechanism, which indicates that it is the content planner which brings performance improvements. Overall, NCP+CC achieves the best content selection and content ordering scores, as well as the highest BLEU. Compared to the best reported system in Wiseman et al. (2017), we achieve an absolute improvement of approximately 12% in terms of relation generation; content selection precision also improves by 5% and recall by 15%, content ordering increases by 3%, and BLEU by 1.5 points. The results of the oracle system (NCP+OR) show that content selection and ordering do indeed correlate with the quality of the content plan and that any improvements in our planning component would result in better output. As far as the template-based system is concerned, we observe that it obtains low BLEU and CS precision but scores high on CS recall and RG metrics. This is not surprising, as the template system is provided with domain knowledge which our model does not have, and thus represents an upper bound on content selection and relation generation. We also measured the degree to which the game summaries generated by our model contain redundant information, as the proportion of non-duplicate records extracted from the summary by the IE system: 84.5% of the records in NCP+CC are non-duplicates, compared to Wiseman et al. (2017) who obtain 72.9%, showing that our model is less repetitive.

We further conducted an ablation study with the conditional copy variant of our model (NCP+CC) to establish whether improvements are due to better content selection (CS) and/or content planning (CP). We see in Table 4 that content selection and planning individually contribute to performance improvements over the baseline (ED+CC), and accuracy further increases when both components are taken into account. In addition, we evaluated these components on their own (independently of text generation) by comparing the output of the planner (see the p(z|r) block in Figure 2) against gold content plans obtained using the IE system (see row NCP in Table 4). Compared to the full system (NCP+CC), content selection precision and recall are higher (by 4.5% and 2%, respectively), as is content ordering (by 1.8%). In another study, we used the CS and CO metrics to measure how well the generated text follows the content plan produced by the planner (instead of arbitrarily adding or removing information).
We found that NCP+CC generates game summaries which follow the content plan closely: CS precision is higher than 85%, CS recall is higher than 93%, and CO higher than 84%. This reinforces our claim that higher accuracies in the content selection and planning phase will result in further improvements in text generation.

The test set results in Table 5 follow a pattern similar to the development set. NCP achieves higher accuracy in all metrics, including relation generation, content selection, content ordering, and BLEU, compared to Wiseman et al. (2017). We provide examples of system output in Figure 4 and the supplementary material.

Human-Based Evaluation: We conducted two human evaluation experiments using the Amazon Mechanical Turk (AMT) crowdsourcing platform. The first study assessed relation generation by examining whether improvements in relation generation attested by automatic evaluation metrics are indeed corroborated by human judgments. We compared our best performing model (NCP+CC) with gold reference summaries, a template system, and the best model of Wiseman et al. (2017).

AMT workers were presented with a specific NBA game's box score and line score, and four (randomly selected) sentences from the summary. They were asked to identify supporting and contradicting facts mentioned in each sentence. We randomly selected 30 games from the test set. Each sentence was rated by three workers.

          #Support  #Contra  Gram    Cohere  Concise
Gold        2.98      0.28   11.78   16.00   13.78
TEMPL       6.98      0.21   -0.89   -4.89    1.33
WS-2017     3.19      1.09   -4.22   -4.89   -6.44
NCP+CC      4.90      0.90   -2.44   -2.44   -3.55

Table 6: Average number of supporting (#Support) and contradicting (#Contra) facts in game summaries and best-worst scaling evaluation (higher is better) for grammaticality (Gram), Coherence (Cohere), and Conciseness (Concise).

The left two columns in Table 6 contain the average number of supporting and contradicting facts per sentence, as determined by the crowdworkers, for each model. The template-based system has the highest number of supporting facts, even compared to the human gold standard. TEMPL does not perform any content selection, it includes a large number of facts from the database, and since it does not perform any generation either, it exhibits a few contradictions. Compared to WS-2017 and the Gold summaries, NCP+CC displays a larger number of supporting facts. All models are significantly different in the number of supporting facts (#Support) from TEMPL (using a one-way ANOVA with post-hoc Tukey HSD tests).⁵ NCP+CC is significantly different from WS-2017 and Gold. With respect to contradicting facts (#Contra), Gold and TEMPL are not significantly different from each other but are significantly different from the neural systems (WS-2017, NCP+CC).

In the second experiment, we assessed the generation quality of our model. We elicited judgments for the same 30 games used in the first study. For each game, participants were asked to compare a human-written summary, NCP with conditional copy (NCP+CC), Wiseman et al.'s (2017) best model, and the template system. Our study used Best-Worst Scaling (BWS; Louviere, Flynn, and Marley 2015), a technique shown to be less labor-intensive and to provide more reliable results as compared to rating scales (Kiritchenko and Mohammad 2017). We arranged every 4-tuple of competing summaries into 6 pairs. Every pair was shown to three crowdworkers, who were asked to choose which summary was best and which was worst according to three criteria: Grammaticality (is the summary fluent and grammatical?), Coherence (is the summary easy to read? does it follow a natural ordering of facts?), and Conciseness (does the summary avoid redundant information and repetitions?). The score of a system for each criterion is computed as the difference between the percentage of times the system was selected as the best and the percentage of times it was selected as the worst (Orme 2009). The scores range from -100 (absolutely worst) to +100 (absolutely best).

⁵All significance differences reported throughout this paper are with a level less than 0.05.
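The BWS score described above reduces to a few lines of Python; the judgment input format is our assumption:

```python
from collections import Counter

def bws_scores(judgments):
    """Best-worst scaling: score = %best - %worst per system (Orme 2009).
    `judgments` is a list of ((sys_a, sys_b), best, worst) tuples, one per
    pair shown to a crowdworker; this input format is our own assumption."""
    shown, best, worst = Counter(), Counter(), Counter()
    for (a, b), bst, wst in judgments:
        shown[a] += 1
        shown[b] += 1
        best[bst] += 1
        worst[wst] += 1
    # Normalize by how often each system appeared; ranges from -100 to +100.
    return {s: 100.0 * (best[s] - worst[s]) / shown[s] for s in shown}

print(bws_scores([(("Gold", "TEMPL"), "Gold", "TEMPL"),
                  (("Gold", "WS-2017"), "Gold", "WS-2017")]))
```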
The results of the second study are also summarized in Table 6. Gold summaries were perceived as significantly better compared to the automatic systems across all criteria (again using a one-way ANOVA with post-hoc Tukey HSD tests). NCP+CC was perceived as significantly more grammatical than WS-2017 but not compared to TEMPL, which does not suffer from fluency errors since it does not perform any generation. NCP+CC was perceived as significantly more coherent than TEMPL and WS-2017. The template fares poorly on coherence; its output is stilted and exhibits no variability (see the top block in Figure 4). With regard to conciseness, the neural systems are significantly worse than TEMPL, while NCP+CC is significantly better than WS-2017. By design, the template cannot repeat information since there is no redundancy in the sentences chosen to verbalize the summary.

TEMPL: The (10–2) defeated the Boston Celtics (6–6) 104–88. Klay Thompson scored 28 points (12–21 FG, 3–6 3PT, 1–1 FT) to go with 4 rebounds. Kevin Durant scored 23 points (10–13 FG, 1–2 3PT, 2–4 FT) to go with 10 rebounds. Isaiah Thomas scored 18 points (4–12 FG, 1–6 3PT, 9–9 FT) to go with 2 rebounds. scored 17 points (7–15 FG, 2–4 3PT, 1–2 FT) to go with 10 rebounds. scored 16 points (7–20 FG, 2–10 3PT, 0–0 FT) to go with 3 rebounds. Terry Rozier scored 11 points (3–5 FG, 2–3 3PT, 3–4 FT) to go with 7 rebounds. The Golden State Warriors' next game will be at home against the , while the Boston Celtics will travel to play the Bulls.

NCP+CC: The Golden State Warriors defeated the Boston Celtics 104–88 at TD Garden on Friday. The Warriors (10–2) came into this game winners of five of their last six games, but the Warriors (6–6) were able to pull away in the second half. Klay Thompson led the way for the Warriors with 28 points on 12–of–21 shooting, while Kevin Durant added 23 points, 10 rebounds, seven assists and two steals. Stephen Curry added 16 points and eight assists, while Draymond Green rounded out the box score with 11 points, eight rebounds and eight assists. For the Celtics, it was Isaiah Thomas who shot just 4–of–12 from the field and finished with 18 points. Avery Bradley added 17 points and 10 rebounds, while the rest of the Celtics combined to score just seven points. Boston will look to get back on track as they play host to the 76ers on Friday.

Figure 4: Example output from TEMPL (top) and NCP+CC (bottom). Text that accurately reflects a record in the associated box or line score is in blue, erroneous text is in red.

Taken together, our results show that content planning improves data-to-text generation across metrics and systems. We find that NCP+CC overall performs best; however, there is a significant gap between automatically generated summaries and human-authored ones.

6 Conclusions

We presented a data-to-text generation model which is enhanced with content selection and planning modules. Experimental results (based on automatic metrics and judgment elicitation studies) demonstrate that generation quality improves both in terms of the number of relevant facts contained in the output text, and the order according to which these are presented. Positive side-effects of content planning are additional improvements in the grammaticality and conciseness of the generated text. In the future, we would like to learn more detail-oriented plans involving inference over multiple facts and entities. We would also like to verify our approach across domains and languages.

References

Angeli, G.; Liang, P.; and Klein, D. 2010. A simple domain-independent probabilistic approach to generation. In EMNLP, 502–512.

Bahdanau, D.; Cho, K.; and Bengio, Y. 2015. Neural machine translation by jointly learning to align and translate. In ICLR.

Barzilay, R., and Lapata, M. 2005. Collective content selection for concept-to-text generation. In EMNLP.

Belz, A. 2008. Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models. Nat. Lang. Eng. 14(4):431–455.

Chisholm, A.; Radford, W.; and Hachey, B. 2017. Learning to generate one-sentence biographies from Wikidata. In EACL, 633–642.

Dale, R. 1988. Generating Referring Expressions in a Domain of Objects and Processes. Ph.D. Dissertation, University of Edinburgh.

Dempster, A. P.; Laird, N. M.; and Rubin, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological) 1–38.

Duboue, P. A., and McKeown, K. R. 2001. Empirically estimating order constraints for content planning in generation. In ACL, 172–179.

Duboue, P. A., and McKeown, K. R. 2002. Content planner construction via evolutionary algorithms and a corpus-based fitness function. In Proceedings of the International Natural Language Generation Conference, 89–96.

Duboue, P. A., and McKeown, K. R. 2003. Statistical acquisition of content selection rules for natural language generation. In EMNLP.

Duchi, J.; Hazan, E.; and Singer, Y. 2011. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research 12(Jul):2121–2159.

Gatt, A., and Krahmer, E. 2018. Survey of the state of the art in natural language generation: Core tasks, applications and evaluation. JAIR 61:65–170.

Gu, J.; Lu, Z.; Li, H.; and Li, V. O. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In ACL, 1631–1640.

Gulcehre, C.; Ahn, S.; Nallapati, R.; Zhou, B.; and Bengio, Y. 2016. Pointing the unknown words. In ACL, 140–149.

Hovy, E. H. 1993. Automated discourse generation using discourse structure relations. Artificial Intelligence 63(1-2):341–385.

Karamanis, N. 2004. Entity Coherence for Descriptive Text Structuring. Ph.D. Dissertation, School of Informatics, University of Edinburgh.

Kim, J., and Mooney, R. 2010. Generative alignment and semantic parsing for learning from ambiguous supervision. In Coling 2010: Posters, 543–551.

Kiritchenko, S., and Mohammad, S. 2017. Best-worst scaling more reliable than rating scales: A case study on sentiment intensity annotation. In ACL, volume 2, 465–470.

Klein, G.; Kim, Y.; Deng, Y.; Senellart, J.; and Rush, A. M. 2017. OpenNMT: Open-source toolkit for neural machine translation. arXiv preprint arXiv:1701.02810.

Konstas, I., and Lapata, M. 2012. Unsupervised concept-to-text generation with hypergraphs. In NAACL, 752–761.

Konstas, I., and Lapata, M. 2013. A global model for concept-to-text generation. J. Artif. Int. Res. 48(1):305–346.

Kukich, K. 1983. Design of a knowledge-based report generator. In ACL.

Lebret, R.; Grangier, D.; and Auli, M. 2016. Neural text generation from structured data with application to the biography domain. In EMNLP, 1203–1213.

Liang, P.; Jordan, M.; and Klein, D. 2009. Learning semantic correspondences with less supervision. In ACL, 91–99.

Liu, T.; Wang, K.; Sha, L.; Chang, B.; and Sui, Z. 2017. Table-to-text generation by structure-aware seq2seq learning. arXiv preprint arXiv:1711.09724.

Louviere, J. J.; Flynn, T. N.; and Marley, A. A. J. 2015. Best-Worst Scaling: Theory, Methods and Applications. Cambridge University Press.

Luong, T.; Pham, H.; and Manning, C. D. 2015. Effective approaches to attention-based neural machine translation. In EMNLP, 1412–1421.

McKeown, K. R.; Pan, S.; Shaw, J.; Jordan, D. A.; and Allen, B. A. 1997. Language generation for multimedia healthcare briefings. In ANLP, 277–282.

McKeown, K. 1992. Text Generation. Cambridge University Press.

Mei, H.; Bansal, M.; and Walter, M. R. 2016. What to talk about and how? Selective generation using LSTMs with coarse-to-fine alignment. In NAACL, 720–730.

Mellish, C.; Knott, A.; Oberlander, J.; and O'Donnell, M. 1998. Experiments using stochastic search for text planning. In Natural Language Generation.

Mikolov, T.; Karafiát, M.; Burget, L.; Černocký, J.; and Khudanpur, S. 2010. Recurrent neural network based language model. In Eleventh Annual Conference of the International Speech Communication Association.

Orme, B. 2009. MaxDiff analysis: Simple counting, individual-level logit, and HB. Sawtooth Software.

Papineni, K.; Roukos, S.; Ward, T.; and Zhu, W.-J. 2002. BLEU: A method for automatic evaluation of machine translation. In ACL, 311–318.

Perez-Beltrachini, L., and Lapata, M. 2018. Bootstrapping generators from noisy data. arXiv preprint arXiv:1804.06385.

Reiter, E., and Dale, R. 1997. Building applied natural language generation systems. Nat. Lang. Eng. 3(1):57–87.

Reiter, E., and Dale, R. 2000. Building Natural Language Generation Systems. New York, NY: Cambridge University Press.

Robin, J. 1994. Revision-Based Generation of Natural Language Summaries Providing Historical Background. Ph.D. Dissertation, Columbia University.

Sha, L.; Mou, L.; Liu, T.; Poupart, P.; Li, S.; Chang, B.; and Sui, Z. 2017. Order-planning neural text generation from structured data. arXiv preprint arXiv:1709.00155.

Tanaka-Ishii, K.; Hasida, K.; and Noda, I. 1998. Reactive content selection in the generation of real-time soccer commentary. In ACL.

Vinyals, O.; Fortunato, M.; and Jaitly, N. 2015. Pointer networks. In NIPS, 2692–2700.

Wiseman, S.; Shieber, S.; and Rush, A. 2017. Challenges in data-to-document generation. In EMNLP, 2253–2263.

Wong, Y. W., and Mooney, R. 2007. Generation by inverting a semantic parser that uses statistical machine translation. In NAACL, 172–179.

Yang, Z.; Blunsom, P.; Dyer, C.; and Ling, W. 2017. Reference-aware language models. In EMNLP, 1851–1860.

Zaremba, W.; Sutskever, I.; and Vinyals, O. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329.
