arXiv:1909.02304v1 [cs.CL] 5 Sep 2019
Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time)

Heng Gong, Xiaocheng Feng, Bing Qin,∗ Ting Liu
Harbin Institute of Technology, China
{hgong, xcfeng, qinb, [email protected]
∗ Corresponding author.

Abstract

Although Seq2Seq models for table-to-text generation have achieved remarkable progress, modeling the table representation in one dimension is inadequate. This is because (1) a table consists of multiple rows and columns, which means that encoding a table should not depend only on a one-dimensional sequence or set of records, and (2) most tables are time series data (e.g. NBA game data, stock market data), which means that the description of the current table may be affected by its historical data. To address these problems, not only do we model each table cell considering other records in the same row, we also enrich the table's representation by modeling each table cell in the context of other cells in the same column and of its historical (time dimension) data. In addition, we develop a table cell fusion gate to combine the representations from the row, column and time dimensions into one dense vector according to the saliency of each dimension's representation. We evaluated our methods on ROTOWIRE, a benchmark dataset of NBA basketball games. Both automatic and human evaluation results demonstrate the effectiveness of our model, which improves BLEU by 2.66 over the strong baseline and outperforms the state-of-the-art model.

Tables:

  Team     POINTS  WINS  LOSSES  …
  Wizards  88      31    18      …
  Hornets  92      21    27      …

  Player                  PTS  AST  REB  …
  Wizards
  Paul Pierce             11   1    3    …
  Nene                    8    1    7    …
  Bradley Beal            18   1    11   …
  John Wall               16   10   1    …
  …                       …    …    …    …
  Kris Humphries          13   1    5    …
  Hornets
  Michael Kidd-Gilchrist  13   3    13   …
  Al Jefferson            18   1    12   …
  Gerald Henderson        17   5    2    …
  Brian Roberts           18   3    1    …
  …                       …    …    …    …
  Gary Neal               12   1    0    …

Baseline result (CC): The Charlotte Hornets ( 21 - 27 ) defeated the Washington Wizards ( 31 - 18 ) 92 - 88 on Wednesday … The Hornets were led by the duo of John Wall and Bradley Beal . Wall went 4 - for - 14 from the field and 1 - for - 4 from the three - point line to score a game - high of 16 point … Gerald Henderson had a solid showing as well , finishing with 17 points ( 6 - 13 FG , 1 - 2 3Pt , 4 - 4 FT ) and five assists . It was his second double - double in a row…

Gold: The Charlotte Hornets ( 21 - 27 ) defeated the Washington Wizards ( 31 - 18 ) 92 - 88 on Monday … The Hornets were led by Al Jefferson in this game , who went 9 - for - 19 from the floor to score 18 points ... It was the second time in the last three games he 's posted a double - double , while the two steals matched a season - high for the center … Beal has turned it on over his last two games , combining for 44 points and 14 rebounds ... This double - double marked the second in a row for Wall , who 's combined for 44 points and 22 assists over his last two games …

Figure 1: Generated example on ROTOWIRE by using Conditional Copy (CC) as baseline (Wiseman et al., 2017). Text that accurately reflects records in the table is in red, and text that contradicts the records is in blue.

1 Introduction

Table-to-text generation is an important and challenging task in natural language processing, which aims to produce a summary of a numerical table (Reiter and Dale, 2000; Gkatzia, 2016). The related methods can be empirically divided into two categories: pipeline models and end-to-end models. The former consist of content selection, document planning and realisation, and were used mainly in early industrial applications such as weather forecasting and medical monitoring. The latter generate text directly from the table through a standard neural encoder-decoder framework to avoid error propagation, and have achieved remarkable progress. In this paper, we focus on how to improve the performance of neural methods on table-to-text generation.

Recently, ROTOWIRE, which provides tables of NBA players' and teams' statistics with a descriptive summary, has drawn increasing attention from the academic community. Figure 1 shows an example of parts of a game's statistics and its corresponding computer-generated summary. We can see that the tables have a formal structure including table row headers, table column headers and table cells. "Al Jefferson" is a table row header that represents a player, "PTS" is a table column header indicating that the column contains players' scores, and "18" is the value of the table cell, that is, Al Jefferson scored 18 points. Several related models have been proposed. They typically encode the table's records separately or as a long sequence and generate a long descriptive summary with a standard Seq2Seq decoder with some modifications. Wiseman et al. (2017) explored two types of copy mechanism and found that the conditional copy model (Gulcehre et al., 2016) performs better. Puduppully et al. (2019) enhanced content selection ability by explicitly selecting and planning relevant records. Li and Wan (2018) improved the precision of describing data records in the generated texts by first generating a template and then filling in slots via a copy mechanism. Nie et al. (2018) utilized results from pre-executed operations to improve the fidelity of generated texts.

However, we claim that encoding tables as sets of records or as a long sequence is not suitable, for two reasons. (1) The table consists of multiple players and different types of information, as shown in Figure 1. The earlier encoding approaches only considered the table as sets of records or a one-dimensional sequence, which loses the information of the other (column) dimension. (2) The table cells consist of time-series data which change over time. That is to say, historical data can sometimes help the model select content. Moreover, when a human writes a basketball report, he will not only focus on the players' outstanding performance in the current match, but also summarize players' performance in recent matches. Consider Figure 1 again. Not only does the gold text mention Al Jefferson's great performance in this match, it also states that "It was the second time in the last three games he's posted a double-double". The gold text also summarizes John Wall's "double-double" performance in a similar way. Summarizing a player's performance in recent matches requires modeling each table cell with respect to its historical data (time dimension), which is absent in the baseline model. Although the baseline model Conditional Copy (CC) tries to produce such a summary for Gerald Henderson, it clearly produces a wrong statement, since he did not get a "double-double" in this match.

To address the aforementioned problems, we present a hierarchical encoder to simultaneously model row, column and time dimension information. In detail, our model is divided into three layers. The first layer is used to learn the representation of the table cell. Specifically, we employ three self-attention models to obtain three representations of the table cell in its row, column and time dimension. [...] important representation from those three dimensions and combine them into a dense vector. In the third layer, we use mean pooling to merge the previously obtained table cell representations in the same row into the representation of the table's row. Then, we use self-attention with a content selection gate (Puduppully et al., 2019) to filter out unimportant rows' information. To the best of our knowledge, this is the first work on neural table-to-text generation that models column and time dimension information. We conducted experiments on ROTOWIRE. Results show that our model outperforms existing systems, improving baseline BLEU from 14.19 to 16.85 (+18.75%), P% of relation generation (RG) from 74.80 to 91.46 (+22.27%), F1% of content selection (CS) from 32.49 to 41.21 (+26.84%) and content ordering (CO) from 15.42 to 20.86 (+35.28%) on the test set. It also exceeds the state-of-the-art model in terms of those metrics.

2 Preliminaries

2.1 Notations

The input to the model is a set of tables $S = \{s_1, s_2, s_3\}$. $s_1$, $s_2$ and $s_3$ contain records about the home team players' performance, the visiting team players' performance and the teams' overall performance, respectively. We regard each cell in the table as a record. Each record $r$ consists of four types of information: a value $r.v$ (e.g. 18), an entity $r.e$ (e.g. Al Jefferson), a type $r.c$ (e.g. POINTS) and a feature $r.f$ (e.g. visiting), which indicates whether a player or a team competes on its home court or not. Each player or team takes one row in the table and each column contains a type of record such as points, assists, etc. Also, tables contain the date when the match happened, and we let $k$ denote the date of the record. We also create timelines for records. The details of timeline construction are described in Section 2.2. For simplicity, we omit the table id $l$ and the record date $k$ in the following sections and let $r_{i,j}$ denote the record in the $i$-th row and $j$-th column of the table. We assume the records come from the same table and $k$ is the date of the mentioned record. Given this information, the model is expected to generate text $y = (y_1, \ldots, y_t, \ldots, y_T)$
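To make the record notation of Section 2.1 concrete, the following is a minimal Python sketch of a record $r$ with its four fields and of a table indexed as $r_{i,j}$. It is an illustration only, not the paper's implementation: the names `Record`, `build_table`, `row_context` and `column_context` are hypothetical, the statistics are the Hornets rows from Figure 1, and the "home" feature value is an assumption, since the excerpt does not say which team played at home.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """One table cell r (field names are illustrative, not from the paper)."""
    value: int    # r.v, e.g. 18
    entity: str   # r.e, e.g. "Al Jefferson"
    rtype: str    # r.c, e.g. "PTS"
    feature: str  # r.f, e.g. "home" or "visiting"


# table[i][j] plays the role of r_{i,j}: row i is an entity, column j a type.
Table = List[List[Record]]


def build_table(entities, types, values, feature) -> Table:
    """Turn a grid of raw values into a grid of records."""
    return [
        [Record(v, entity, rtype, feature) for rtype, v in zip(types, row)]
        for entity, row in zip(entities, values)
    ]


def row_context(table: Table, i: int, j: int) -> List[Record]:
    """Other records in the same row as r_{i,j} (same entity, other types)."""
    return [r for jj, r in enumerate(table[i]) if jj != j]


def column_context(table: Table, i: int, j: int) -> List[Record]:
    """Other records in the same column as r_{i,j} (same type, other entities)."""
    return [row[j] for ii, row in enumerate(table) if ii != i]


# Three Hornets rows from Figure 1; "home" is an assumed feature value.
hornets = build_table(
    entities=["Michael Kidd-Gilchrist", "Al Jefferson", "Gerald Henderson"],
    types=["PTS", "AST", "REB"],
    values=[[13, 3, 13], [18, 1, 12], [17, 5, 2]],
    feature="home",
)

r = hornets[1][0]  # Al Jefferson's PTS record
print(r.entity, r.rtype, r.value)
print([rec.value for rec in row_context(hornets, 1, 0)])     # same row
print([rec.value for rec in column_context(hornets, 1, 0)])  # same column
```

The two helper functions only enumerate which records share a row or a column with a given cell. The time dimension is left out because it depends on the timeline construction of Section 2.2.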