
AGGGEN: Ordering and Aggregating while Generating

Xinnuo Xu, Ondřej Dušek, Verena Rieser, Yannis Konstas

Highlights of AGGGEN

• Problem: hallucination and omission in data-to-text generation
• Approach: HMM Sentence Planner + Transformer data-to-text generator
• Architecture Pros:
  • End-to-end learning: jointly learn to plan and generate
  • No extra labels are required
  • Interpretable: directly evaluate model’s planning
  • Controllable: direct access to manually control the planning component
• Performance:
  • Generation Evaluation: outperforms the baselines on most surface metrics
  • Factual Correctness: The Slot Error Rate is the best among all models

Data-to-Text Generation

Input DBpedia Triples
William Anders    dateOfRetirement    1969-09-01
Apollo 8          Commander           Frank Borman
William Anders    member_of           Apollo 8
Apollo 8          backup_pilot        Buzz Aldrin
Apollo 8          Operator            NASA

Human-authored Text
William Anders, who retired on September 1st, 1969, was a crew member on Apollo 8 and served under commander Frank Borman. Apollo 8 was operated by NASA with Buzz Aldrin as backup pilot.

Neural E2E Data-to-Text Generation

Input Triples
William Anders    member_of    Apollo 8
Apollo 8          Commander    Frank Borman

Pipeline: Input Triples → Encoder → Autoregressive Decoder → Generated Text

Target Text: William Anders is a crew member on Apollo 8 and served under commander Frank Borman.
Generated Text: William Anders is the commander (Hallucination)

Neural E2E Data-to-Text Generation

Target Text: William Anders is a crew member on Apollo 8 and served under commander Frank Borman.
Generated Text: William Anders is a crew member on Apollo 8. [EOS] (Omission)

Traditional Data-to-Text Generation

Input Triples
William Anders    member_of    Apollo 8
Apollo 8          Commander    Frank Borman

Sentence planning using Rhetorical Structure Theory based Grammar

Document D
  SENT(mem_of, comm)
    Record(mem_of.type)
      Fields(mem_of, start): F(mem_of, arg0) F(mem_of, pred) F(mem_of, arg1)
    Record(comm.type)
      Fields(comm, start): F(comm, pred) F(comm, arg1)

Target Text
[William Anders] [is a crew member] [on Apollo 8] [and served under commander] [Frank Borman.]

(Konstas and Lapata, 2013)

AGGGEN

Figure labels: Traditional Data-to-Text Generation, Neural E2E Generation, Sentence Planning

• End-to-end training
• No extra labels required
• Interpretable
• Controllable
• Factually Correct
• Fluent

Presentation Timeline

• Example
• Model
• Learning
• Inference
• Generation Evaluation
• Factuality Evaluation
• Study of Planning
• Controllability

Example

Input Triples
William Anders    dateOfRetirement    1969-09-01
Apollo 8          Commander           Frank Borman
William Anders    member_of           Apollo 8
Apollo 8          backup_pilot        Buzz Aldrin
Apollo 8          Operator            NASA

Sentence Planning: Ordering, Aggregating

Generated Text
William Anders retired on 1969-09-01. He was a crew member of nasa 's Apollo 8. Frank Borman was a commander with Buzz Aldrin as the backup pilot.

Example

Input Triples x (the five triples above)

Fact (t-1), predicates member_of and Operator: He was a crew member of nasa 's Apollo 8.
Fact t, predicates backup_pilot and Commander: Frank B. was a commander with Buzz A. as the backup pilot.

Model: Overview

Input Triples x, Predicates q

HMM
$$p(z_{1:T}, y_{1:T} \mid x) = p(z_{1:T} \mid x) \cdot p(y_{1:T} \mid z_{1:T}, x)$$
The first factor is the Transition distribution, the second is the Emission distribution.

Latent states: $z_{t-1}$ = (member_of, Operator), $z_t$ = (backup_pilot, Commander)
Observations: $y_{t-1}$ = "He was a crew member of nasa 's Apollo 8.", $y_t$ = "Frank B. was a commander with Buzz A. as the backup pilot."

Model: Transition Distribution

$$p(z_{1:T} \mid x) = p(z_1 \mid x) \prod_{t=2}^{T} p(z_t \mid z_{t-1}, x)$$

• Traditional HMM: One latent state emits one observation.
• AGGGEN: One latent state is a chain of latent variables that emit one observation.

Figure labels: state $z_{t-1}$ is the chain $o^1_{t-1}, o^2_{t-1}$ (member_of, Operator) and state $z_t$ is the chain $o^1_t, o^2_t$ (backup_pilot, Commander). The two states are linked by $p(z_t \mid z_{t-1}, x)$ and emit $y_{t-1}$ and $y_t$.
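The chain factorization of the transition distribution can be sketched in a few lines. This is a minimal illustration, not the AGGGEN implementation: it treats each latent state as an atomic label with a fixed lookup table, whereas in the model a state is a chain of predicates and the probabilities depend on the input x. The function name `sequence_log_prob` and all numbers are hypothetical.

```python
import math

def sequence_log_prob(states, log_init, log_trans):
    """log p(z_1:T | x) = log p(z_1 | x) + sum over t=2..T of log p(z_t | z_{t-1}, x)."""
    total = log_init[states[0]]
    for prev, cur in zip(states, states[1:]):
        total += log_trans[prev][cur]
    return total

# Hypothetical toy numbers (not from the paper): two latent states "A" and "B".
log_init = {"A": math.log(0.6), "B": math.log(0.4)}
log_trans = {
    "A": {"A": math.log(0.3), "B": math.log(0.7)},
    "B": {"A": math.log(0.5), "B": math.log(0.5)},
}

# Probability of the state sequence A -> B -> B is 0.6 * 0.7 * 0.5.
plan_log_prob = sequence_log_prob(["A", "B", "B"], log_init, log_trans)
```

Working in log space keeps long products numerically stable, which is why the sketch sums log-probabilities instead of multiplying probabilities.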

Model: Emission Distribution

$$p(y_{1:T} \mid z_{1:T}, x) = \prod_{t=1}^{T} p(y_t \mid z_t, x)$$

• Probability of generating a fact
• Conditioned on current latent state and the input triples
• The product over token-level probabilities
• Modeled as a Transformer encoder-decoder

Figure labels: $z_{t-1}$ emits $y_{t-1}$ with $p(y_{t-1} \mid z_{t-1}, x)$, and $z_t$ emits $y_t$ with $p(y_t \mid z_t, x)$.
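The two levels of product in the emission distribution (over facts, and over tokens within a fact) can be sketched as follows. The per-token probabilities here are hypothetical stand-ins for what a Transformer decoder would output; the function names are illustrative and not taken from the AGGGEN code.

```python
import math

def fact_log_prob(token_probs):
    """log p(y_t | z_t, x) as the sum of token-level log-probabilities."""
    return sum(math.log(p) for p in token_probs)

def emission_log_prob(facts_token_probs):
    """log p(y_1:T | z_1:T, x) = sum over facts t of log p(y_t | z_t, x)."""
    return sum(fact_log_prob(token_probs) for token_probs in facts_token_probs)

# Hypothetical token probabilities for two facts (2 tokens and 3 tokens).
facts = [[0.9, 0.8], [0.5, 0.5, 0.4]]
total_log_prob = emission_log_prob(facts)
```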

Learning: backward algorithm

Optimization Goal
$$\arg\max_{\theta} \; p(y_{1:T} \mid x)$$

• Learning of parameters: Transition matrix and parameters in the Transformer
• Goal: Maximize the conditional marginal likelihood over all the latent states z and o
• One unified framework and end-to-end training

Figure labels: Future Evidence $\beta_t(C)$, transition $p(z_t = C \mid z_{t-1} = C', x)$, emission $p(y_t \mid z_t = C, x)$, with states $C'$ = (dateOfRetire) and $C$ = (member_of, Operator).
Observations: $y_{t-1}$ = "William Anders retired…", $y_t$ = "He was a crew member of…", $y_{t+1}$ = "Frank B. was a commander with Buzz…"
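A minimal sketch of how a backward recursion yields the marginal likelihood p(y_1:T | x), assuming a small set of atomic latent states and fixed probability tables. The parameters are hypothetical toy values; in AGGGEN the emission terms would come from the Transformer and the whole quantity would be maximized by gradient-based training, which is not shown here. A brute-force sum over all state sequences is included only to check the recursion.

```python
from itertools import product

def backward_likelihood(init, trans, emit):
    """Marginal p(y_1:T | x) via the backward recursion.

    init[c]     : p(z_1 = c | x)
    trans[p][c] : p(z_t = c | z_{t-1} = p, x)
    emit[t][c]  : p(y_t | z_t = c, x) for the observed fact y_t
    """
    n_states, T = len(init), len(emit)
    beta = [1.0] * n_states  # beta_T(c) = 1 for every state c
    for t in range(T - 2, -1, -1):
        # beta_t(c) = sum_n p(n | c) * p(y_{t+1} | n) * beta_{t+1}(n)
        beta = [
            sum(trans[c][n] * emit[t + 1][n] * beta[n] for n in range(n_states))
            for c in range(n_states)
        ]
    return sum(init[c] * emit[0][c] * beta[c] for c in range(n_states))

def brute_force_likelihood(init, trans, emit):
    """Same quantity, summing explicitly over every latent state sequence."""
    n_states, T = len(init), len(emit)
    total = 0.0
    for seq in product(range(n_states), repeat=T):
        p = init[seq[0]] * emit[0][seq[0]]
        for t in range(1, T):
            p *= trans[seq[t - 1]][seq[t]] * emit[t][seq[t]]
        total += p
    return total

# Hypothetical toy parameters: 2 latent states, 3 observed facts.
init = [0.6, 0.4]
trans = [[0.3, 0.7], [0.5, 0.5]]
emit = [[0.2, 0.1], [0.05, 0.3], [0.4, 0.25]]
likelihood = backward_likelihood(init, trans, emit)
```

The recursion costs time linear in T, while the brute-force sum grows exponentially, which is the practical reason for using dynamic programming here.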

Inference

Input Triples
William Anders    dateOfRetirement    1969-09-01
Apollo 8          Commander           Frank Borman
William Anders    member_of           Apollo 8
Apollo 8          backup_pilot        Buzz Aldrin
Apollo 8          Operator            NASA

Planning: Input Ordering
• Order input predicates
• Top-k most likely orderings
• Beam-search
• Calculated by the transition matrix

Example ordering: $o^1$ = dateOfRetirement, $o^2$ = member_of, $o^3$ = Operator, $o^4$ = backup_pilot, $o^5$ = Commander
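The ordering step (top-k orderings found by beam search and scored by transition probabilities) can be sketched as below. This is an illustration under simplifying assumptions: a first-order score between consecutive predicates, three predicates instead of five, and hypothetical probabilities. `top_k_orderings` is not a function from the AGGGEN repository.

```python
import math

def top_k_orderings(predicates, log_start, log_trans, k=2):
    """Beam search over permutations of `predicates`, scored by a first-order chain."""
    beams = sorted(
        ((log_start[p], [p]) for p in predicates),
        key=lambda beam: beam[0],
        reverse=True,
    )[:k]
    for _ in range(len(predicates) - 1):
        candidates = []
        for score, order in beams:
            for p in predicates:
                if p not in order:  # each predicate is placed exactly once
                    candidates.append((score + log_trans[order[-1]][p], order + [p]))
        beams = sorted(candidates, key=lambda beam: beam[0], reverse=True)[:k]
    return beams

# Hypothetical scores for three of the example predicates.
predicates = ["dateOfRetirement", "member_of", "Operator"]
log_start = {
    "dateOfRetirement": math.log(0.7),
    "member_of": math.log(0.2),
    "Operator": math.log(0.1),
}
log_trans = {
    "dateOfRetirement": {"member_of": math.log(0.8), "Operator": math.log(0.2)},
    "member_of": {"dateOfRetirement": math.log(0.1), "Operator": math.log(0.9)},
    "Operator": {"dateOfRetirement": math.log(0.5), "member_of": math.log(0.5)},
}
best = top_k_orderings(predicates, log_start, log_trans, k=2)
```

With these toy scores the highest-scoring beam is dateOfRetirement, member_of, Operator. A larger k trades extra computation for a lower chance of pruning the best ordering early.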

Planning: Input Aggregation
• Top-n most likely aggregations
• Binary state for each predicate
• All possible combinations of the binary state sequence
• Rank by the transition matrix

Example aggregation
Predicate           Binary state    Latent state
dateOfRetirement    emit            z1
member_of           wait            z2
Operator            emit            z2
backup_pilot        wait            z3
Commander           emit            z3
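The emit/wait encoding can be sketched as follows: "wait" keeps the current sentence open and "emit" closes it. The sketch assumes that the last predicate always emits, and it only enumerates the candidate aggregations; ranking them by the transition matrix is left out. The optional `max_per_sentence` cap is a hypothetical parameter that mirrors the "aggregate at most two triples" setting mentioned under Controllability.

```python
from itertools import product

def group_by_states(ordered_predicates, states):
    """Split an ordered predicate list into sentences using emit/wait states."""
    sentences, current = [], []
    for predicate, state in zip(ordered_predicates, states):
        current.append(predicate)
        if state == "emit":  # close the sentence after this predicate
            sentences.append(current)
            current = []
    return sentences

def all_aggregations(ordered_predicates, max_per_sentence=None):
    """Enumerate every emit/wait sequence (last state fixed to 'emit')."""
    plans = []
    for prefix in product(["emit", "wait"], repeat=len(ordered_predicates) - 1):
        plan = group_by_states(ordered_predicates, list(prefix) + ["emit"])
        if max_per_sentence is None or all(len(s) <= max_per_sentence for s in plan):
            plans.append(plan)
    return plans

order = ["dateOfRetirement", "member_of", "Operator", "backup_pilot", "Commander"]
slide_plan = group_by_states(order, ["emit", "wait", "emit", "wait", "emit"])
```

For five ordered predicates there are 2^4 = 16 candidate aggregations, so exhaustive enumeration is cheap at this scale.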

Text Generation
• Generates a text description
• Conditioned on planning results
• Beam-search
• Mask out unrelated triples

Figure labels: the plan z1, z2, z3 over the ordered predicates dateOfRetirement, member_of, Operator, backup_pilot, Commander; example output $y_t$ = "He was a crew member of…"
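One way to picture "mask out unrelated triples" is a boolean mask over the input triples that keeps only those whose predicate belongs to the current latent state. This is a sketch of the selection logic only; how the mask is applied inside the Transformer (for example as an attention mask) is not stated on the slide, and `triple_mask` is a hypothetical helper.

```python
def triple_mask(triples, state_predicates):
    """True for triples whose predicate is in the current latent state."""
    wanted = set(state_predicates)
    return [predicate in wanted for (_subject, predicate, _object) in triples]

triples = [
    ("William Anders", "dateOfRetirement", "1969-09-01"),
    ("Apollo 8", "Commander", "Frank Borman"),
    ("William Anders", "member_of", "Apollo 8"),
    ("Apollo 8", "backup_pilot", "Buzz Aldrin"),
    ("Apollo 8", "Operator", "NASA"),
]

# Latent state covering member_of and Operator, as in the running example.
mask = triple_mask(triples, ["member_of", "Operator"])
visible = [triple for triple, keep in zip(triples, mask) if keep]
```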

Generation Evaluation

• Fluent: AGGGEN outperforms the baselines on most surface metrics

Model          BLEU     TER     METEOR
Transformer    58.47    0.37    0.42
AGGGEN         58.74    0.40    0.43

Results on the WebNLG (seen).

Model          BLEU     NIST     MET      R-L      CIDEr
Transformer    38.57    5.756    35.92    55.45    1.668
AGGGEN         41.06    6.207    37.91    55.13    1.844

Evaluation of Generation trained on the original E2E data, while tested on the cleaned E2E data.

Full results with more baselines can be found in Tables 1, 2, and 3 in the paper.

Factuality Evaluation

• Factually Correct: The SER of AGGGEN is the best among all models

Model          Add     Miss    Wrong    SER
Transformer    0.30    4.67    0.20     5.16
AGGGEN         0.32    1.66    0.71     2.71

Evaluation of Factual correctness trained and tested on the original E2E data.
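For readers unfamiliar with the metric, a slot error rate can be sketched as the share of slots that are added, missed, or wrong. The formula below is an assumption based on how the metric is commonly defined; the slide does not spell it out, and the counts are hypothetical rather than taken from the table.

```python
def slot_error_rate(added, missing, wrong, total_slots):
    """Assumed definition: (added + missing + wrong) / total number of slots."""
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    return (added + missing + wrong) / total_slots

# Hypothetical counts: 1 added, 2 missing and 1 wrong slot out of 40.
ser = slot_error_rate(added=1, missing=2, wrong=1, total_slots=40)
```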

Full results with more baselines can be found in Table 2 in the paper

Study of Planning

• Interpretable: The discrete latent space is interpretable and allows direct evaluation of the AGGGEN planning
• NMI (Normalized Mutual Information) to evaluate aggregation

Sentence Plan 1: (dateOfRetirement) (member_of, Operator) (backup_pilot, Commander)
Sentence Plan 2: (member_of) (backup_pilot) (Commander) (dateOfRetirement, Operator)
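A small sketch of NMI between two aggregations, where each plan assigns every predicate to a sentence id and only the grouping matters, not the id values. The normalization by the arithmetic mean of the two entropies is an assumption; the slides do not say which variant the paper uses. The two label vectors follow the predicate order dateOfRetirement, member_of, Operator, backup_pilot, Commander from the running example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two label assignments over the same items."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in count_ab.items()
    )

def nmi(a, b):
    """NMI normalized by the arithmetic mean of the entropies (assumed variant)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 and h_b == 0:
        return 1.0  # both plans put everything in one sentence
    return mutual_information(a, b) / ((h_a + h_b) / 2)

# Sentence ids per predicate for the two example plans.
plan_1 = [1, 2, 2, 3, 3]
plan_2 = [4, 1, 4, 2, 3]
score = nmi(plan_1, plan_2)
```

Because NMI ignores the order of sentences, it evaluates aggregation only, which matches how the slide describes its role.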

• Kendall’s tau (K-τ) to evaluate both ordering and aggregation

Sentence Plan 1
Predicate           Rank (sentence position)
dateOfRetirement    1
member_of           2
Operator            2
backup_pilot        3
Commander           3

Sentence Plan 2
Predicate           Rank (sentence position)
member_of           1
backup_pilot        2
Commander           3
dateOfRetirement    4
Operator            4
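Kendall's tau between the two example plans can be sketched by counting concordant and discordant predicate pairs. The tie handling is an assumption: this sketch uses the tau-a convention, in which a pair tied in either plan counts as neither concordant nor discordant, and the slides do not state which variant the paper uses.

```python
from itertools import combinations

def kendall_tau_a(plan_a, plan_b):
    """Tau-a over predicate pairs; each plan maps predicate -> sentence position."""
    pairs = list(combinations(sorted(plan_a), 2))
    concordant = discordant = 0
    for p, q in pairs:
        sign = (plan_a[p] - plan_a[q]) * (plan_b[p] - plan_b[q])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

plan_1 = {"dateOfRetirement": 1, "member_of": 2, "Operator": 2,
          "backup_pilot": 3, "Commander": 3}
plan_2 = {"member_of": 1, "backup_pilot": 2, "Commander": 3,
          "dateOfRetirement": 4, "Operator": 4}
tau = kendall_tau_a(plan_1, plan_2)
```

For these two plans, 2 of the 10 pairs are concordant and 5 are discordant, so tau is (2 - 5) / 10 = -0.3. Unlike NMI, this score is sensitive to where each sentence is placed, which is why it reflects ordering as well as aggregation.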

Sentence Plan 1 and Sentence Plan 2, aligned to the same predicate order:

Predicate           Plan 1    Plan 2
dateOfRetirement    1         4
member_of           2         1
Operator            2         4
backup_pilot        3         2
Commander           3         3

Planning evaluation setup: each set of Input Triples has three reference texts (Reference text 1, 2, 3), each with its own reference plan (Reference plan 1, 2, 3). NMIavg and K-τavg are computed among the reference plans, and between the sentence plan of the generated text and the reference plans.

Agreement among                  NMIavg    K-τavg
Three reference plans            0.76      0.25
Model’s and reference plans      0.62      0.21

Planning Evaluation Results. NMI and K-tau calculated between human-written references (top), and between references and our system AGGGEN (bottom).

Controllability

Input Triples
William Anders    dateOfRetirement    1969-09-01
Apollo 8          Commander           Frank Borman
William Anders    member_of           Apollo 8
Apollo 8          backup_pilot        Buzz Aldrin
Apollo 8          Operator            NASA

Human-written Plans
[member_of] [Operator] [backup_pilot] [Commander] [dateOfRetirement]

Generated Text
William Anders served as a crew member on Apollo 8 operated by nasa. The backup pilot was Buzz Aldrin. Frank Borman was also an Apollo 8 commander. William Anders retired on September 1st, 1969.

Hyperparameter: Aggregate at most two triples (Ordering, Aggregating)

Generated Text (same input triples)
William Anders served as a crew member on nasa ’s Apollo 8. The backup pilot was Buzz Aldrin. Frank Borman was also an Apollo 8 commander. William Anders retired on September 1st, 1969.

Thanks for watching

Code is available at https://github.com/XinnuoXu/AggGen