Using Discourse Information for Paraphrase Extraction

Michaela Regneri & Rui Wang
Saarland University, DFKI GmbH (Saarbrücken, Germany)
EMNLP-CoNLL 2012, Jeju, Korea

Paraphrase Resources

- ...are important (RTE, Machine Translation, Question Answering, ...)
- many approaches create paraphrase resources from monolingual parallel corpora
- hardly any approach exploits discourse information
- we show that discourse information helps to extract sentential paraphrases and phrase-level paraphrase fragments

Paraphrasing & Discourse Knowledge

Once he goes, Foreman asks to take over as head of diagnostics.  |  When [...] leaves, Foreman pushes for his job.
Cuddy agrees to give him one chance to prove himself.  |  She gives Foreman one shot.
Foreman meets with Thirteen and [...].  |  Foreman, Hadley, and Taub get the conference room ready and Foreman explains that he'll be in charge.

- distributional hypothesis applied to sentences & discourse context
- coreference resolution

Outline

- Paraphrasing & Discourse Knowledge √
- System Overview
- Evaluation

System Overview

[Pipeline diagram]
recaps of House M.D. (parallel corpus with parallel discourse structures)
  Discourse Information: + Multiple Sequence Alignment + semantic similarity
  -> sentence-level paraphrases: "The psychiatrist suggests him to get a hobby" / "Nolan tells House to take up a hobby."
  + word alignments + coreference resolution + dependency trees
  -> paraphrase fragments: "get a hobby" / "take up a hobby"

A Parallel Corpus

- different summaries of House MD episodes
- entirely parallel discourse structure (linear sequential order, like events on screen)
- intermediate length, lots of sources on the web
- We’re working on Season 6: 20 episodes x 8 recaps (14735 sentences)
- easy to extend (2 hours for data collection)
- Preprocessing: sentence splitting, parsing

Sequence Alignment

- Sequence Alignment arranges two sequences so as to align as many similar (equal) elements as possible
- compute the alignment with the lowest cost, given costs / scores for
  - gap introduction
  - matching two items
- Multiple Sequence Alignment (MSA) generalizes this task for arbitrarily many sequences

[Figure: two sequences and their alignment, with gaps]
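
The cost-based formulation above can be sketched as the classic dynamic program for pairwise alignment (Needleman-Wunsch). This is a minimal illustration and not the system's implementation: the function name `align`, the `score` callback and the default gap value are assumptions made for this example. Scores are maximized here, which is equivalent to minimizing cost with negated values.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def align(a: Sequence[str], b: Sequence[str],
          score: Callable[[str, str], float],
          gap: float = -1.0) -> List[Pair]:
    """Globally align two sequences; None marks a gap.

    `score` and `gap` are illustrative parameters (assumptions).
    """
    n, m = len(a), len(b)
    # best[i][j] = best total score for aligning a[:i] with b[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = best[i - 1][0] + gap
    for j in range(1, m + 1):
        best[0][j] = best[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # match two items
                best[i - 1][j] + gap,                            # gap in b
                best[i][j - 1] + gap,                            # gap in a
            )
    # Trace back from the bottom-right cell to recover the alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and best[i][j] == best[i - 1][j - 1] + score(a[i - 1], b[j - 1])):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif j == 0 or (i > 0 and best[i][j] == best[i - 1][j] + gap):
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def exact_match(x: str, y: str) -> float:
    """Toy score function: reward equal items, penalize different ones."""
    return 1.0 if x == y else -1.0


result = align(list("ABCD"), list("ABD"), exact_match)
```

Multiple Sequence Alignment extends this idea to more than two sequences; the sketch covers only the pairwise case.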

Sentence Matching with MSA (cf. Regneri et al. 2010)

- recaps = sequences of sentences
- alignment score for two sentences = vector-based semantic similarity
- constant gap costs
- aligned sentences = paraphrases
- high context similarity + high semantic similarity = alignment

[Figure: sequential discourse information + semantic sentence similarity -> MSA with paraphrases]

recap 1       | recap 2       | recap 3
sentence 1.1  | ∅             | ∅
sentence 1.2  | sentence 2.1  | sentence 3.1
∅             | ∅             | sentence 3.2
sentence 1.3  | ∅             | sentence 3.3
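
To illustrate what a vector-based score for two sentences could look like, the sketch below computes cosine similarity over simple bag-of-words count vectors. This is an assumption for illustration only: the slides do not specify the vector model, and `cosine_similarity` with whitespace tokenization is a hypothetical stand-in for the actual similarity measure.

```python
import math
from collections import Counter


def cosine_similarity(s1: str, s2: str) -> float:
    """Cosine similarity of bag-of-words count vectors (illustrative only)."""
    v1 = Counter(s1.lower().split())
    v2 = Counter(s2.lower().split())
    # Dot product over the words that occur in both sentences.
    dot = sum(v1[w] * v2[w] for w in v1.keys() & v2.keys())
    norm = (math.sqrt(sum(c * c for c in v1.values()))
            * math.sqrt(sum(c * c for c in v2.values())))
    return dot / norm if norm else 0.0


# Score for one sentence pair from the recaps; higher means more similar.
score = cosine_similarity(
    "She gives Foreman one shot",
    "Cuddy agrees to give him one chance to prove himself",
)
```

A function of this shape can serve as the match score in a sequence alignment, with a constant value for gaps.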

Sample Results of the MSA

[Alignment table with columns recap 1, recap 2, recap 3, recap 4. Each group below is one row of aligned sentences, listed left to right; rows with fewer than four sentences contain gaps.]

Row:
- Foreman insists he deserves a chance and Cuddy gives in, warning him he gets one shot.
- She gives Foreman one shot.
- Cuddy agrees to give him one chance to prove himself.

Row:
- Foreman meets with Thirteen and Chris Taub.

Row:
- They decide that it might be CRPS and Foreman orders a spinal stimulation.

Row:
- Vince disagrees, checks on the Internet, and suggests mercury poisoning brought on by the sushi he eats constantly.
- Thirteen and Taub go to see the patient, who thinks he has mercury poisoning from eating too much fish.
- He suggests they give him a blood test for mercury poisoning.
- The millionaire has checked on the Internet and believes that he has mercury poisoning caused by sushi.

Row:
- Foreman is upset Thirteen and Taub did the blood test (which does not reveal any poisoning) without consulting him.
- He argues that his symptoms don't match up exactly with CRPS and asks them to give him a blood test for heightened mercury levels.
- He asks them to run one blood test to check for mercury.
- He's also researching his case on the internet and asks for a blood test to rule out the diagnosis.

Paraphrase Fragments

- Most aligned sentence pairs overlap, but they don’t cover exactly the same content
- We want to extract smaller sentence parts (of different sizes) that match
- Test advantages from Coreference Resolution

He argues that his symptoms don't match up exactly with CRPS and asks them to give him a blood test for heightened mercury levels.  |  He asks them to run one blood test to check for mercury.

Basic Fragment Extraction (cf. Wang & Callison-Burch 2011)

- aligned recaps as parallel corpora for Machine Translation (“translate” EN -> EN)
- compute word alignments for aligned sentences (Giza++)
- a fragment pair is a sequence of aligned word pairs
- do smoothing & different heuristics to determine fragment boundaries (-> minimal enclosing chunks)
- discard trivial fragments

[Figure: sentence alignments between recap pairs, and word alignments between "Vince tells them to give him a blood test for heightened mercury levels." and "He asks them to run a blood test to check for mercury."]
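
The idea of a fragment pair as a sequence of aligned word pairs can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Wang & Callison-Burch: the word alignment links are taken as given (for example from a word aligner), `max_gap` is a hypothetical smoothing parameter that tolerates unaligned source words inside a fragment, and the chunk-based boundary heuristics are not modelled.

```python
from typing import Iterable, List, Sequence, Tuple


def extract_fragments(src: Sequence[str], tgt: Sequence[str],
                      links: Iterable[Tuple[int, int]],
                      max_gap: int = 1,
                      min_len: int = 2) -> List[Tuple[str, str]]:
    """Group word alignment links into fragment pairs (illustrative heuristic).

    links: (source index, target index) pairs of aligned words.
    max_gap: unaligned source words tolerated inside one fragment (assumption).
    min_len: fragments shorter than this on either side count as trivial.
    """
    runs: List[List[Tuple[int, int]]] = []
    current: List[Tuple[int, int]] = []
    for i, j in sorted(links):
        # Start a new run when too many source words were skipped.
        if current and i - current[-1][0] > max_gap + 1:
            runs.append(current)
            current = []
        current.append((i, j))
    if current:
        runs.append(current)

    fragments: List[Tuple[str, str]] = []
    for run in runs:
        i0, i1 = run[0][0], run[-1][0]
        targets = [j for _, j in run]
        j0, j1 = min(targets), max(targets)
        s, t = list(src[i0:i1 + 1]), list(tgt[j0:j1 + 1])
        # Discard trivial fragments: too short or identical on both sides.
        if len(s) < min_len or len(t) < min_len or s == t:
            continue
        fragments.append((" ".join(s), " ".join(t)))
    return fragments


# Hypothetical alignment links for a toy sentence pair.
src = ["a", "b", "c", "x", "y", "z", "d", "e"]
tgt = ["p", "q", "r", "s", "t"]
links = {(0, 0), (1, 1), (2, 2), (6, 3), (7, 4)}
fragments = extract_fragments(src, tgt, links)
```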

VP/PP Fragment Extraction

- two types of fragments:
  - phrases with a verb & same syntactic category
  - prepositional phrases
- discard complete sentences and trivial fragments

[Figure: word alignments between "Vince tells them to give him a blood test for heightened mercury levels." and "He asks them to run a blood test to check for mercury."; extracted fragment pair: "give him a blood test for heightened mercury levels" / "to run a blood test to check for mercury."]
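
The filtering criteria above can be sketched as a simple predicate over candidate fragment pairs. The `Fragment` record and its fields are hypothetical: in practice the category and verb information would come from a parser, which is not modelled here, and the exact rules of the system may differ.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical candidate fragment with parser-derived annotations."""
    text: str
    category: str           # phrase label of the fragment, e.g. "VP", "PP", "S"
    has_verb: bool          # whether the fragment contains a verb
    is_full_sentence: bool  # whether the fragment spans the whole sentence


def keep_pair(left: Fragment, right: Fragment) -> bool:
    """Keep a pair if both sides are PPs, or verb phrases of the same category."""
    if left.is_full_sentence or right.is_full_sentence:
        return False  # discard complete sentences
    if left.text.lower() == right.text.lower():
        return False  # discard trivial (identical) fragments
    if left.category != right.category:
        return False  # both sides must share the syntactic category
    return left.category == "PP" or (left.has_verb and right.has_verb)


vp_left = Fragment("give him a blood test for heightened mercury levels",
                   "VP", True, False)
vp_right = Fragment("run a blood test to check for mercury", "VP", True, False)
pp_left = Fragment("for heightened mercury levels", "PP", False, False)
pp_right = Fragment("for mercury", "PP", False, False)
```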

Outline

- Paraphrasing & Discourse Knowledge √
- System Overview √
- Evaluation

Evaluation: Sentence Matching

- Baselines to measure contribution of semantic similarity & MSA:
  - MSA with BLEU as score function
  - Clustering (no sequential information) with Vector Similarities
  - Clustering with BLEU scores

- Evaluation Set: from each baseline and the system, pick 400 pairs labelled as paraphrase; add 400 completely random pairs
- 2 annotators label each pair as paraphrase, containment, related or unrelated
- conflicts resolved by 3rd annotator
- for the final evaluation, we divide the set into unrelated pairs and good matches
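
A binary evaluation over such labels can be sketched as below. The mapping is an assumption made for illustration: every label other than "unrelated" is counted as a good match, which may not be the exact split used in the evaluation. The function name `evaluate` and the toy data are hypothetical.

```python
from typing import Dict, Sequence


def evaluate(gold_labels: Sequence[str],
             predicted: Sequence[bool]) -> Dict[str, float]:
    """Precision, recall, F-score and accuracy for binary match decisions.

    gold_labels: one of "paraphrase", "containment", "related", "unrelated".
    predicted: True if the system proposed the pair as a match.
    Assumption: any label except "unrelated" counts as a good match.
    """
    good = [label != "unrelated" for label in gold_labels]
    tp = sum(g and p for g, p in zip(good, predicted))
    fp = sum((not g) and p for g, p in zip(good, predicted))
    fn = sum(g and (not p) for g, p in zip(good, predicted))
    tn = sum((not g) and (not p) for g, p in zip(good, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f_score": f_score, "accuracy": accuracy}


# Toy data, not taken from the evaluation set.
metrics = evaluate(
    ["paraphrase", "unrelated", "containment", "unrelated"],
    [True, True, False, False],
)
```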

[Figure: bar chart of Precision, Recall, F-Score and Accuracy (y-axis 0 to 1.00) for Random, Cluster + BLEU, Cluster + Vector, MSA + BLEU and MSA + Vector]

Evaluation: Fragment Extraction

- evaluation of 3 main configurations: Basic (= Alignments + Chunker), VP/PP (clauses + PPs), VP/PP + Coreference Resolution (preprocessing)
- Gold Standard: 150 pairs per configuration (~same labeling scheme as for sentences)
- Precision is evaluated against gold standard
- Recall is hard to determine, we note productivity instead (= #fragments per sentence pair)
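
The two measures can be written down directly. The sketch below is illustrative only: the function names are hypothetical and the numbers in the example are made up, not results from the evaluation.

```python
from typing import Sequence


def precision(judgements: Sequence[bool]) -> float:
    """Fraction of sampled fragment pairs judged correct in the gold standard."""
    return sum(judgements) / len(judgements) if judgements else 0.0


def productivity(n_fragment_pairs: int, n_sentence_pairs: int) -> float:
    """Number of extracted fragment pairs per input sentence pair."""
    return n_fragment_pairs / n_sentence_pairs if n_sentence_pairs else 0.0


# Made-up example values.
p = precision([True, False, True, True])
prod = productivity(30, 20)
```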

[Figure: bar chart of Precision and Productivity (y-axis 0 to 1.00) for Basic, VP/PP and VP/PP + Coref]

Influence of Discourse Information on Fragment Extraction

[Figure: bar chart of Precision and Productivity (y-axis 0 to 1.00) for Cluster + VP and MSA + VP; annotation: 30x more good fragment pairs per sentence pair]

Conclusion

- Discourse Knowledge for Paraphrase Extraction
- a new, highly parallel corpus
- Multiple Sequence Alignment for sentence matching
- (grammatical) paraphrase fragments
- discourse information gives big advantages in all processing stages

Future Work

- use MSA with clauses instead of sentences
- with a temporal classifier as preprocessing, use arbitrary comparable corpora
- align actual discourse trees (e.g. in RST or SDRT style)

Dataset in supplementary material: http://www.aclweb.org/supplementals/D/D12/D12-1084.Attachment.zip

Thanks! Questions?
