PARAPHRASE AND TEXTUAL ENTAILMENT

RECOGNITION AND GENERATION

Prodromos Malakasiotis

PH.D. THESIS

DEPARTMENT OF INFORMATICS
ATHENS UNIVERSITY OF ECONOMICS AND BUSINESS

2011

Abstract

Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often very similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. In this thesis, we focus on paraphrase and textual entailment recognition, as well as paraphrase generation. We propose three paraphrase and textual entailment recognition methods, experimentally evaluated on existing benchmarks. The key idea is that by capturing similarities at various abstractions of the inputs, we can recognize paraphrases and textual entailment reasonably well. Additionally, we exploit WordNet and use features that operate on the syntactic level of the language expressions. The best of our three recognition methods achieves state of the art results on the widely used

MSR paraphrasing corpus, but the simplest of our methods is also a very competitive baseline. On textual entailment datasets, our methods achieve worse results. Nevertheless, they perform reasonably well, despite being simpler than several other proposed methods; therefore, they can be considered as competitive baselines for future work.


On the generation side, we propose a novel approach to paraphrasing sentences that produces candidate paraphrases by applying paraphrasing rules and then ranks the candidates. Furthermore, we propose a new methodology to evaluate the ranking components of generate-and-rank paraphrase generators. A new paraphrasing dataset was also created for evaluations of this kind, which was then used to show that the performance of our method’s ranker improves when features from our paraphrase recognizer are also included. Furthermore, we show that our method compares well against a state of the art paraphrase generator, and we present evidence that our method might benefit from additional training data. Finally, we present three in vitro studies of how paraphrase generation can be used in question answering, natural language generation, and Web advertisements.

Acknowledgements

I would like to thank my supervisor, Ion Androutsopoulos, as well as my co-supervisors, Theodore Kalampoukis and Michalis Vazirgiannis, for their guidance throughout my studies at Athens University of Economics and Business (AUEB). The work of this thesis was supported by a scholarship from the Greek PENED 2003 programme, which was co-funded by the European Union (80%) and the Greek General Secretariat for Research and Technology (20%). I would also like to thank all the members of the Natural Language Processing

Group of AUEB’s Department of Informatics, especially Dimitris Galanis and Makis Lampouras for the long discussions we had concerning the problems encountered during my research. A big thanks goes to my family and all my friends for their support and patience all these years. Especially to Matina Charami, who was always there when I needed her, to support me and help me get through many difficulties. Last but not least, I would like to thank my three dogs, Liza, Dahlia, and Phantom, for the moments of joy they have given me, lifting up my spirit when needed.

Contents

Abstract ii

Acknowledgements iv

Contents v

List of Tables viii

List of Figures x

1 Introduction 1
1.1 Paraphrasing and textual entailment ...... 1
1.2 Contribution of this thesis ...... 3
1.2.1 Extensive survey and classification of previous methods ...... 3
1.2.2 Paraphrase and textual entailment recognition ...... 4
1.2.3 Paraphrase generation ...... 5
1.2.4 Applications of paraphrase generation ...... 6
1.3 Outline of the remainder of this thesis ...... 6

2 Background and related work 7
2.1 Introduction ...... 7
2.2 Paraphrase and textual entailment recognition ...... 16
2.3 Paraphrase and textual entailment generation ...... 27


2.4 Paraphrase and textual entailment extraction ...... 37
2.5 Conclusions ...... 51

3 Paraphrase and Textual Entailment Recognition 53
3.1 Introduction ...... 53
3.2 Datasets ...... 55
3.3 Evaluation measures ...... 59
3.4 Our recognition methods ...... 60
3.4.1 Maximum Entropy classifier ...... 61
3.4.2 String similarity measures ...... 63
3.4.3 Combining similarity measures ...... 65
3.4.4 The need for synonyms and grammatical relations ...... 70
3.4.5 Feature selection ...... 74
3.4.6 Baselines ...... 76
3.5 Comparing to other methods ...... 78
3.6 Conclusions and contribution of this chapter ...... 82

4 Paraphrase Generation 84
4.1 Introduction ...... 84
4.2 Generating candidate paraphrases ...... 87
4.3 A dataset of sources and candidates ...... 89

4.4 Quick introduction to SVMs and SVRs ...... 93

4.4.1 Binary classification with SVMs ...... 94

4.4.2 Ranking with SVRs ...... 96
4.5 Ranking candidate paraphrases ...... 97
4.6 Comparing to the state of the art ...... 102
4.7 Learning rate ...... 106
4.8 Conclusions and contribution of this chapter ...... 107

5 Applications of Paraphrase Generation 109
5.1 Introduction ...... 109
5.2 Paraphrasing for question answering ...... 111

5.2.1 Architecture of QA systems ...... 111
5.2.2 Paraphrasing questions ...... 113
5.3 Paraphrasing for natural language generation ...... 115

5.3.1 Architecture of NLG systems ...... 115
5.3.2 Paraphrasing sentence templates ...... 117
5.4 Paraphrasing advertisements ...... 119
5.5 Conclusions and contribution of this chapter ...... 123

6 Conclusions 125
6.1 Summary and contribution of this thesis ...... 125
6.1.1 Extensive survey and classification of previous methods ...... 125
6.1.2 Paraphrase and textual entailment recognition ...... 126
6.1.3 Paraphrase generation ...... 127
6.1.4 Applications of paraphrase generation ...... 128
6.2 Future work ...... 128

A Recognition results on all datasets 130

Bibliography 136

List of Tables

3.1 BASE1 results per corpus...... 77

3.2 BASE2 results per corpus...... 78

3.3 Paraphrase recognition results on the MSR corpus...... 79

3.4 Textual entailment recognition results on the RTE-1 corpus...... 79

3.5 Textual entailment recognition results on the RTE-2 corpus...... 79

3.6 Textual entailment recognition results (for two classes) on the RTE-3 corpus...... 80

3.7 Textual entailment recognition results (for three classes) on the RTE-3 corpus...... 80

3.8 Textual entailment recognition results (for two classes) on the RTE-4 corpus...... 80

3.9 Textual entailment recognition results (for three classes) on the RTE-4 corpus...... 81

3.10 Textual entailment recognition results (for two classes) on the RTE-5 corpus...... 81

3.11 Textual entailment recognition results (for three classes) on the RTE-5 corpus...... 81

4.1 Evaluation guidelines ...... 91
4.2 Example calculation of κ statistic ...... 92


4.3 Inter-annotator agreement...... 93

4.4 Comparison of SVR-REC and ZHAO-ENG...... 105

A.1 INIT results per corpus. 133 features are used...... 131

A.2 INIT+WN results per corpus. 133 features are used...... 131

A.3 INIT+WN+DEP results per corpus. 136 features are used...... 131

A.4 FHC results per corpus...... 132

A.5 FBS results per corpus...... 132

A.6 BHC results per corpus...... 132

A.7 BBS results per corpus...... 133

A.8 Results on MSR corpus...... 133

A.9 Results on RTE-1 dataset...... 133

A.10 Results on RTE-2 dataset...... 134

A.11 Results on RTE-3 dataset...... 134

A.12 Results on RTE-4 dataset...... 134

A.13 Results on RTE-5 dataset ...... 135

List of Figures

2.1 Two sentences that are very similar when viewed at the level of dependency trees ...... 18
2.2 An example of how dependency trees may make it easier to match a short sentence (subtree inside the dashed line) to a part of a longer one ...... 23
2.3 Paraphrase and textual entailment recognition via supervised machine learning ...... 24
2.4 Generating paraphrases of “X wrote Y” by bootstrapping ...... 36
2.5 Dependency trees of sentences (2.54) and (2.57) ...... 39
2.6 Word lattices obtained from sentence clusters in Barzilay and Lee’s method ...... 42
2.7 Merging parse trees of aligned sentences in Pang et al.’s method ...... 43
2.8 Finite state automaton produced by Pang et al.’s method ...... 44

3.1 INIT error rate learning curves on the MSR corpus...... 68

3.2 INIT error rate learning curves on the RTE-3 corpus for 2 classes. . . . . 68

3.3 INIT error rate learning curves on the RTE-3 corpus for 3 classes ...... 69
3.4 Grammatical relations of example 3.22–3.23 ...... 72
3.5 Grammatical relations of example 3.24–3.25 ...... 73

3.6 INIT+WN+DEP and INIT error rate learning curves on the MSR corpus. . 73

3.7 BBS error rate learning curves on the RTE-3 corpus for 2 classes. . . . . 75


3.8 FHC error rate learning curves on the RTE-3 corpus for 3 classes. . . . . 76

4.1 Distribution of overall paraphrase quality scores in the evaluation dataset. 90

4.2 Comparison of SVR-REC and SVR-BASE...... 100

4.3 Grammaticality of SVR-REC and ZHAO-ENG...... 104

4.4 Meaning preserv. of SVR-REC and ZHAO-ENG...... 104

4.5 Diversity of SVR-REC and ZHAO-ENG...... 105

4.6 Learning curves of SVR-REC-BIN and ME-REC...... 107

5.1 A typical QA system architecture...... 111

5.2 Paraphrase generation results for QA...... 114

5.3 Paraphrase generation results for NLG ...... 117
5.4 Google advertisements when searching for car rentals ...... 120
5.5 Paraphrase generation results for Google ads ...... 121

Chapter 1

Introduction

1.1 Paraphrasing and textual entailment

In recent years, significant effort has been devoted to research on paraphrasing and textual entailment (Androutsopoulos and Malakasiotis, 2010; Madnani and Dorr, 2010). Paraphrasing methods recognize, generate, or extract (e.g., from corpora) paraphrases, meaning phrases, sentences, or longer texts that convey the same, or almost the same information. For example, (1.1)–(1.3) are paraphrases.

(1.1) Leo Tolstoy wrote “War and Peace”.

(1.2) “War and Peace” was written by Leo Tolstoy.

(1.3) Leo Tolstoy is the writer of “War and Peace”.

Paraphrasing methods may also operate on templates of natural language expressions, like (1.4)–(1.6), where the slots X and Y can be filled in with arbitrary phrases; e.g., X = “Jules Verne” and Y = “Around the World in Eighty Days”.

(1.4) X wrote Y.

(1.5) Y was written by X.

(1.6) X is the writer of Y.


Textual entailment methods, on the other hand, recognize, generate, or extract pairs ⟨T, H⟩ of natural language expressions, such that a human who reads (and trusts) T would infer that H is most likely also true (Dagan et al., 2006). For example, (1.7) textually entails (1.8), but (1.9) does not textually entail (1.10).1

(1.7) The drugs that slow down Alzheimer’s disease work best the earlier you administer them.

(1.8) Alzheimer’s disease can be slowed down using drugs.

(1.9) Drew Walker, Tayside’s public health director, said: “It is important to stress that this is not a confirmed case of rabies.”

(1.10) A case of rabies was confirmed.

As in paraphrasing, textual entailment methods may also operate on templates. The natural language expressions that paraphrasing and textual entailment methods consider are not always statements. In fact, many of these methods were developed

having question answering (QA) systems in mind. In QA systems for document collections (Voorhees, 2001; Pasca, 2003; Harabagiu and Moldovan, 2003; Mollá and Vicedo, 2007), a question may be phrased differently than in a document that contains the answer, and taking such variations into account can improve system performance significantly (Harabagiu et al., 2003; Duboue and Chu-Carroll, 2006; Harabagiu and Hickl, 2006; Riezler et al., 2007; Moldovan and Rus, 2001; Duclaye et al., 2003; Tomuro, 2003). Paraphrasing and textual entailment methods are also useful in several other natural language processing applications, including for example text summarization (Mani, 2001; Hovy, 2003), especially multi-document summarization (Barzilay and McKeown, 2005), sentence compression (Knight and Marcu, 2002; McDonald, 2006; Cohn and Lapata, 2008; Clarke and Lapata, 2008a; Cohn and Lapata, 2009; Zhao et al., 2009a; Galanis and Androutsopoulos, 2010), information extraction systems (Muslea, 1999; Shinyama and Sekine, 2003; Grishman, 2003; Moens, 2006), machine translation (Lepage and Denoual, 2005; Zhang and Yamamoto, 2005; Callison-Burch et al., 2006a;

1Simplified examples from RTE-2 (Bar-Haim et al., 2006).

Zhou et al., 2006a; Kauchak and Barzilay, 2006; Madnani et al., 2007; Marton et al., 2009; Mirkin et al., 2009b; Padó et al., 2009), and natural language generation (Reiter and Dale, 2000; Bateman and Zock, 2003; Power and Scott, 2005). Among other possible applications, paraphrasing and textual entailment methods can be employed to simplify texts (Elhadad and Sutaria, 2007; Deléger and Zweigenbaum, 2009), and to automatically score student answers (Nielsen et al., 2009).

1.2 Contribution of this thesis

In this thesis, we first conduct an extensive survey of the already vast literature on para- phrasing and textual entailment. We then focus on paraphrase and textual entailment recognition, as well as on paraphrase generation.

1.2.1 Extensive survey and classification of previous methods

Although the field of paraphrasing and textual entailment has only recently become a popular research topic, there is already a vast literature on the subject. There have been six workshops on paraphrasing and/or textual entailment (Sato and Nakagawa, 2001; Inui and Hermjakob, 2003; Dolan and Dagan, 2005; Drass and Yamamoto, 2005; Sekine et al., 2007; Callison-Burch et al., 2009). The Recognizing Textual Entailment

(RTE) challenges (Dagan et al., 2006; Bar-Haim et al., 2006; Giampiccolo et al., 2007; Giampiccolo et al., 2008), currently in their sixth year, provide additional significant thrust. A special issue on textual entailment was also recently published (Dagan et al., 2009), and there are hundreds of directly relevant papers in conference proceedings. Apart from distilling the key ideas of the existing literature, the survey of this thesis contributes a classification of relevant work along different dimensions. For example, we classify previous methods not only by considering whether they pertain to paraphrasing or textual entailment, but also by considering whether they perform recognition, generation, or extraction, a distinction that was not clear in previous published work.

The main input to a paraphrase or textual entailment recognizer is a pair of language expressions (or templates), possibly in particular contexts. The output is a judgement, possibly probabilistic, indicating whether or not the members of the input pair are paraphrases or a correct textual entailment pair; the judgements must agree as much as possible with those of humans. On the other hand, the main input to a paraphrase or textual entailment generator is a single language expression (or template) at a time, possibly in a particular context. The output is a set of paraphrases of the input, or a set of language expressions that entail or are entailed by the input; the output set must be as large as possible, but including as few errors as possible. In contrast, no particular language expressions or templates are provided to a paraphrase or textual entailment extractor. The main input in this case is a corpus, for example a monolingual corpus of parallel or comparable texts, such as different English translations of the same French novel, or clusters of multiple monolingual news articles, with the articles in each cluster reporting the same event. The system outputs pairs of paraphrases (possibly templates), or pairs of language expressions (or templates) that constitute correct textual entailment pairs, based on the evidence of the corpus; the goal is again to produce as many output pairs as possible, with as few errors as possible.
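One way to picture the three settings is as three different input–output signatures, sketched below in illustrative Python; the class and method names (ParaphraseRecognizer, ParaphraseGenerator, ParaphraseExtractor, and so on) are hypothetical and only mirror the informal definitions above, not any particular system of this thesis.

```python
from typing import Iterable, List, Tuple

class ParaphraseRecognizer:
    """Hypothetical recognizer: judges a single pair of expressions (or templates)."""
    def recognize(self, text1: str, text2: str) -> float:
        # Returns a (possibly probabilistic) judgement that the pair is positive.
        raise NotImplementedError

class ParaphraseGenerator:
    """Hypothetical generator: produces many outputs for one input expression."""
    def generate(self, text: str) -> List[str]:
        # Returns paraphrases (or entailed expressions) of the input.
        raise NotImplementedError

class ParaphraseExtractor:
    """Hypothetical extractor: no particular expression is given, only a corpus."""
    def extract(self, corpus: Iterable[str]) -> List[Tuple[str, str]]:
        # Returns pairs of expressions (or templates) supported by the corpus.
        raise NotImplementedError
```

In practice a generator may internally call a recognizer to filter its candidates, and both may consult rule pairs produced by an extractor, which is why the boundaries between the three settings can blur.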

1.2.2 Paraphrase and textual entailment recognition

We propose three new paraphrase and textual entailment recognition methods (INIT,

INIT+WN, INIT+WN+DEP), which we have experimentally evaluated on existing benchmark datasets.

The main idea underlying INIT is that by capturing similarities at various abstractions of the inputs (e.g., at the surface, syntactic, and semantic level) and by using state of the art supervised machine learning algorithms, we can recognize paraphrases and textual entailment reasonably well. Additionally, INIT uses novel partial similarity

features that improve its performance. INIT+WN also exploits WordNet (Fellbaum, 1998) to capture synonymy relations, while INIT+WN+DEP adds features that operate on the syntactic level of the language expressions. Overall, INIT+WN+DEP achieves the best

accuracy results reported so far on the widely used MSR paraphrasing corpus. Despite

its simplicity, INIT is also very competitive; hence, it can be used in languages for which reliable large-scale thesauri and parsers are difficult to obtain, Greek being an example.

On the textual entailment datasets (from RTE), our methods achieve worse results

compared to the results on the MSR corpus and compared to other previous textual entailment methods. Nevertheless, our methods perform reasonably well against several

other more complicated RTE methods and, therefore, can be considered as competitive baselines for future work.

1.2.3 Paraphrase generation

We also propose a novel approach to paraphrasing sentences that produces candidate paraphrases by applying existing paraphrasing rules (extracted from corpora) and then ranks (or classifies) the candidates. Furthermore, we propose a new methodology to evaluate the ranking components of generate-and-rank paraphrase generators, which evaluates them across different combinations of weights for grammaticality, meaning preservation, and diversity. A new dataset for evaluations of this kind was also constructed during the work of this thesis, and is publicly available. By using the dataset and the proposed evaluation methodology, we show that the performance of our method’s ranker improves when, apart from features derived from a language model and the existing scores of the paraphrasing rules, features from our paraphrase recognizer are also included. We also show that our overall method compares well against a state of the art paraphrase generator, when paraphrasing rules apply to the source sentences, and we present evidence suggesting that our method might benefit further from additional training data.

1.2.4 Applications of paraphrase generation

Finally, we explore how paraphrase generation can be used by other systems. More specifically, we use our paraphrase generator to paraphrase: (i) questions to a question answering system, (ii) sentence templates used by a natural language generation system, and (iii) Web advertisements. The application experiments are small scale and in vitro, as opposed to in vivo experiments that would evaluate the actual effect of paraphrase generation on question answering, natural language generation, or Web advertising, for example by measuring user satisfaction or (in advertising) generated income. Hence, these experiments should be seen as a small preliminary step towards further future work. Nevertheless, the existing experiments already show promising results. Furthermore, they constitute one of the very few attempts we are aware of to use a general purpose paraphrase generator in question answering and in natural language generation; and we believe that the application of the generator to Web advertisements has not been attempted before.

1.3 Outline of the remainder of this thesis

The remainder of this thesis is organized as follows: chapter 2 surveys the relevant literature; chapter 3 presents our research on paraphrase and textual entailment recognition; chapter 4 presents our paraphrase generation method; chapter 5 investigates the use of paraphrase generation in other systems; chapter 6 concludes and proposes directions for future research.

Chapter 2

Background and related work1

2.1 Introduction

Paraphrasing methods recognize, generate, or extract (e.g., from corpora) paraphrases, meaning phrases, sentences, or longer texts that convey the same, or almost the same information. For example, (2.1) and (2.2) are paraphrases. Most people would also accept (2.3) as a paraphrase of (2.1) and (2.2), though it could be argued that in (2.3) the writing of the book, an accomplishment in Vendler’s (1967) aspectual taxonomy, has not necessarily been completed, unlike (2.1) and (2.2); the distinction may be clearer in (2.4)–(2.6), where again (2.6) does not imply that the construction has been completed, unlike (2.4) and (2.5). Such fine distinctions, however, are usually ignored, which is why we say that paraphrases may convey almost the same information.

(2.1) Leo Tolstoy wrote “War and Peace”.

(2.2) “War and Peace” was written by Leo Tolstoy.

(2.3) Leo Tolstoy is the writer of “War and Peace”.

(2.4) Wonderworks Ltd. constructed the new bridge.

1An extended version of this chapter has been published as a journal article (Androutsopoulos and Malakasiotis, 2010).


(2.5) The new bridge was constructed by Wonderworks Ltd.

(2.6) Wonderworks Ltd. is the constructor of the new bridge.

Paraphrasing methods may also operate on templates of natural language expressions, like (2.7)–(2.9); here the slots X and Y can be filled in with arbitrary noun phrases. Templates specified at the syntactic or semantic level may also be used, where the slot fillers may be required to have particular syntactic relations (e.g., verb-object) to other words or constituents, or to satisfy semantic constraints (e.g., requiring Y to denote a book).

(2.7) X wrote Y.

(2.8) Y was written by X.

(2.9) X is the writer of Y.

Textual entailment methods, on the other hand, recognize, generate, or extract pairs ⟨T, H⟩ of natural language expressions, such that a human who reads (and trusts) T would infer that H is most likely also true (Dagan et al., 2006). For example, (2.10) textually entails (2.11), but (2.12) does not textually entail (2.13).2

(2.10) The drugs that slow down Alzheimer’s disease work best the earlier you administer them.

(2.11) Alzheimer’s disease can be slowed down using drugs.

(2.12) Drew Walker, Tayside’s public health director, said: “It is important to stress that this is not a confirmed case of rabies.”

(2.13) A case of rabies was confirmed.

As in paraphrasing, textual entailment methods may operate on templates. For example, in a discourse about painters, composers, and their work, (2.14) textually entails (2.15), for any noun phrases X and Y. However, (2.15) does not textually entail (2.14), when Y denotes a symphony composed by X. If we require textual entailment between templates to hold for all possible slot fillers, then (2.14) textually entails (2.15) in our example’s discourse, but the reverse does not hold.

2Simplified examples from RTE-2 (Bar-Haim et al., 2006).

(2.14) X painted Y.

(2.15) Y is the work of X.

In general, we cannot judge if two natural language expressions are paraphrases or a correct textual entailment pair without selecting particular readings of the expressions, among those that may be possible due to multiple word senses, syntactic ambiguities etc. For example, (2.16) textually entails (2.17) with the financial sense of “bank”, but not when (2.16) refers to the bank of a river.

(2.16) A bomb exploded near the French bank.

(2.17) A bomb exploded near a building.

One possibility, then, is to examine the language expressions (or templates) only in particular contexts that make their intended readings clear. Alternatively, we may want to treat as correct any textual entailment pair ⟨T, H⟩ for which there are possible readings of T and H, such that a human who reads T would infer that H is most likely also true; then, if a system reports that (2.16) textually entails (2.17), its response is to be counted as correct, regardless of the intended sense of “bank”. Similarly, paraphrases would have possible readings conveying almost the same information. If we represent the meanings of natural language expressions by logic formulae, for example in first-order predicate logic, we may think of textual entailment and paraphrasing in terms of logical entailment (|=). If the logical meaning representations of

T and H are φT and φH, then ⟨T, H⟩ is a correct textual entailment pair if and only if

(φT ∧ B) |= φH; B is a knowledge base, for simplicity assumed here to have the form of a single conjunctive formula, which contains meaning postulates (Carnap, 1952) and other knowledge assumed to be shared by all language users.3 Let us consider the example below, where logical terms starting with capital letters are constants; we assume that different word senses would give rise to different predicate symbols. Let us also assume that B contains only ψ. Then (φT ∧ ψ) |= φH holds, i.e., φH is true for any interpretation (e.g., model-theoretic) of constants, predicate names and other domain-dependent atomic symbols, for which φT and ψ both hold. A sound and complete automated reasoner (e.g., based on resolution, in the case of first-order predicate logic) could be used to confirm that the logical entailment holds. Hence, T textually entails H, assuming again that the meaning postulate ψ is available.

3Zaenen et al. (2005) provide examples showing that linguistic and world knowledge cannot often be separated.

T : Leonardo da Vinci painted the Mona Lisa.

φT : isPainterOf(DaVinci,MonaLisa)

H : Mona Lisa is the work of Leonardo da Vinci.

φH : isWorkOf(MonaLisa,DaVinci)

ψ : ∀x∀y isPainterOf(x,y) ⇒ isWorkOf(y,x)

The reverse, however, does not hold, i.e., (φH ∧ ψ) ⊭ φT; the implication (⇒) of ψ would have to be made bidirectional (⇔) for the reverse to hold. As a further example, to establish that T′ textually entails H′, we need a formula like ψ′ in B:

T′ : Athens is the capital of Greece.

φT′ : isCapitalOf(Athens,Greece)

H′ : Athens is a city in Greece.

φH′ : isCity(Athens) ∧ isLocatedIn(Athens,Greece)

ψ′ : ∀x∀y isCapitalOf(x,y) ⇒ (isCity(x) ∧ isLocatedIn(x,y))
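To make the role of the meaning postulate ψ′ concrete, the following toy sketch propositionalizes this example and applies the postulate by one step of forward chaining over ground facts; it is only an illustration of the entailment check, a stand-in for the sound and complete reasoners mentioned above, and the fact encoding is an assumption made just for the example.

```python
# Toy check of (phi_T' AND psi') |= phi_H' on ground facts.
# Facts are tuples such as ("isCapitalOf", "Athens", "Greece").

def apply_capital_postulate(facts):
    """psi': isCapitalOf(x, y) => isCity(x) AND isLocatedIn(x, y)."""
    derived = set(facts)
    for fact in facts:
        if fact[0] == "isCapitalOf":
            _, x, y = fact
            derived.add(("isCity", x))
            derived.add(("isLocatedIn", x, y))
    return derived

phi_T = {("isCapitalOf", "Athens", "Greece")}      # T': Athens is the capital of Greece.
phi_H = {("isCity", "Athens"),                     # H': Athens is a city ...
         ("isLocatedIn", "Athens", "Greece")}      # ... in Greece.

entails = phi_H <= apply_capital_postulate(phi_T)  # subset test on the derived facts
print(entails)  # True: T' textually entails H', given psi'
```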

Similarly, if the logical meaning representations of T1 and T2 are φ1 and φ2, then T1 is a paraphrase of T2 iff (φ1 ∧ B) |= φ2 and (φ2 ∧ B) |= φ1, where again B contains meaning postulates and common sense knowledge. Ideally, sentences like (2.1) and (2.2) would be represented by the same formula, making it clear that they are paraphrases, regardless of the contents of B. Otherwise, it may sometimes be unclear if T1 and T2 should be considered paraphrases, because it may be unclear if some knowledge should be considered part of B. Since natural language expressions are often ambiguous, especially out of context, we may again want to adopt looser definitions, so that T textually entails H iff there are possible readings of T and H, represented by φT and φH, such that (φT ∧ B) |= φH, and similarly for paraphrases. Thinking of textual entailment and paraphrasing in terms of logical entailment allows us to borrow notions and methods from logic. Indeed, some paraphrasing and textual entailment recognition methods map natural language expressions to logical formulae, and then examine if logical entailments hold. This is not, however, the only possible approach. Many other, if not most, methods currently operate on surface strings or syntactic representations, without mapping natural language expressions to formal meaning representations. Note, also, that in methods that map natural language to logical formulae, it is important to work with a form of logic that provides adequate support for logical entailment checks; full first-order predicate logic may be inappropriate, as it is semi-decidable.

The natural language expressions that paraphrasing and textual entailment methods consider are not always statements. In question answering (QA) systems for document collections (Voorhees, 2001; Pasca, 2003; Harabagiu and Moldovan, 2003; Mollá and Vicedo, 2007), a question may be phrased differently than in a document that contains the answer, and taking such variations into account can improve system performance significantly (Harabagiu et al., 2003; Duboue and Chu-Carroll, 2006; Harabagiu and Hickl, 2006). Natural language interfaces to databases may also generate question paraphrases to allow users to understand if their queries have been understood (McKeown, 1983; Androutsopoulos et al., 1995). It may actually be better to produce questions that textually entail the input ones, not necessarily paraphrases, as discussed below.

We can extend our informal definition of textual entailment (and similarly for paraphrasing) to questions: a question T textually entails a question H, if a human who receives (and trusts) an answer α to T would most likely infer that α is also an answer to H. In the following example, the user asks H, and the system generates T; if the system manages to find an answer to T, perhaps because T’s phrasing is closer to a sentence in a document collection, the same answer can be used to respond to H. To reuse with questions our existing logic-based definition of textual entailment (and paraphrasing), which was formulated for statements, let us use identical fresh constants (in effect, Skolem constants) across questions to represent the unknown entities the questions ask for; we mark such constants with question marks as subscripts, but in logical entailment checks they can be treated as ordinary constants.4 In the example below, assuming that the meaning postulate ψ is available in B, (φT ∧ B) |= φH, i.e., for any interpretation of the predicate symbols and constants, if (φT ∧ B) is true, then φH is necessarily also true. Hence, T textually entails H.

T(generated) : Who painted the Mona Lisa?

φT : isAgent(W?) ∧ isPainterOf(W?,MonaLisa)

H(asked) : Whose work is the Mona Lisa?

φH : isAgent(W?) ∧ isWorkOf(MonaLisa,W?)

ψ : ∀x∀y isPainterOf(x,y) ⇒ isWorkOf(y,x)

4Logic-based paraphrasing and textual entailment methods may actually represent interrogatives as free variables, relying on unification to obtain their values (Moldovan and Rus, 2001; Rinaldi et al., 2003).

Alternatively, a QA system may retrieve relevant documents or passages, using the input question as a query to an information retrieval or Web search engine (Baeza-Yates and Ribeiro-Neto, 1999; Manning, 2008), and then check if any of the retrieved texts textually entails a candidate answer (Moldovan and Rus, 2001; Duclaye et al., 2003). For example, if the input question is (2.18) and the search engine returns passage (2.19), the system may check if (2.19) textually entails any of the candidate answers of (2.20), where we have replaced the interrogative “who” of (2.18) with all the expressions of (2.19) that a named entity recognizer (Bikel et al., 1999; Sekine and Ranchhod, 2009) would ideally have recognized as person names.5

(2.18) Who sculpted the Doryphoros?

(2.19) The Doryphoros is one of the best known Greek sculptures of the classical era in Western Art. The Greek sculptor Polykleitos designed this work as an example of the “canon” or “rule”, showing the perfectly harmonious and balanced proportions of the human body in the sculpted form. The sculpture was known through the Roman marble replica found in Herculaneum and conserved in the Naples National Archaeological Museum, but, according to Francis Haskell and Nicholas Penny, early connoisseurs passed it by in the royal Bourbon collection at Naples without notable comment.

(2.20) Polykleitos/Francis Haskell/Nicholas Penny sculpted the Doryphoros.
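A rough sketch of this candidate-generation step is given below; it is only an illustration, the person names are hard-coded where a named entity recognizer would normally supply them, and score_entailment is a hypothetical placeholder for a textual entailment recognizer rather than an actual system.

```python
def candidate_answers(question: str, person_names: list[str]) -> list[str]:
    """Replace the interrogative "Who" with each recognized person name, as in (2.20)."""
    assert question.startswith("Who ")
    statement = question.rstrip("?")
    return [statement.replace("Who", name, 1) + "." for name in person_names]

def score_entailment(text: str, hypothesis: str) -> float:
    """Hypothetical entailment recognizer: how strongly does text entail hypothesis?"""
    raise NotImplementedError

names = ["Polykleitos", "Francis Haskell", "Nicholas Penny"]   # ideally from an NER system
candidates = candidate_answers("Who sculpted the Doryphoros?", names)
print(candidates)
# ['Polykleitos sculpted the Doryphoros.', 'Francis Haskell sculpted the Doryphoros.',
#  'Nicholas Penny sculpted the Doryphoros.']
# Each candidate would then be checked against the retrieved passage (2.19), e.g.:
# best = max(candidates, key=lambda h: score_entailment(passage_2_19, h))
```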

5Passage (2.19) is based on Wikipedia’s page for Doryphoros.

Paraphrasing and textual entailment methods are also useful in several other natural language processing applications. In text summarization (Mani, 2001; Hovy, 2003), an important processing stage typically identifies the most important sentences of the texts to be summarized. During that stage, especially when generating a single summary from several documents (Barzilay and McKeown, 2005), it is important to avoid selecting sentences that convey the same information or information that follows from other sentences. Sentence compression (Knight and Marcu, 2002; Cohn and Lapata, 2009), often also a processing stage of text summarization, can be seen as a special case of paraphrasing or textual entailment (Zhao et al., 2009a), with the additional constraints that the resulting sentence must be as short as possible, but still grammatical, and it must preserve the most important information of the original sentence. Information extraction systems (Grishman, 2003; Moens, 2006) often rely on manually or automatically crafted patterns (Muslea, 1999) to identify sentences reporting particular types of semantic relations; paraphrasing or textual entailment methods can produce additional extraction patterns (Shinyama and Sekine, 2003) to increase recall. In natural language generation (Reiter and Dale, 2000; Bateman and Zock, 2003), paraphrasing can be used to avoid repeating the same phrasings, or to produce alternative expressions that improve text coherence or adhere to writing style, layout guidelines, or other constraints (Power and Scott, 2005). Paraphrasing and textual entailment methods can also be used to simplify texts, for example by replacing specialized (e.g., medical) terms with expressions non-experts can understand (Elhadad and Sutaria, 2007; Deléger and Zweigenbaum, 2009), and to automatically score student answers against reference answers (Nielsen et al., 2009). Furthermore, they can be used in machine translation (Koehn, 2009), when evaluating machine-generated translations against human-authored ones that may use different phrasings (Lepage and Denoual, 2005; Zhou et al., 2006a; Kauchak and Barzilay, 2006; Padó et al., 2009); and they can improve machine translation by allowing paraphrases or entailed language expressions to be used instead of source language expressions (e.g., idiomatic phrases) that have not been encountered in training corpora (Callison-Burch et al., 2006a; Marton et al., 2009; Mirkin et al., 2009b), or by allowing additional reference translations to be generated automatically from human-authored ones when training machine translation systems (Madnani et al., 2007).

There have been six workshops on paraphrasing and/or textual entailment (Sato and Nakagawa, 2001; Inui and Hermjakob, 2003; Dolan and Dagan, 2005; Drass and Yamamoto, 2005; Sekine et al., 2007; Callison-Burch et al., 2009) in recent years.6 The

Recognizing Textual Entailment (RTE) challenges (Dagan et al., 2006; Bar-Haim et al., 2006; Giampiccolo et al., 2007; Giampiccolo et al., 2008), currently in their fifth year, provide additional significant thrust.7 A special issue on textual entailment (Dagan et al., 2009) was also recently published. Consequently, there is a large number of published articles, proposed methods, and resources related to paraphrasing and textual entailment.8 To provide a clearer view of the different goals and assumptions of the methods that have been proposed, we classify them along two dimensions: whether they are paraphrasing or textual entailment methods; and whether they perform recognition, generation, or extraction of paraphrases or textual entailment pairs. These distinctions are not always clear in the literature, especially the distinctions along the second dimension, which we explain below. It is also possible to classify methods along other dimensions, for example depending on whether they operate on language expressions or templates of language expressions; or whether they operate on phrases, sentences or longer texts.

The main input to a paraphrase or textual entailment recognizer is a pair of language expressions (or templates), possibly in particular contexts. The output is a judgement, possibly probabilistic, indicating whether or not the members of the input pair are paraphrases or a correct textual entailment pair; the judgements must agree as much as

6The proceedings of the five more recent workshops are available in the ACL anthology; consult http://www.aclweb.org/anthology/.
7The RTE challenges were initially organized by the European PASCAL network of excellence (see http://pascallin.ecs.soton.ac.uk/Challenges/), and subsequently as part of NIST’s Text Analysis Conference (TAC, see http://www.nist.gov/tac/).
8A textual entailment portal has been established, as part of the ACL wiki, to help organize all relevant material; consult http://www.aclweb.org/aclwiki/index.php?title=Textual_Entailment_Portal. The slides of Dagan, Roth, and Zanzotto’s ACL 2007 tutorial on textual entailment are also available at http://www.cs.biu.ac.il/∼dagan/TE-Tutorial-ACL07.ppt.

possible with those of humans. On the other hand, the main input to a paraphrase or textual entailment generator is a single language expression (or template) at a time, possibly in a particular context. The output is a set of paraphrases of the input, or a set of language expressions that entail or are entailed by the input; the output set must be as large as possible, but including as few errors as possible. In contrast, no particular language expressions or templates are provided to a paraphrase or textual entailment extractor. The main input in this case is a corpus, for example a monolingual corpus of parallel or comparable texts, such as different English translations of the same French novel, or clusters of multiple monolingual news articles, with the articles in each cluster reporting the same event. The system outputs pairs of paraphrases (possibly templates) or pairs of language expressions (or templates) that constitute correct textual entailment pairs, based on the evidence of the corpus; the goal is again to produce as many output pairs as possible, with as few errors as possible. Note that the boundaries between recognizers, generators, and extractors may not always be clear. For example, a paraphrase generator may invoke a paraphrase recognizer to filter out erroneous candidate paraphrases; and a recognizer or a generator may consult a collection of template pairs produced by an extractor.

In sections 2.2, 2.3, and 2.4 below we consider in turn recognition, generation, and extraction methods for both paraphrasing and textual entailment. In each of the three sections, we attempt to identify and explain prominent ideas, pointing also to relevant articles and resources. In section 2.5, we conclude.

2.2 Paraphrase and textual entailment recognition

Paraphrase and textual entailment recognizers judge whether or not two given language expressions (or templates) constitute paraphrases or a correct textual entailment pair. One possibility is to map the language expressions to logical meaning representations, and then rely on logical entailment checks, possibly by invoking theorem provers (Rinaldi et al., 2003; Bos and Markert, 2005; Tatu and Moldovan, 2005; Tatu and Moldovan, 2007). In the case of textual entailment, this involves generating pairs of formulae ⟨φT, φH⟩ for T and H (or their possible readings), and then checking if (φT ∧ B) |= φH, where B contains meaning postulates and common sense knowledge, as already discussed. In practice, however, it may be very difficult to formulate a reasonably complete B. A partial solution to this problem is to obtain common sense knowledge from resources like WordNet (Fellbaum, 1998) or Extended WordNet (Moldovan and Rus, 2001).9 The latter also includes logical meaning representations extracted from WordNet’s glosses. For example, since “assassinate” is a hyponym (more specific sense) of “kill” in WordNet, an axiom like the following can be added to B (Moldovan and Rus, 2001; Bos and Markert, 2005; Tatu and Moldovan, 2007).

∀x∀y assassinate(x,y) ⇒ kill(x,y)

Additional axioms can be obtained from FrameNet’s frames (Baker et al., 1998; Lonneker-Rodman and Baker, 2009), as discussed for example by Tatu et al. (2005), or from similar resources like VerbNet (Schuler, 2005) and on-line encyclopedias (Iftene and Balahur-Dobrescu, 2007). A more common approach is to work at the syntax level. Dependency grammar parsers (Melcuk, 1987; Kubler et al., 2009) are popular in paraphrasing and textual entailment research, as in other natural language processing areas recently. Instead of showing hierarchically the syntactic constituents (e.g., noun phrases, verb phrases) of a sentence, the output of a dependency grammar parser is a graph (usually a tree) whose nodes are the words of the sentence and whose (labeled) edges correspond to syntactic dependencies between words, for example the dependency between a verb

9Available from http://wordnet.princeton.edu/ and http://xwn.hlt.utdallas.edu/.

[Figure 2.1 shows the dependency trees of “A mathematician solved the problem” and “The problem was solved by a young mathematician”, with labeled edges such as subj, obj, aux, by, det, and mod.]

Figure 2.1: Two sentences that are very similar when viewed at the level of dependency trees.

and the head noun of its subject noun phrase, or the dependency between a noun and an adjective that modifies it. Figure 2.1 shows the dependency trees of two sentences. The exact form of the trees and the edge labels would differ, depending on the parser; for simplicity, we show prepositions as edges. If we ignore word order and the auxiliary “was” of the passive (right) sentence, and if we take into account that the by edge of the passive sentence corresponds to the subj edge of the active (left) one, the only difference is the extra adjective of the passive sentence. Hence, it is easy to figure out from the dependency trees that the two sentences have very similar meanings, despite their differences in word order. Strictly speaking, the right sentence textually entails the left one, not the reverse, because of the “young” in the right sentence. Some paraphrase recognizers simply count the common edges of the dependency trees of the input expressions (Wan et al., 2006; Malakasiotis, 2009a) or use other tree similarity measures. A large similarity (e.g., above a threshold) indicates that the input expressions may be paraphrases. Tree edit distance (Selkow, 1977; Tai, 1979; Zhang and Shasha, 1989) is an example of a similarity measure that can be applied to parse trees; it computes the sequence of operator applications (e.g., add, replace, or remove a node or edge) with the minimum cost that turns one tree into the other.10 To obtain

more accurate predictions, it is important to devise an appropriate inventory of operators and assign appropriate costs to the operators during a training stage (Kouylekov and Magnini, 2005; Mehdad, 2009; Harmeling, 2009). For example, replacing a noun with one of its synonyms should be less costly than replacing it with an unrelated word; and removing a dependency between a verb and an adverb should perhaps be less costly than removing a dependency between a verb and the head noun of its subject or object.

Other paraphrase recognition methods operate at the surface string level, for example by computing the string edit distance (Levenshtein, 1966) of the two input language expressions, the number of their common words, or by combining several string similarity measures (Malakasiotis and Androutsopoulos, 2007), including measures originating from machine translation evaluation (Finch et al., 2005; Perez and Alfonseca, 2005; Zhang and Patrick, 2005; Wan et al., 2006). The latter have been developed to automatically compare machine-generated translations against human-authored reference translations. A well known measure is BLEU (Papineni et al., 2002; Zhou et al., 2006a), which roughly speaking examines the percentage of word n-grams (sequences of consecutive words) of the machine-generated translations that also occur in the reference translations, and takes the geometric average of the percentages obtained for different values of n. Although such n-gram based measures have been criticised in machine translation evaluation (Callison-Burch et al., 2006b), for example because they are unaware of synonyms and longer paraphrases, they can be combined with other measures to build paraphrase (and textual entailment) recognizers (Zhou et al., 2006a; Kauchak and Barzilay, 2006; Padó et al., 2009), which may help address the current problems of automated machine translation evaluation.

10A suite to recognize textual entailment by computing edit distances is available at http://edits.fbk.eu/.
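As a simplified illustration of such surface measures (not the feature set used later in this thesis), the sketch below computes a word-level edit distance and a BLEU-like n-gram precision for a pair of expressions; the function names are invented for the example.

```python
def edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance between two token sequences (insert/delete/replace cost 1)."""
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)] for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1,                           # delete a[i-1]
                          d[i][j - 1] + 1,                           # insert b[j-1]
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # replace or keep
    return d[len(a)][len(b)]

def ngram_precision(candidate: list[str], reference: list[str], n: int = 2) -> float:
    """Fraction of the candidate's n-grams that also occur in the reference (no brevity penalty)."""
    grams = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    ref = {tuple(reference[i:i + n]) for i in range(len(reference) - n + 1)}
    return sum(g in ref for g in grams) / max(len(grams), 1)

s1 = "Leo Tolstoy wrote War and Peace".lower().split()
s2 = "War and Peace was written by Leo Tolstoy".lower().split()
print(edit_distance(s1, s2), ngram_precision(s1, s2))
```

Scores like these would typically be normalized (e.g., by the length of the longer expression) before being used as similarity features.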

WordNet (Fellbaum, 1998), automatically constructed collections of near synonyms (Lin, 1998a; Moore, 2001; Brockett and Dolan, 2005), or resources like NOMLEX (Meyers et al., 1998) and CATVAR (Habash and Dorr, 2003) that provide nominalizations of verbs and other derivationally related words across different POS categories

(e.g., “to invent” and “invention”), can be used to match synonyms, hypernyms–hyponyms, or, more generally, semantically related words across the two input expressions.11 According to WordNet, in (2.21)–(2.22) “shares” is a direct hyponym (more specific meaning) of “stock”, “slumped” is a direct hyponym of “dropped”, and “company” is an indirect hyponym (two levels down) of “organization”.12 By treating semantically similar words (e.g., synonyms, or hypernyms-hyponyms up to a small hierarchical distance) as identical (Rinaldi et al., 2003; Finch et al., 2005; Tatu et al., 2006; Iftene and Balahur-Dobrescu, 2007; Malakasiotis, 2009a; Harmeling, 2009), or by considering (e.g., counting) semantically similar words across the two input language expressions (Brockett and Dolan, 2005; Bos and Markert, 2005), paraphrase recognizers may be able to cope with paraphrases that have very similar meanings, but very few or no common words.

(2.21) The shares of the company dropped.

(2.22) The organization’s stock slumped.
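As a minimal illustration of how such resources can be consulted (again, not the recognizers of chapter 3), the sketch below uses NLTK’s WordNet interface to test whether one word is a synonym of, or a hyponym of, another; the helper name related_in_wordnet is invented for the example, and the exact WordNet distances may vary across versions.

```python
# Requires: pip install nltk, then nltk.download("wordnet").
from nltk.corpus import wordnet as wn

def related_in_wordnet(word1: str, word2: str, max_depth: int = 4) -> bool:
    """True if some sense of word1 names the same synset as, or lies up to
    max_depth levels below (is a hyponym of), some sense of word2."""
    lemma1 = wn.morphy(word1) or word1            # crude lemmatization of the input
    for sense2 in wn.synsets(word2):
        if lemma1 in sense2.lemma_names():
            return True                           # the words share a synset (synonyms)
        descendants = sense2.closure(lambda s: s.hyponyms(), depth=max_depth)
        if any(lemma1 in s.lemma_names() for s in descendants):
            return True                           # word1 is a (transitive) hyponym of word2
    return False

# Following the example in the text, "company" should be found below "organization".
print(related_in_wordnet("company", "organization"))
```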

In textual entailment recognition, it may be desirable to allow the words of T to be more distant hyponyms of the words of H, compared to paraphrase recognition. For example, “X is a computer” textually entails “X is an artifact”, and “computer” is a hyponym of “artifact” four levels down. Measures that exploit WordNet (or similar resources) and compute the semantic similarity between two words or, more generally, two texts have also been proposed (Leacock et al., 1998; Lin, 1998b; Resnik, 1999; Budanitsky and Hirst, 2006; Tsatsaronis, 2009).13 Some of them are directional, making them more suitable to textual entailment recognition (Corley and Mihalcea, 2005). Roughly speaking, measures of

11Consult http://nlp.cs.nyu.edu/nomlex/ and http://clipdemos.umiacs.umd.edu/catvar/.
12Modified example from Tsatsaronis (2009).
13Consult http://wn-similarity.sourceforge.net/ for implementations of many of these measures.

this kind consider (e.g., sum the lengths of) the paths in WordNet’s hierarchies (or similar resources) that connect the senses of corresponding (e.g., most similar) words across the two texts. They may also take into account information such as how frequently the words occur in the two texts (term frequency) and how rarely they are encountered in documents of a large collection (inverse document frequency). The rationale is that frequent words of the input texts that are rarely used in a general corpus are more important, as in information retrieval; hence, the paths that connect them should be assigned greater weights. Since they often consider paths between word senses, many of these measures would ideally be combined with word sense disambiguation (Yarowsky, 2000; Stevenson and Wilks, 2003; Kohomban and Lee, 2005; Navigli, 2008), which is not, however, always accurate enough for practical purposes.

The similarity measures that have been employed in paraphrase and textual entailment recognition are generally very similar, with the complication, however, that in textual entailment recognition one of the input language expressions (T) is often much longer than the other one (H). If a part of T (e.g., part of its surface string, or a subtree of its parse tree) is very similar to H (surface form or parse tree), this is an indication that H may be entailed by T. This is illustrated in (2.23)–(2.24), where H is included verbatim in T.14 Note, however, that the similarity (e.g., surface or tree edit distance) between H and the entire T of this example is low, because of the different lengths of T and H.

(2.23) T: Charles de Gaulle died in 1970 at the age of eighty. He was thus fifty years old when, as an unknown officer recently promoted to the rank of brigadier general, he made his famous broadcast from London rejecting the capitulation of France to the Nazis after the debacle of May-June 1940.

(2.24) H: Charles de Gaulle died in 1970.

Comparing H to a sliding window of T’s surface string of the same size as H (in our

14Example from the dataset of RTE-3 (Giampiccolo et al., 2007).

example, six consecutive words of T) and keeping the largest similarity score between the sliding window and H may provide a better indication of whether T entails H or not (Malakasiotis, 2009a). At the syntax level, one may compare H’s parse tree against subtrees of T’s tree (Iftene and Balahur-Dobrescu, 2007; Zanzotto et al., 2009); and at the semantics level, one may compare H’s propositional content to parts of T’s propositional content (Hickl, 2008).

In many correct textual entailment pairs, comparing H against a single sliding window of T’s surface string may be inadequate, because H may correspond to several non-continuous parts of T; in (2.25)–(2.26), for example, H corresponds to the three underlined parts of T.15

(2.25) T: The Gaspe, also known as la Gaspesie in French, is a North American peninsula on the south shore of the Saint Lawrence River, in Quebec.

(2.26) H: The Gaspe is a peninsula in Quebec.

Textual entailment recognition methods that operate at the syntax level may still be able to match the parse tree of H against a connected subtree of (in effect a single window on) T’s tree, as illustrated in figure 2.2, which shows the dependency trees of (2.25)–(2.26). This is also a further example of how operating at a higher level than the surface strings may reveal similarities that may be less clear at lower levels. Another example is (2.27)–(2.28); although (2.27) includes verbatim (2.28), it does not textually entail (2.28).16 This is clear when one compares the syntactic representations of the two sentences: Israel is the subject of “was established” in (2.28), but not in (2.27). The difference, however, is not evident at the surface string level, and a sliding window of (2.27) would match exactly (2.28), wrongly suggesting a textual entailment.

(2.27) T: The National Institute for Psychobiology in Israel was established in 1979.

(2.28) H: Israel was established in 1979.

15Modified example from the dataset of the RTE-3 (Giampiccolo et al., 2007).
16Modified example from Haghighi et al. (2005).

[Figure 2.2 shows the dependency tree of sentence (2.25), with the subtree corresponding to sentence (2.26) enclosed in a dashed line.]

Figure 2.2: An example of how dependency trees may make it easier to match a short sentence (subtree inside the dashed line) to a part of a longer one.

The syntactic representations of the input expressions, however, cannot always be computed accurately (e.g., due to parser errors), which may introduce noise; and, possibly because of the noise, methods that operate at the syntax level do not necessarily outperform in practice methods that operate on surface strings (Wan et al., 2006; Burchardt et al., 2007; Burchardt et al., 2009). Multiple similarity measures, possibly computed at different levels (surface strings, syntactic or semantic representations) may be combined by using machine learning (Mitchell, 1997; Alpaydin, 2004), as illustrated in figure 2.3 (Bos and Markert, 2005; Brockett and Dolan, 2005; Zhang and Patrick, 2005; Finch et al., 2005; Wan et al., 2006; Burchardt et al., 2007; Hickl, 2008; Malakasiotis, 2009a; Nielsen et al., 2009).17 Each

17WEKA (Witten and Frank, 2005) provides implementations of several well known machine learning algorithms, including C4.5 (Quinlan, 1993), Naive Bayes (Mitchell, 1997), SVMs (Vapnik, 1998; Cristianini and Shawe-Taylor, 2000; Joachims, 2002), and AdaBoost (Freund and Schapire, 1995; Friedman et al., 2000); consult http://www.cs.waikato.ac.nz/ml/weka/. More efficient implementations of SVMs are also available; see http://svmlight.joachims.org/ and http://www.csie.ntu.edu.tw/∼cjlin/libsvm/. Maximum Entropy classifiers are also very effective; see chapter 6 of Jurafsky and Martin (2008) for an introduction and http://nlp.stanford.edu/ for a commonly used implementation.

[Figure 2.3 shows two pipelines: a training stage, where preprocessed training pairs (P1,i, P2,i) are converted to feature vectors and passed to a learning algorithm that produces a trained classifier, and a classification stage, where a preprocessed unseen pair (P1, P2) is converted to a feature vector and the trained classifier outputs the decision.]

Figure 2.3: Paraphrase and textual entailment recognition via supervised machine learning.

pair of input language expressions ⟨P1, P2⟩ is represented by a feature vector ⟨f1, ..., fm⟩ containing the scores of different similarity measures applied to the pair, and possibly other features. For example, many systems include features that check for polarity differences across the two input expressions, as in “this is not a confirmed case of rabies” vs. “a case of rabies was confirmed”, or modality differences, as in “a case may have been confirmed” vs. “a case has been confirmed” (Haghighi, 2005; Iftene and Balahur-Dobrescu, 2007; Tatu and Moldovan, 2007). A supervised machine learning algorithm trains a classifier on manually classified (as correct or incorrect) vectors corresponding to training input pairs. Once trained, the classifier can classify unseen pairs as correct or incorrect paraphrases or textual entailment pairs by examining their features.
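The following sketch only illustrates the general scheme of figure 2.3, not the classifiers of chapter 3: a few toy similarity scores are packed into feature vectors and an off-the-shelf logistic regression (Maximum Entropy) classifier from scikit-learn is trained on labelled pairs; the feature functions are simplistic placeholders for real measures.

```python
from sklearn.linear_model import LogisticRegression

def features(p1: str, p2: str) -> list[float]:
    """Toy similarity features for a pair of expressions (placeholders for real measures)."""
    t1, t2 = set(p1.lower().split()), set(p2.lower().split())
    overlap = len(t1 & t2) / max(len(t1 | t2), 1)        # word overlap (Jaccard)
    len_ratio = min(len(t1), len(t2)) / max(len(t1), len(t2), 1)
    negation = float(("not" in t1) != ("not" in t2))     # crude polarity-difference flag
    return [overlap, len_ratio, negation]

# Tiny, made-up training set: 1 = positive pair, 0 = negative pair.
train_pairs = [
    ("Leo Tolstoy wrote War and Peace", "War and Peace was written by Leo Tolstoy", 1),
    ("A case of rabies was confirmed", "This is not a confirmed case of rabies", 0),
]
X = [features(p1, p2) for p1, p2, _ in train_pairs]
y = [label for _, _, label in train_pairs]

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([features("Kids like sweets", "Children are fond of sweets")]))
```

A real system would of course use many more training pairs and much richer features, computed after the preprocessing discussed next.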

A preprocessing stage is commonly applied to each input pair of language expressions, before converting it to a feature vector (Zhang and Patrick, 2005). Part of the preprocessing may provide information that is required to compute the features; for example, this is when a part-of-speech (POS) tagger or a parser would be applied.18

18Brill’s (1992) POS tagger is well-known; see http://en.wikipedia.org/wiki/Brill_tagger. Stanford’s tagger (Toutanova et al., 2003) can be downloaded from http://nlp.stanford.edu/. Commonly used parsers include Charniak’s (2000), Collins’s (2003), the Link Grammar Parser (Sleator and Temperley, 1993), MINIPAR, a principle-based parser (Berwick, 1991) very similar to PRINCIPAR (Lin, 1994), MaltParser (Nivre et al., 2007), and Stanford’s parser (Klein and Manning, 2003); consult: http://flake.cs.uiuc.edu/∼cogcomp/srl/CharniakServer.tgz, http://people.csail.mit.edu/mcollins/code.html, http://www.abisource.com/projects/link-grammar/, http://www.cs.ualberta.ca/∼lindek/minipar.htm, http://w3.msi.vxu.se/∼nivre/research/MaltParser.html, and http://nlp.stanford.edu/.

The preprocessing may also normalize the input pairs; for example, a stemmer may be applied; dates may be converted to a consistent format; names of persons, organizations, locations etc. may be tagged by their semantic categories using a named entity recognizer; pronouns or, more generally, referring expressions, may be replaced by the expressions they refer to (Hobbs, 1986; Lappin and Leass, 1994; Mitkov, 2003; Mollá et al., 2003; Yang et al., 2008); and morphosyntactic variations may be normalized (e.g., passive sentences may be converted to active ones).19

Pairs of paraphrasing or textual entailment expressions (or templates) like (2.29), often called rules, that may have been produced by extraction mechanisms (to be discussed in section 2.4) can be used by recognizers much as, and often in addition to, synonyms and hypernyms-hyponyms.

(2.29) “X is fond of Y” ≈ “X likes Y”

Given the paraphrasing rule of (2.29) and the information that “child” is a synonym of “kid” and “candy” a hyponym of “sweet”, a recognizer could figure out that (2.30) textually entails (2.33) by gradually transforming (2.30) to (2.33) as shown below.20

Commonly used parsers include Charniak’s (2000), Collins’ (2003), the Link Grammar Parser (Sleator and Temperley, 1993), MINIPAR, a principle-based parser (Berwick, 1991) very similar to PRINCIPAR (Lin, 1994), MaltParser (Nivre et al., 2007), and Stanford’s parser (Klein and Manning, 2003); consult: http://flake.cs.uiuc.edu/~cogcomp/srl/CharniakServer.tgz, http://people.csail.mit.edu/mcollins/code.html, http://www.abisource.com/projects/link-grammar/, http://www.cs.ualberta.ca/~lindek/minipar.htm, http://w3.msi.vxu.se/~nivre/research/MaltParser.html, and http://nlp.stanford.edu/. 19 Porter’s stemmer (1997) is well-known; consult http://tartarus.org/~martin/PorterStemmer/. An example of a publicly available named-entity recognizer is Stanford’s; see http://nlp.stanford.edu/. 20 Modified example from Bar-Haim et al. (2009).

(2.30) Children are fond of sweets.

(2.31) Kids are fond of sweets.

(2.32) Kids like sweets.

(2.33) Kids like candies.
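The chain of transformations from (2.30) to (2.33) can be mimicked in a few lines of code. The sketch below is purely illustrative: the one-entry synonym and hyponym table stands in for WordNet lookups, the slotted rule (2.29) is applied as a regular-expression rewrite, and agreement is hard-coded for the plural subject of this particular example.

```python
import re

def substitute(sentence, old, new):
    # toy stand-in for a WordNet-based synonym/hyponym substitution
    return re.sub(r"\b%s\b" % re.escape(old), new, sentence, flags=re.IGNORECASE)

def apply_fond_of_rule(sentence):
    # slotted paraphrasing rule (2.29): "X is/are fond of Y" -> "X like(s) Y"
    # (the plural form "like" is hard-coded here for this example)
    return re.sub(r"^(.+?) (?:is|are) fond of (.+)$", r"\1 like \2", sentence)

s = "Children are fond of sweets."          # (2.30)
s = substitute(s, "Children", "Kids")       # (2.31): "kid" is a synonym of "child"
s = apply_fond_of_rule(s)                   # (2.32): rule (2.29)
s = substitute(s, "sweets", "candies")      # (2.33): "candy" is a hyponym of "sweet"
print(s)                                    # Kids like candies.
```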

Another recognition approach, then, is to search for a sequence of rule applications that turns one of the input expressions (or its syntactic or semantic representation) into the other. If such a sequence is found, the two input expressions constitute a positive paraphrasing or textual entailment pair, depending on the rules used; otherwise, the pair is negative. If each rule is associated with a confidence score (possibly learned from a training dataset) that reflects the degree to which the rule preserves the original meaning in paraphrase recognition, or the degree to which we are confident that it produces an entailed expression, we may search for the sequence of transformations with the maximum score (or minimum cost), much as in approaches that compute the minimum (string or tree) edit distance between the two input expressions. The pair of input expressions can then be classified as positive if the score of the maximum-score sequence exceeds a confidence threshold (Harmeling, 2009). One would also have to consider the contexts where rules are applied, because a rule may not be valid in all contexts, for instance because of the different possible senses of the words it involves. A possible solution is to associate each rule with a vector that represents the contexts where it can be used (e.g., a vector of frequently occurring words in training contexts where the rule applies), and use a rule only in contexts that are similar to its associated context vector; with slotted rules, one can also model the types of slot values (e.g., types of named entities) the rule can be used with (Szpektor et al., 2008). Resources like WordNet and extraction methods, however, provide thousands or millions of rules, giving rise to an exponentially large number of transformation sequences to consider.21 When operating at the level of semantic representations, the

sequence sought is in effect a proof that the two input expressions are paraphrases or a valid textual entailment pair, and it may be obtained by exploiting theorem provers. Bar-Haim et al. (2007) discuss how to search for sequences of transformations, seen as proofs at the syntactic level, when the input language expressions and their reformulations are represented by dependency trees. In subsequent work (Bar-Haim et al., 2009), they introduce compact forests, a data structure that allows the dependency trees of multiple intermediate reformulations to be represented by a single graph, to make the search more efficient. They also combine their approach with an SVM-based recognizer; sequences of transformations are used to bring T closer to H, and the SVM recognizer is then employed to judge if the transformed T and H constitute a positive textual entailment pair or not. Experimenting with paraphrase and textual entailment recognizers requires datasets containing both positive and negative input pairs. The most widely known benchmarks along with the best reported results are described in chapter 3.

21 Collections of transformation rules and resources that can be used to obtain such rules are listed at http://www.aclweb.org/aclwiki/index.php?title=RTE_Knowledge_Resources. Mirkin et al. (2009a) discuss how to evaluate collections of textual entailment rules.

2.3 Paraphrase and textual entailment generation

Unlike recognizers, paraphrase or textual entailment generators are given a single lan- guage expression (or template) as input, and they are required to produce as many out- put language expressions (or templates) as possible, such that the output expressions are paraphrases or they constitute, along with the input, correct textual entailment pairs. Most generators assume that the input is a single sentence (or sentence template), and we adopt this assumption in the remainder of this section. In principle, the output may be generated by mapping the input to a representation of its meaning, a process that usually presupposes syntactic analysis, and by passing on


the meaning representation, or new meaning representations that are logically entailed by the original one, to a natural language generation system (Reiter and Dale, 2000; Bateman and Zock, 2003) to produce paraphrases or entailed language expressions. This approach is similar to using language-independent meaning representations (an “interlingua”) in machine translation, but here the meaning representations do not need to be language-independent, since only one language is involved. Mapping, however, the input to a correct meaning representation may still be far from trivial; and the natural language generator may be capable of producing language expressions of only a limited variety, missing possible paraphrases or entailed language expressions. Many paraphrase and textual entailment generation methods borrow ideas from sta-

tistical machine translation (SMT) (Koehn, 2009).22 Let us first introduce some central

ideas from SMT, for the benefit of readers that are unfamiliar with them. SMT methods rely mostly on very large bilingual or multilingual parallel corpora, for example the Canadian Hansards or the proceedings of the European parliament, without construct- ing meaning representations and often, at least until recently, without even constructing syntactic representations.23 Let us assume that we wish to translate a sentence F, whose

words are $f_1, f_2, \ldots, f_{|F|}$ in that order, from a foreign language to our native language. Let us also denote by N any candidate translation, whose words are $a_1, a_2, \ldots, a_{|N|}$. The best translation, denoted $N^*$, is the N with the maximum probability of being a translation of F, i.e.:

$$N^* = \arg\max_N P(N|F) = \arg\max_N \frac{P(N)\,P(F|N)}{P(F)} = \arg\max_N P(N)\,P(F|N) \qquad (2.34)$$

Since F is fixed, the denominator P(F) above is constant and can be ignored when searching for N∗. P(N) is called the language model, and P(F|N) is the translation

22 Consult also chapter 25 of Jurafsky and Martin (2008), and chapter 13 of Manning and Schütze (1999). 23 See http://www.statmt.org/ for commonly used SMT corpora and tools.

model. For modeling purposes, it is common to assume that F was in fact originally written in our native language and it was transmitted to us via a noisy channel, which intro- duced various deformations; i.e., the foreign sentences are treated as deformed native sentences. Let us assume, for example, that our native language is English. We are given the Italian sentence (2.35) to translate, which is the output of the noisy channel, and we need to guess the English sentence (2.38) that was the original input to the chan- nel.24 In this example, the channel first reordered the words “red” and “book” of (2.38), generating (2.37), it then inserted the auxiliary “has” in front of “bought”, producing (2.36), and it then replaced the words of (2.36) with the corresponding words of (2.35).

(2.35) Makis ha comprato un libro rosso.

(2.36) Makis has bought a book red.

(2.37) Makis bought a book red.

(2.38) Makis bought a red book.

The possible deformations of the channel may also include removing a word, replacing

it with multiple words etc. The commonly used IBM models 1 to 5 (Brown et al., 1993) provide an increasingly richer inventory of word deformations. More recent phrase-based SMT systems (Koehn et al., 2003) also allow directly replacing entire phrases with other phrases. The foreign sentence F can thus be seen as the result of applying a sequence of transformations $D = d_1, d_2, \ldots, d_{|D|}$ to N:

$$F = d_{|D|}(\ldots(d_2(d_1(N)))) \qquad (2.39)$$

24 The verb form in the English sentence could also be “has bought”, since in Italian the present perfect is also used as a simple past, but for the purposes of the example we assume that the desired English sentence is in the simple past.

There may be many different Ds that turn N into F. Hence, equation (2.34) can be written as (2.40), where D ranges over all possible sequences of deformations, and P(F,D|N) is the probability of obtaining F from N via D.

$$N^* = \arg\max_N P(N)\,P(F|N) = \arg\max_N P(N) \sum_D P(F,D|N) \qquad (2.40)$$

As a simplification, it is common to use (2.41) instead of (2.40). The right-hand sides of (2.40) and (2.41) are equal if there is at most one sequence of deformations from N to F.

$$N^* = \arg\max_N P(N) \max_D P(F,D|N) \qquad (2.41)$$

The process of finding the N and D that maximize (2.41) is called decoding. An ex- haustive search is in most cases intractable. Hence, heuristic search algorithms (e.g., based on beam search) are usually employed (Germann et al., 2001; Koehn, 2004).25

Assuming for simplicity that the individual deformations $d_i(\cdot)$ of D are mutually independent, P(F,D|N) can be computed as the product of the probabilities of D’s individual deformations, provided that D leads from N to F; otherwise P(F,D|N) = 0. Given a bilingual parallel corpus with words aligned across languages, we can estimate the probabilities of all possible deformations $d_i(\cdot)$, for example the probability of moving “book”, or more generally an adjective, one word to the right, the probability of adding “has” in front of “bought” etc. In practice, however, parallel corpora do not indicate word alignment. Hence, it is common to find the most probable word alignment of the corpus given initial estimates of individual deformation probabilities, then re-estimate the deformation probabilities given the resulting alignment, and iterate

25 For a frequently used SMT system that includes decoding facilities, see http://www.statmt.org/moses/.

(Brown et al., 1993; Och and Ney, 2003).26 The translation model P(F,D|N) estimates the probability of obtaining F from N via D; we are interested in Ns with high probabilities of being transformed to F. We also want, however, N to be grammatical, and we use the language model P(N) to check for grammaticality. P(N) is the probability of encountering N in texts of our native language; it is estimated from a large monolingual corpus of our language, typically using an n-gram approach, i.e., assuming that the probability of encountering word $a_i$ depends only on the preceding n − 1 words. For n = 3, assuming again that $N = a_1, a_2, \ldots, a_{|N|}$, P(N) becomes:

$$P(N) = P(a_1) \cdot P(a_2|a_1) \cdot P(a_3|a_1,a_2) \cdot P(a_4|a_2,a_3) \cdots P(a_{|N|}\,|\,a_{|N|-2},a_{|N|-1}) \qquad (2.42)$$

A language model typically also includes smoothing mechanisms, to cope with n-grams that are very rare or not present in the monolingual corpus, which would lead to P(N) = 0.27
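A minimal illustration of equation (2.42): the sketch below estimates trigram probabilities from a tiny invented corpus with add-one (Laplace) smoothing, one of the simplest smoothing schemes. Real systems train on far larger corpora and use more sophisticated smoothing (e.g., Kneser-Ney) via toolkits such as SRILM.

```python
import math
from collections import Counter

corpus = ["makis bought a red book",
          "makis bought a new car",
          "he bought a red pen"]

trigram_counts = Counter()
history_counts = Counter()
vocab = {"</s>"}

for sentence in corpus:
    words = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    vocab.update(sentence.split())
    for i in range(2, len(words)):
        trigram_counts[tuple(words[i - 2:i + 1])] += 1
        history_counts[tuple(words[i - 2:i])] += 1

def p(word, history):
    # add-one smoothed trigram probability P(word | history)
    return ((trigram_counts[history + (word,)] + 1) /
            (history_counts[history] + len(vocab)))

def log_prob(sentence):
    words = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    return sum(math.log(p(words[i], tuple(words[i - 2:i])))
               for i in range(2, len(words)))

print(log_prob("makis bought a red book"))   # seen word order: higher log-probability
print(log_prob("book red a bought makis"))   # scrambled: much lower log-probability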

In principle, an SMT system could be used to generate paraphrases, if it could be trained on a sufficiently large monolingual corpus of parallel texts. Both N and F are now sentences of the same language, but N has to be different from the given F, and it has to convey the same (or almost the same) information. The main problem is that there are no readily available monolingual parallel corpora of the sizes that are used in SMT, on which to train the translation model. One possibility is to use multiple translations of the same source texts; for example, different English translations of novels originally written in other languages (Barzilay and McKeown, 2001), or multiple English

26 GIZA++ is often used to train IBM models and align words; see http://www.fjoch.com/GIZA++.html. 27 See chapter 4 of Jurafsky & Martin (2008) and chapter 6 of Manning & Schütze (1999) for discussion on language models. SRILM (Stolcke, 2002) is commonly used; see http://www.speech.sri.com/projects/srilm/.

translations of Chinese news articles, as in the Multiple-Translation Chinese Corpus.28 Corpora of this kind, however, are still orders of magnitude smaller than those used in

SMT. To bypass the lack of large monolingual parallel corpora, Quirk et al. (2004) use clusters of news articles referring to the same event. The articles of each cluster do not always report the same information and, hence, they are not parallel texts. Since they talk about the same event, however, they often contain phrases, sentences, or even longer fragments with very similar meanings; corpora of this kind are often called com- parable. From each cluster, Quirk et al. select pairs of similar sentences (e.g., with small word edit distance, but not identical sentences).29 The sentence pairs are then word aligned as in machine translation, and the resulting alignments are used to create

a table of phrase pairs as in phrase-based SMT systems (Koehn et al., 2003). A phrase pair ⟨P1, P2⟩ consists of contiguous words (taken to be a phrase, though not necessarily a syntactic constituent) P1 of one sentence that are aligned to different contiguous words

P2 of another sentence. Quirk et al. provide the following examples of discovered pairs.

P1                    P2
injured               wounded
Bush administration   White House
margin of error       error margin
...                   ...
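To make the use of such a monolingual phrase table concrete (its scoring by the decoder is described in the following paragraphs), here is a deliberately naive, hypothetical sketch: candidate paraphrases are produced by applying any subset of applicable phrase replacements, every unreplaced applicable phrase counts as an identity deformation, and candidates are ranked by the product of the corresponding probabilities. The table entries are taken from the examples above, but the probabilities are invented; a real system would also factor in a language model and use a proper decoder rather than exhaustive substitution.

```python
from itertools import combinations

# toy monolingual phrase table: (phrase, replacement, probability); probabilities are made up
PHRASE_TABLE = [
    ("injured", "wounded", 0.7),
    ("bush administration", "white house", 0.6),
    ("margin of error", "error margin", 0.8),
]
P_IDENTITY = 0.9   # probability of leaving an applicable phrase unchanged

def candidates(sentence):
    """Yield (paraphrase, score) pairs obtained by applying any subset of the
    applicable phrase replacements (assumed not to overlap in this toy example)."""
    text = sentence.lower()
    applicable = [(p, r, pr) for p, r, pr in PHRASE_TABLE if p in text]
    for k in range(len(applicable) + 1):
        for subset in combinations(applicable, k):
            out, score = text, 1.0
            for phrase, repl, prob in subset:
                out = out.replace(phrase, repl)
                score *= prob
            # unreplaced applicable phrases count as identity deformations
            score *= P_IDENTITY ** (len(applicable) - k)
            yield out, score

sentence = "The Bush administration said the margin of error was small."
for paraphrase, score in sorted(candidates(sentence), key=lambda x: -x[1]):
    print(round(score, 3), paraphrase)
```

Note how a high identity probability makes the unchanged sentence the top candidate, mirroring the point made below that favouring the identity deformation yields more conservative paraphrases.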

Phrase pairs that occur frequently in the aligned sentences may be assigned higher

28 See http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2002T01. 29 Wubben et al. (2009) discuss similar methods to pair news titles. Barzilay & Elhadad (2003) and Nelken & Shieber (2006) discuss more general methods to align sentences of monolingual comparable corpora. Sentence alignment methods for bilingual parallel or comparable corpora are discussed, for example, by Gale and Church (1993), Melamed (1999), Fung and Cheung (2004), Munteanu and Marcu (2006); see also Wu (2000). Sentence alignment methods for parallel corpora may perform poorly on comparable corpora (Nelken and Shieber, 2006).

probabilities; Quirk et al. use probabilities returned by IBM model 1. Their decoder first constructs a lattice that represents all the possible paraphrases of the input sentence that can be produced by replacing phrases by their counterparts in the phrase table; i.e., the

possible deformations $d_i(\cdot)$ are the phrase replacements licensed by the phrase table.30

is why Quirk et al. also allow a degenerate identity deformation $d_{id}(\xi) = \xi$; assigning a high probability to the identity deformation leads to more conservative paraphrases,

with fewer phrase replacements. The decoder uses the probabilities of di(·) to com- pute P(F,D|N) in equation (2.41), and the language model to compute P(N). The best scored N∗ is returned as a paraphrase of F; the n most highly scored Ns could also be returned. More generally, the table of phrase pairs may also include synonyms obtained from WordNet or similar resources, or pairs of paraphrases (or templates) discovered by paraphrase extraction methods; in effect, Quirk et al.’s construction of a monolingual phrase table is a paraphrase extraction method. A language model may also be applied locally to the replacement words of a deformation and their context to assess whether or not the new words fit the original context (Mirkin et al., 2009b). Paraphrases can also be generated by using pairs of machine translation systems to translate the input expression to a new language, often called a pivot language, and then back to the original language. The resulting expression is often different from the input one, especially when the two translation systems employ different methods. By using different pairs of machine translation systems or different pivot languages, multiple paraphrases may be obtained. Duboue and Chu-Carroll (2006) demonstrated the benefit of using this approach to paraphrase questions, with an additional machine learning classifier to filter the generated paraphrases. An advantage of this approach is that the machine translation systems can be treated as black boxes, and they can be

30 Chevelu et al. (2009) discuss how other decoders could be developed especially for paraphrase generation.

trained on readily available parallel corpora of different languages. A disadvantage is that translation errors from both directions may lead to poor paraphrases. We return to pivot languages in the following section. When the input and output expressions are slotted templates, it is possible to apply bootstrapping to a large monolingual corpus (e.g., the entire Web), instead of using machine translation methods. Let us assume, for example, that we wish to generate paraphrases of (2.43), and that we are given a few pairs of seed values of X and Y, as in (2.44) and (2.45).

(2.43) X is the author of Y.

(2.44) ⟨X = “Jack Kerouac”, Y = “On the Road”⟩

(2.45) ⟨X = “Jules Verne”, Y = “The Mysterious Island”⟩

We can retrieve from the corpus sentences that contain any of the seed pairs:

(2.46) Jack Kerouac wrote “On the Road”.

(2.47) “The Mysterious Island” was written by Jules Verne.

(2.48) Jack Kerouac is most known for his novel “On the Road”.

By replacing the known seeds with the corresponding slot names, we obtain new templates:

(2.49) X wrote Y.

(2.50) Y was written by X.

(2.51) X is most known for his novel Y.

In our example, (2.49) and (2.50) are paraphrases of (2.43); however, (2.51) entails (2.43), but is not a paraphrase of (2.43), because an author does not necessarily write novels. If we want to generate paraphrases, we must keep (2.49) and (2.50) only; if we want to generate templates that entail (2.43), we must keep (2.51) too. Some of the generated candidate templates may neither be paraphrases of, nor entail (or be entailed

by) the original template. A good paraphrase or textual entailment recognizer (section 2.2) or a human in the loop would be able to filter out bad candidate templates; see also Duclaye et al. (2003), where Expectation Maximization (Mitchell, 1997) is used to filter the candidate templates. Simpler filtering techniques may also be used. For example, Ravichandran et al. (2002; 2003) assign to each candidate template a pseudo-precision score; roughly speaking, the score is computed as the number of retrieved sentences that match the candidate template with X and Y having the values of any seed pair, divided by the number of retrieved sentences that match the template when X has a seed value and Y any value, not necessarily the corresponding seed value. Having obtained new templates, we can search the corpus for new sentences that match them; for example, sentence (2.52) matches the generated template (2.50). From the new sentences, more seed values can be obtained, if the slot values correspond to types of expressions (e.g., person names) that can be recognized reasonably well, for example by using a named entity recognizer or a gazetteer (e.g., a large list of book titles); from (2.52) we would obtain the new seed pair (2.53). More iterations may be used to generate more templates and more seeds, until no more templates and seeds can be discovered or a maximum number of iterations is reached. Figure 2.4 illustrates how a bootstrapping paraphrase generator works.

(2.52) Frankenstein was written by Mary Shelley.

(2.53) ⟨X = “Mary Shelley”, Y = “Frankenstein”⟩
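The bootstrapping loop just described can be rendered as a short, hypothetical sketch in which a tiny in-memory list of sentences plays the role of the Web search engine; templates are treated as surface strings with X and Y slots, and the crude regular-expression matching below stands in for the named entity recognizers or gazetteers mentioned above.

```python
import re

corpus = [
    'Jack Kerouac wrote "On the Road".',
    '"The Mysterious Island" was written by Jules Verne.',
    'Jack Kerouac is most known for his novel "On the Road".',
    'Frankenstein was written by Mary Shelley.',
]

seeds = {("Jack Kerouac", "On the Road"), ("Jules Verne", "The Mysterious Island")}
templates = set()

def clean(s):
    return s.replace('"', "").rstrip(".")

def template_to_regex(template):
    # "Y was written by X" -> regex with a named group per slot
    pattern = re.escape(template).replace("X", r"(?P<X>.+)").replace("Y", r"(?P<Y>.+)")
    return re.compile("^" + pattern + "$")

for _ in range(3):                               # a few bootstrapping iterations
    grew = False
    # 1) sentences containing a known seed pair yield new templates
    for sentence in map(clean, corpus):
        for x, y in list(seeds):
            if x in sentence and y in sentence:
                t = sentence.replace(x, "X").replace(y, "Y")
                grew |= t not in templates
                templates.add(t)
    # 2) sentences matching a known template yield new seed pairs
    for sentence in map(clean, corpus):
        for t in list(templates):
            m = template_to_regex(t).match(sentence)
            if m:
                pair = (m.group("X"), m.group("Y"))
                grew |= pair not in seeds
                seeds.add(pair)
    if not grew:
        break

print(sorted(templates))   # e.g. 'X wrote Y', 'Y was written by X', ...
print(sorted(seeds))       # now also includes ('Mary Shelley', 'Frankenstein')
```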

Similar bootstrapping methods have been used to generate information extraction patterns (Riloff and Jones, 1999; Xu et al., 2007). Some of these methods, however, require corpora annotated with instances of particular types of events to be extracted by the patterns (Huffman, 1995; Riloff, 1996b; Soderland et al., 1995; Soderland, 1999; Muslea, 1999; Califf and Mooney, 2003), or texts that mention the target events and near-miss texts that do not (Riloff, 1996a). If slot values can be recognized well, we can also obtain the initial seeds automatically by directly retrieving sentences that match

the original templates and identifying the slot values in the retrieved sentences. Seed values per semantic relation can also be obtained from publicly available databases, as discussed by Mintz et al. (2009).

Figure 2.4: Generating paraphrases of “X wrote Y” by bootstrapping.

To the best of our knowledge, there are no widely adopted benchmark datasets for paraphrase and textual entailment generation, unlike recognition. A possible explanation is that numerous paraphrases can be generated from an input language expression (or template), and there are even more language expressions that entail or are entailed by an input expression. Hence, although it is possible to assemble a large collection of input language expressions, it is very difficult to specify in advance all the possible admissible outputs that a generator might produce, making automatic evaluation difficult. In principle, one could use a paraphrase or textual entailment recognizer to judge automatically if the output of a generator is a paraphrase of, or forms a correct entailment pair with, the corresponding input expression. Current recognizers, however, are not accurate enough, and automatic evaluation measures from machine translation (section 2.1) cannot be employed, exactly because their weakness is that they cannot detect paraphrases and textual entailment. An alternative, more costly solution is to use

human judges, which also allows evaluating other aspects of the outputs, such as their fluency (Zhao et al., 2009a), as in machine translation.

In most applications, for example when rephrasing inputs to a QA system, it is de- sirable for generators not only to produce correct outputs, but also to produce as many correct outputs as possible. Hence, apart from their precision, i.e., the number of cor- rect outputs over the number of generated outputs, one should ideally also evaluate their recall, the number of correct outputs over all possible correct outputs. The recall of generators, however, cannot be computed, because the number of all possible correct outputs is unknown. Instead, we can measure the number of correct outputs at different precision levels, since there is usually a tradeoff between precision and recall.

2.4 Paraphrase and textual entailment extraction

Unlike recognition and generation methods, extraction methods are not given partic- ular input language expressions. They typically process large corpora to extract pairs of language expressions (or templates) that constitute paraphrases or textual entailment pairs. The generated pairs are stored to be used subsequently by recognizers and gener-

ators or other applications (e.g., as additional entries of phrase tables in SMT systems). Most extraction methods produce pairs of sentences (or sentence templates) or pairs of shorter expressions. Methods to discover synonyms, hypernym-hyponym pairs or, more generally, entailment relations between words (Lin, 1998a; Hearst, 1998; Moore, 2001; Glickman and Dagan, 2004; Brockett and Dolan, 2005; Hashimoto et al., 2009; Herbelot, 2009) can be seen as performing paraphrase or textual entailment extraction restricted to pairs of single words. A possible paraphrase extraction approach is to store all the word n-grams that occur in a large monolingual corpus (e.g., for n ≤ 5), along with their left and right contexts, and consider as paraphrases different n-grams that occur frequently in similar contexts.

For example, each n-gram can be represented by a vector containing the words (and possibly multi-word terms like named entities) that precede or follow it, with the words possibly scored by pointwise mutual information (Manning and Schuetze, 1999) to indicate how strongly they co-occur with the n-gram. Vector similarity measures (e.g., cosine similarity) can then be employed to identify n-grams that occur in similar contexts. This approach has been shown to be viable with very large monolingual corpora; Pasca and Dienes (2005) used a Web snapshot of approximately a billion Web pages;

Bhagat and Ravichandran (2008) used 150 GB of news articles and reported that results deteriorate rapidly with smaller corpora. Even if only lightweight linguistic process- ing (e.g., POS tagging, without ) is performed, processing such large datasets requires very significant processing power, although linear computational complexity is possible with appropriate hashing of the context vectors (Bhagat and Ravichandran, 2008). Paraphrasing approaches of this kind are based on Harris’s Distributional Hy- pothesis (1964), which states that words in similar contexts tend to have similar mean- ings.
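A toy illustration of this distributional approach (hypothetical code, applied to a corpus many orders of magnitude smaller than the method actually requires): each phrase is represented by a positive-PMI-weighted vector of the words seen in a one-word window around it, and phrases are compared by cosine similarity.

```python
import math
from collections import Counter

corpus = """the storm injured five people in the capital .
the storm wounded five people in the capital .
rebels injured three soldiers near the border .
rebels wounded three soldiers near the border .
the committee approved the budget on friday ."""

phrases = ["injured", "wounded", "approved"]

context_counts = {p: Counter() for p in phrases}
word_counts = Counter()
total = 0

for line in corpus.splitlines():
    words = line.split()
    word_counts.update(words)
    total += len(words)
    for i, w in enumerate(words):
        if w in context_counts:
            for j in (i - 1, i + 1):            # one-word context window
                if 0 <= j < len(words):
                    context_counts[w][words[j]] += 1

def pmi_vector(phrase):
    vec = {}
    n = sum(context_counts[phrase].values())
    for word, c in context_counts[phrase].items():
        p_joint = c / n
        p_word = word_counts[word] / total
        vec[word] = max(0.0, math.log(p_joint / p_word))   # positive PMI weight
    return vec

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = lambda x: math.sqrt(sum(val * val for val in x.values()))
    return dot / (norm(u) * norm(v)) if u and v else 0.0

v = {p: pmi_vector(p) for p in phrases}
print(round(cosine(v["injured"], v["wounded"]), 2))    # high: similar contexts
print(round(cosine(v["injured"], v["approved"]), 2))   # low: different contexts
```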

Lin and Pantel’s (2001) well-known extraction method, called DIRT, is also based on the Distributional Hypothesis, but it operates at the syntax level. DIRT first applies a dependency grammar parser to a monolingual corpus. Parsing the corpus is generally time-consuming and, hence, smaller corpora have to be used, compared to methods that do not require parsing; Lin and Pantel used 1 GB of news texts in their experiments. Dependency paths are then extracted from the dependency trees of the corpus. Let us consider, for example, sentences (2.54) and (2.57). Their dependency trees are shown in figure 2.5; the similarity between the two sentences is less obvious than in figure 2.1, because of the different verbs that are now involved. Two of the dependency paths that can be extracted from the trees of figure 2.5 are shown in (2.55) and (2.58). The labels of the edges are augmented by the POS-tags of the words they connect (e.g., N:subj:V

Figure 2.5: Dependency trees of sentences (2.54) and (2.57).

instead of simply subj).31 The first and last words of the extracted paths are replaced

by slots, shown as boxed and numbered POS-tags. Roughly speaking, the paths of (2.55) and (2.58) correspond to the surface templates of (2.56) and (2.59), respectively, but the paths are actually templates specified at the syntactic level.

(2.54) A mathematician found a solution to the problem.

(2.55) N1 :subj:V ← found → V:obj:N → solution → N:to: N2

(2.56) N1 found [a] solution to N2

(2.57) The problem was solved by a young mathematician.

(2.58) N3 :obj:V ← solved → V:by: N4

(2.59) N3 was solved by N4.

DIRT imposes restrictions on the paths that can be extracted from the dependency trees; for example, they have to start and end with noun slots. Once the paths have been extracted, it looks for pairs of paths that occur frequently with the same slot fillers.

If (2.55) and (2.58) occur frequently with the same fillers (e.g., N1 = N4 = “mathematician”, N2 = N3 = “problem”), they will be included as a pair in DIRT’s output

matician”, N2 = N3 = “problem”), they will be included as a pair in DIRT’s output

(with N1 = N4 and N2 = N3). A measure based on mutual information (Manning and Schuetze, 1999; Lin and Pantel, 2001) is used to detect paths with common fillers.
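Roughly in the spirit of DIRT, though with a much cruder score than Lin and Pantel's actual mutual-information-based measure, the hypothetical sketch below compares two slotted templates by the overlap of the fillers observed in their corresponding slots; the filler sets are invented for illustration.

```python
# observed slot fillers per template, e.g. harvested from a parsed corpus (invented here)
fillers = {
    "X finds a solution to Y": {"X": {"mathematician", "committee", "team"},
                                "Y": {"problem", "crisis", "puzzle"}},
    "Y is solved by X":        {"X": {"mathematician", "team", "student"},
                                "Y": {"problem", "puzzle", "equation"}},
    "X worsens Y":             {"X": {"crisis", "policy", "drought"},
                                "Y": {"problem", "situation", "shortage"}},
}

def jaccard(a, b):
    return len(a & b) / len(a | b)

def template_similarity(t1, t2):
    # geometric mean of per-slot filler overlaps (a crude stand-in for DIRT's score)
    sim_x = jaccard(fillers[t1]["X"], fillers[t2]["X"])
    sim_y = jaccard(fillers[t1]["Y"], fillers[t2]["Y"])
    return (sim_x * sim_y) ** 0.5

print(round(template_similarity("X finds a solution to Y", "Y is solved by X"), 2))  # higher
print(round(template_similarity("X finds a solution to Y", "X worsens Y"), 2))       # lower
```

As discussed below, scores of this distributional kind can still rank semantically opposed templates highly when their slots happen to attract the same fillers in a real corpus.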

31 For consistency with previous examples, we show slightly different labels than those used by Lin and Pantel.

Lin and Pantel call the pairs of templates that DIRT produces “inference rules”, but there is no directionality between the templates of each pair; the intention seems to be to produce pairs of near paraphrases. The resulting pairs are actually often textual entailment pairs, not paraphrases, and the directionality of the entailment is unspeci-

fied.32 Bhagat et al. (2007) developed a method, called LEDIR, to classify the template

pairs ⟨P1, P2⟩ that DIRT and similar methods produce into three classes: paraphrases, P1

textually entails P2 and not the reverse, or P2 textually entails P1 and not the reverse.

Roughly speaking, LEDIR examines the semantic categories (e.g., person, location etc.)

of the words that fill P1 and P2’s slots in the corpus; the categories can be obtained by following WordNet’s hypernym-hyponym hierarchies from the filler words up to a certain level, or by applying clustering algorithms to the words of the corpus and using

the clusters of the filler words as their categories.33 If P1 occurs with fillers from a

substantially larger number of categories than P2, then LEDIR assumes P1 has a more

general meaning than P2 and, hence, P2 textually entails P1; similarly for the reverse

direction. If there is no substantial difference in the number of categories, P1 and P2 are

taken to be paraphrases. Szpektor and Dagan (2008) describe a method similar to DIRT that produces textual entailment pairs of unary (single slot) templates (e.g., “X takes a nap” ⇒ “X sleeps”) using a directional similarity measure for unary templates. Extraction methods based on the Distributional Hypothesis often produce pairs of templates that are not correct paraphrasing or textual entailment pairs, although they share many common fillers. In fact, pairs involving antonyms are frequent; according to Lin and Pantel (2001), DIRT finds “X solves Y” to be very similar to “X worsens

Y”; and the same problem has been reported in experiments with LEDIR (Bhagat et al., 2007) and distributional approaches that operate at the surface level (Bhagat and Ravichandran, 2008).

32 Template pairs produced by DIRT can be retrieved from http://demo.patrickpantel.com/. 33 For an introduction to clustering methods, consult chapter 14 of Manning and Schütze (1999).

Ibrahim et al.’s (2003) method is similar to DIRT, but it assumes that a monolingual parallel corpus is available (e.g., multiple English translations of novels), whereas DIRT does not require parallel corpora. Ibrahim et al.’s method extracts pairs of dependency paths only from aligned sentences that share matching anchors. Anchors are allowed to be only nouns or pronouns, and they match if they are identical, if they are a noun and a compatible pronoun, if they are of the same semantic category etc. In (2.60)–(2.61), square brackets and subscripts indicate matching anchors.34 The pair of templates of (2.62)–(2.63) would be extracted from (2.60)–(2.61); for simplicity, we show sentences and templates as surface strings, although the method operates on dependency trees and paths. Matching anchors become matched slots. Heuristic functions are used to score the anchor matches (e.g., identical anchors are preferred to matching nouns and pronouns) and the resulting template pairs; roughly speaking, frequently rediscovered template pairs are rewarded, especially when they occur with many different anchors.

(2.60) The [clerk]1 liked [Bovary]2.

(2.61) [He]1 was fond of [Bovary]2.

(2.62) X liked Y.

(2.63) X was fond of Y.
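The core step of turning matched anchors into a template pair can be sketched in a few lines; note that this hypothetical version operates on surface strings, whereas the actual method works on dependency trees and paths.

```python
def extract_template_pair(sent1, sent2, anchor_pairs):
    """Replace matched anchors with slot names to obtain a template pair.
    anchor_pairs: list of (anchor_in_sent1, anchor_in_sent2) tuples."""
    slots = "XYZW"
    t1, t2 = sent1, sent2
    for (a1, a2), slot in zip(anchor_pairs, slots):
        t1 = t1.replace(a1, slot)
        t2 = t2.replace(a2, slot)
    return t1, t2

pair = extract_template_pair(
    "The clerk liked Bovary.",
    "He was fond of Bovary.",
    [("The clerk", "He"), ("Bovary", "Bovary")],
)
print(pair)   # ('X liked Y.', 'X was fond of Y.')
```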

By operating on aligned sentences of monolingual parallel corpora, Ibrahim et al.’s method may avoid, to some extent, producing pairs of unrelated templates that simply happen to share common slot fillers. Large monolingual parallel corpora, however, are more difficult to obtain than non-parallel corpora, as already discussed. An alternative is to identify anchors in related sentences from comparable corpora (section 2.3), which are easier to obtain. Shinyama and Sekine (2003) find pairs of sentences that share the same anchors within clusters of news articles that report the same event. In their method, anchors are named entities (e.g., person names) identified using a named entity recognizer, or pronouns and noun phrases that refer to named entities; heuristics are

34 Simplified example from Ibrahim et al. (2003).

employed to identify likely referents. Dependency trees are then constructed from each pair of sentences, and pairs of dependency paths are extracted from the trees by treating the anchors as slots.

Figure 2.6: Word lattices obtained from sentence clusters in Barzilay and Lee’s method.

Barzilay and Lee (2003) used two corpora of the same genre, but from different sources (news articles from two press agencies). They call the two corpora comparable, but they use the term with a slightly different meaning than in previously discussed methods; the sentences of each corpus were clustered separately, and each cluster was intended to contain sentences (from a single corpus) referring to events of the same type (e.g., bomb attacks), not sentences (or documents) referring to the same events (e.g., the same particular bombing). From each cluster of sentences, a word lattice was produced using Multiple Sequence Alignment (Durbin et al., 1998; Barzilay and Lee, 2002). The solid lines of figure 2.6 illustrate two possible resulting lattices, from two different clusters; we omit stop-words. Each sentence of a cluster corresponds to a path in the cluster’s lattice. In each lattice, nodes that are shared by a high percentage (50% in Barzilay and Lee’s experiments) of the cluster’s sentences are considered backbone nodes. Parts of the lattice that connect otherwise consecutive backbone nodes are replaced by slots, as illustrated in figure 2.6. The two lattices of our example correspond to the surface templates (2.64)–(2.65).

(2.64) X bombed Y.

(2.65) Y was bombed by X.

Figure 2.7: Merging parse trees of aligned sentences in Pang et al.’s method.

The encountered fillers of each slot are also recorded. If two slotted lattices (templates) from different corpora share many fillers, they are taken to be a pair of paraphrases (figure 2.6). Pang et al.’s method (2003) produces finite state automata very similar to Barzilay and Lee’s lattices, but it requires a parallel monolingual corpus; Pang et al. used the Multiple-Translation Chinese Corpus (section 2.3) in their experiments. The parse trees of aligned sentences are constructed and then merged as illustrated in figure 2.7; vertical lines inside the nodes indicate sequences of necessary constituents, whereas horizontal lines correspond to disjunctions.35 In the example of figure 2.7, both sentences consist of a noun phrase (NP) followed by a verb phrase (VP); this is reflected in the root node of the merged tree. In both sentences, the noun phrase is a cardinal number (CD) followed by a noun (NN); however, the particular cardinal numbers and nouns are different across the two sentences, leading to leaf nodes with disjunctions. The rest of the merged tree is constructed similarly; consult Pang et al. for further details. Presumably one could also generalize over cardinal numbers, types of named entities etc. Each merged tree is then converted to a finite state automaton by traversing the tree in a depth-first manner and introducing a ramification when a node with a disjunction

35 Example from Pang et al. (2003).

is encountered. Figure 2.8 shows the automaton that corresponds to the merged tree of figure 2.7. All the language expressions that can be produced by the automaton (all the paths from the start to the end node) are paraphrases.

Figure 2.8: Finite state automaton produced by Pang et al.’s method.

Hence, unlike other extraction methods, Pang et al.’s method produces automata, rather than pairs of templates, but the automata can be used in a similar manner. In recognition, for example, if two strings are accepted by the same automaton, they are paraphrases; and in generation, we could look for an automaton that accepts the input expression, and then output other expressions that can be generated by the same automaton. Bootstrapping approaches can also be used in extraction, as in generation (section 2.3), but with the additional complication that there is no particular input pattern, nor seed values of its slots, to start from. To address this complication, Szpektor et al. (2004) start with a lexicon of terms of a knowledge domain, for example names of diseases, symptoms etc. in the case of a medical domain; to some extent, such lexicons can be constructed automatically from a domain-specific corpus (e.g., medical articles) via term acquisition techniques (Jacquemin and Bourigault, 2003). Szpektor et al. then extract from a (non-parallel) monolingual corpus pairs of textual entailment templates that can be used with the lexicon’s terms as slot fillers. Roughly speaking, their method, called TEASE, first identifies noun phrases that cooccur frequently with each term of the lexicon, excluding very common noun phrases. It then uses the terms and their cooccurring noun phrases as seeds to obtain templates, and then the new templates to obtain more seeds, much as in figure 2.4. In TEASE, however, the templates are slotted dependency paths, and the method includes a stage that merges compatible templates to

form more general ones.36 Barzilay and McKeown (2001) also used a bootstrapping method, but to extract paraphrases from a parallel monolingual corpus; they used multiple English transla- tions of novels. Unlike previously discussed bootstrapping approaches, their method involves two classifiers (in effect, two sets of rules). One classifier examines the words the candidate paraphrases consist of, and a second one examines their contexts. The two classifiers use different feature sets (different views of the data), and the output of each classifier is used to improve the performance of the other one in an iterative

manner; this is a case of co-training (Blum and Mitchell, 1998). More specifically, a POS tagger, a shallow parser, and a stemmer are first applied to the corpus, and the sentences are aligned across the different translations. Words that occur in both sentences of an aligned pair are treated as seed positive lexical examples; all the other pairs of words from the two sentences become seed negative lexical examples. From the aligned sentences (2.66)–(2.67), we obtain three seed positive lexical examples, shown in (2.68)–(2.70), and many more seed negative lexical examples, two of which are shown in (2.71)–(2.72).37 Although seed positive lexical examples are pairs of identical words, as the algorithm iterates new positive lexical examples are produced, and some of them may be synonyms (e.g., “comfort” and “console”) or pairs of longer paraphrases, as will be explained below.

(2.66) He tried to comfort her.

(2.67) He tried to console Mary.

(2.68) ⟨expression1 = “he”, expression2 = “he”, +⟩

(2.69) ⟨expression1 = “tried”, expression2 = “tried”, +⟩

(2.70) ⟨expression1 = “to”, expression2 = “to”, +⟩

(2.71) ⟨expression1 = “he”, expression2 = “tried”, −⟩

36 See http://www.cs.biu.ac.il/~szpekti/TEASE_collection.zip for resulting templates. 37 Simplified example from Barzilay and McKeown (2001).

(2.72) ⟨expression1 = “he”, expression2 = “to”, −⟩

The contexts of the positive (similarly, negative) lexical examples in the corresponding sentences are then used to construct positive (or negative) context rules, i.e., rules that can be used to obtain new pairs of positive (or negative) lexical examples. Barzilay and McKeown use the POS tags of the l words before and after the lexical examples as contexts, and in their experiments set l = 3. For simplicity, however, let us assume that l = 2; then, for instance, from (2.66)–(2.67) and the positive lexical example of (2.69), we obtain the positive context rule of (2.73). The rule says that if two aligned sentences contain two sequences of words, say λ1 and λ2, one from each sentence, and

both λ1 and λ2 are preceded by the same pronoun, and both are followed by “to” and a (possibly different) verb, then λ1 and λ2 are positive lexical examples. Identical sub- scripts in the POS tags denote identical words; for example, (2.73) requires both λ1 and

λ2 to be preceded by the same pronoun, but the verbs that follow them may be different.

(2.73) ⟨left1 = (pronoun1), right1 = (to1, verb), left2 = (pronoun1), right2 = (to1, verb), +⟩

In each iteration, only the k strongest positive and negative context rules are retained. The strength of each context rule is its precision, i.e., for positive context rules, the number of positive lexical examples whose contexts are matched by the rule divided by the number of both positive and negative lexical examples matched, and similarly for negative context rules. Barzilay and McKeown used k = 10, and they also discarded context rules whose strength was below 95%. The resulting (positive and negative) context rules are then used to identify new (positive and negative) lexical examples. From the aligned (2.74)–(2.75), the rule of (2.73) would figure out that “tried” is a synonym of “attempted”; the two words would be treated as a new positive lexical example, shown in (2.76).

(2.74) She tried to run away.

(2.75) She attempted to escape.

(2.76) ⟨expression1 = “tried”, expression2 = “attempted”, +⟩
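The rule-strength filter described above can be written down in a few lines; the match counts below are invented, and the real method computes them over POS-tag contexts rather than the toy bookkeeping shown here.

```python
def rule_strength(matched_positive, matched_negative):
    # precision of a positive context rule over the lexical examples it matches
    matched = matched_positive + matched_negative
    return matched_positive / matched if matched else 0.0

# toy match counts for three candidate positive context rules
rules = {
    "rule_A": (38, 1),    # (positive examples matched, negative examples matched)
    "rule_B": (12, 0),
    "rule_C": (20, 15),
}

k, threshold = 10, 0.95   # keep at most k rules with strength >= 95%
kept = sorted(
    (name for name, (pos, neg) in rules.items() if rule_strength(pos, neg) >= threshold),
    key=lambda name: rule_strength(*rules[name]),
    reverse=True,
)[:k]
print(kept)   # ['rule_B', 'rule_A']; rule_C falls below the 95% threshold
```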

The context rules may also produce multi-word lexical examples, like the one shown in (2.77). The obtained lexical examples are generalized by replacing their words by their POS tags, giving rise to paraphrasing rules. From (2.77) we obtain the positive paraphrasing rule of (2.78); again, POS subscripts denote identical words, whereas su- perscripts denote identical stems. The rule of (2.78) says that any sequence of words consisting of a verb, “to”, and another verb is a paraphrase of any other sequence con- sisting of the same initial verb, “to”, and another verb of the same stem as the second verb of the first sequence, provided that the two sequences occur in aligned sentences.

(2.77) ⟨expression1 = “start to talk”, expression2 = “start talking”, +⟩

(2.78) ⟨generalized1 = (verb₀, to, verb¹), generalized2 = (verb₀, verb¹), +⟩

The paraphrasing rules are also filtered by their strength, which is the precision with which they predict paraphrasing contexts. The remaining paraphrasing rules are used to obtain more lexical examples, which are also filtered by the precision with which they predict paraphrasing contexts. The new positive and negative lexical examples are then added to the existing ones, and they are used to obtain, score, and filter new positive and negative context rules, as well as to rescore and filter the existing ones. The resulting context rules are then employed to obtain more lexical examples, more paraphrasing rules, and so on, until no new positive lexical examples can be obtained from the corpus, or a maximum number of iterations is exceeded. Wang et al. (2009b) added more scoring measures to Barzilay and McKeown’s method to filter and rank the paraphrase pairs it produces, and used the extended method to extract paraphrases of technical terms from clusters of short bug reports. Bannard and Callison-Burch (2005) point out that bilingual parallel corpora are much easier to obtain, and in much larger sizes, than the monolingual parallel corpora that methods like Barzilay and McKeown’s require. Hence, they set out to extract

paraphrases from bilingual parallel corpora commonly used in statistical machine trans-

lation (SMT). As already discussed in section 2.3, phrase-based SMT systems employ tables whose entries show how phrases of one language may be replaced by phrases of another language; phrase tables of this kind may be produced by applying phrase align- ment heuristics (Och and Ney, 2003; Cohn et al., 2008) to word alignments produced

by the commonly used IBM models (section 2.3). In the case of an English-German parallel corpus, a phrase table may contain entries like the following, which show that “under control” has been aligned with “unter kontrolle” in the corpus, but “unter kon- trolle” has also been aligned with “in check”; hence, “under control” and “in check” are a candidate paraphrase pair.38

English phrase    German phrase
...               ...
under control     unter kontrolle
...               ...
in check          unter kontrolle
...               ...

More precisely, to paraphrase English phrases, Bannard and Callison-Burch employ a pivot language (German, in the example above) and a bilingual parallel corpus for English and the pivot language. They construct a phrase table from the parallel corpus, and from the table they estimate the probabilities P(e|f) and P(f|e), where e and f range over all of the English and pivot language phrases of the table. For example, P(e|f) may be estimated as the number of entries (rows) that contain both e and f, divided by the number of entries that contain f, if there are multiple rows for multiple alignments of e and f in the corpus, and similarly for P(f|e). The best paraphrase $e_2^*$ of

each English phrase e1 in the table is then computed by equation (2.79), where f ranges over all the pivot language phrases of the phrase table T.

38 Example from Bannard and Callison-Burch (2005).

$$e_2^* = \arg\max_{e_2 \neq e_1} P(e_2|e_1) = \arg\max_{e_2 \neq e_1} \sum_{f \in T} P(f|e_1)\,P(e_2|f,e_1) \approx \arg\max_{e_2 \neq e_1} \sum_{f \in T} P(f|e_1)\,P(e_2|f) \qquad (2.79)$$

Multiple bilingual corpora, for different pivot languages, can be used; (2.79) becomes (2.80), where C ranges over the corpora, and f now ranges over the pivot language phrases of C’s phrase table.

$$e_2^* = \arg\max_{e_2 \neq e_1} \sum_C \sum_{f \in T(C)} P(f|e_1)\,P(e_2|f) \qquad (2.80)$$
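Equation (2.79) is straightforward to operationalize once alignment counts are available from a phrase table. The sketch below uses a single pivot language and made-up counts (inspired by the "under control" / "unter kontrolle" example above) and marginalizes over the pivot phrases aligned to the input phrase.

```python
from collections import defaultdict

# toy aligned-phrase-pair counts from an English-German phrase table (counts are invented):
# counts[(english, german)] = number of times the pair was aligned in the corpus
counts = {
    ("under control", "unter kontrolle"): 8,
    ("in check", "unter kontrolle"): 3,
    ("under control", "unter beherrschung"): 1,
    ("controlled", "unter kontrolle"): 2,
}

count_e = defaultdict(int)
count_f = defaultdict(int)
for (e, f), c in counts.items():
    count_e[e] += c
    count_f[f] += c

def p_f_given_e(f, e):
    return counts.get((e, f), 0) / count_e[e]

def p_e_given_f(e, f):
    return counts.get((e, f), 0) / count_f[f]

def best_paraphrase(e1):
    # equation (2.79): sum over pivot phrases f aligned to e1
    pivots = {f for (e, f) in counts if e == e1}
    scores = {}
    for (e2, _) in counts:
        if e2 == e1:
            continue
        scores[e2] = sum(p_f_given_e(f, e1) * p_e_given_f(e2, f) for f in pivots)
    return max(scores.items(), key=lambda kv: kv[1])

print(best_paraphrase("under control"))   # ('in check', ...), via the pivot "unter kontrolle"
```

Summing the same scores over several pivot-language phrase tables gives equation (2.80).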

Bannard and Callison-Burch considered adding a language model (section 2.3) to their method to favour paraphrase pairs that can be used interchangeably in sentences; roughly speaking, the language model assesses how well one element of a pair can re- place the other in sentences where the latter occurs, by scoring the grammaticality of the sentences after the replacement. In subsequent work, Callison-Burch (2008) extended their method to require paraphrases to have the same syntactic types, since replacing a phrase with one of a different syntactic type generally leads to an ungrammatical sentence.39 Note that the methods of Bannard and Callison-Burch can also be used to generate paraphrases of a given input phrase, as in section 2.3, but only when the input phrase is in the phrase table and shares pivot language expressions with other phrases of the input language. Zhou et al. (2006b) employed a method very similar to Bannard and Callison-Burch’s to extract paraphrase pairs from a corpus, and used the result-

ing pairs in SMT evaluation, when comparing machine-generated translations against human-authored ones. The pivot language approaches discussed above have been shown to produce mil- lions of paraphrase pairs from large bilingual parallel corpora. The paraphrases, how-

ever, are typically short (e.g., up to four or five words), since longer phrases are rare in phrase tables. The methods can also be significantly affected by errors in automatic word and phrase alignment (Bannard and Callison-Burch, 2005). When evaluating extraction methods, we would ideally measure both their precision (what percentage of the extracted pairs are indeed paraphrase or textual entailment pairs) and their recall (what percentage of the correct pairs are extracted). As in generation, however, recall cannot be computed, because the number of all correct pairs that can be extracted from a large corpus (by an ideal method) is unknown. Instead, one may count the number of extracted pairs, possibly at different precision levels. Different extraction methods, however, produce pairs of different kinds (e.g., surface strings, slotted surface templates, or slotted dependency paths) from different kinds of corpora (e.g., monolingual or multilingual parallel or comparable corpora); hence, direct comparisons of extraction methods may be impossible. Furthermore, different precision and recall scores are obtained, depending on whether the extracted pairs are considered in particular contexts or not, and whether they are required to be interchangeable in grammatical sentences (Bannard and Callison-Burch, 2005; Barzilay and Lee, 2003; Callison-Burch, 2008; Zhao et al., 2008). As in generation, in principle one could use a paraphrase or textual entailment recognizer to score automatically the extracted pairs; a language model could also be employed to judge the grammaticality of the resulting sentences, when the expressions of each pair are required to be interchangeable in grammatical sentences. However, recognizers and, to a lesser extent, language models are not always accurate enough; hence, human judges are usually employed. One may also evaluate extraction methods indirectly, for example by measuring how much the extracted pairs help in information extraction (Bhagat and Ravichandran, 2008; Szpektor and Dagan, 2008) or when expanding queries (Pasca and Dienes, 2005), or by measuring to what extent SMT or summarization evaluation measures can be improved by taking into consideration the extracted pairs (Callison-Burch et al., 2006a; Kauchak and Barzilay, 2006; Zhou et al., 2006b).

39 An implementation of Callison-Burch’s method, the paraphrases it produced in his experiments, and human judgments of the quality of the paraphrases are available from http://cs.jhu.edu/~ccb/.

2.5 Conclusions

Paraphrasing and textual entailment are currently popular research topics. Paraphrasing can be seen as bidirectional textual entailment and, hence, similar methods are often used for both. Although both kinds of methods can be described in terms of logical entailment, they are usually intended to capture human intuitions that may not be as strict as logical entailment; and although logic-based methods have been developed, most methods operate at the surface, syntactic, or shallow semantic level, with dependency trees being a particularly popular representation.

Recognition methods, which classify input pairs of natural language expressions (or templates) as correct or incorrect paraphrases or textual entailment pairs, often rely on supervised machine learning to combine similarity measures possibly operating at different representation levels (surface, syntactic, semantic). More recently, approaches that search for sequences of transformations that connect the two input expressions are also gaining popularity, and they exploit paraphrasing or textual entailment rules extracted from large corpora. The RTE challenges provide a significant thrust to recognition work, and they have helped establish benchmarks and attract more researchers.

Generation methods, meaning methods that generate paraphrases of an input natural language expression (or template), or expressions that entail or are entailed by the input expression, are currently based mostly on bootstrapping or ideas from statistical machine translation. There are fewer publications on generation, compared to recognition (and extraction), and most of them focus on paraphrasing; furthermore, there are no established challenges or benchmarks, unlike recognition.

Extraction methods extract paraphrases or textual entailment pairs (also called “rules”) from corpora, usually off-line. They can be used to construct resources (e.g., phrase tables or collections of rules) that can be exploited by recognition or generation methods, or in other tasks (e.g., statistical machine translation, information extraction). Many extraction methods are based on the Distributional Hypothesis, though they often operate at different representation levels. Alignment techniques originating from statistical machine translation have recently also become popular, and they allow existing large bilingual parallel corpora to be exploited. Extraction methods also differ depending on whether they require parallel, comparable, or simply large corpora, monolingual or bilingual. As in generation, most extraction research has focused on paraphrasing, and there are no established challenges or benchmarks.

In the remainder of this thesis, we focus on recognition of paraphrases and textual entailment (chapter 3), and generation of paraphrases (chapter 4). We also consider applications of paraphrase generation (chapter 5).

Chapter 3

Paraphrase and Textual Entailment Recognition1

3.1 Introduction

Paraphrase recognizers decide whether or not two given phrases (or patterns) are paraphrases, possibly by generalizing over many different training pairs of phrases. Paraphrase recognizers can be embedded in paraphrase generators to filter out erroneous generated paraphrases; but they are also useful on their own. In question answering, for example, a question may be phrased differently than in a document collection (e.g., “Who is the author of War and Peace?” vs. “Leo Tolstoy is the writer of War and Peace.”), and taking such variations into account can improve system performance significantly (Harabagiu et al., 2003; Harabagiu and Hickl, 2006). A paraphrase recognizer can be used, for example, to check if a pattern extracted from the question (possibly by turning the question into a statement) matches any patterns extracted from candidate

1The work of this chapter has also been published in four conference papers and competition re- ports (Malakasiotis and Androutsopoulos, 2007; Galanis and Malakasiotis, 2008; Malakasiotis, 2009a; Malakasiotis, 2009b).


answers. As a further example, in text summarization, especially multi-document sum- marization (Mani, 2001; Hovy, 2003), one wants to avoid including in the summary sentences whose content is expressed by other sentences already present in a partially constructed summary. A paraphrase recognizer can be used to check if a sentence being considered is a paraphrase of any other sentence in the partial summary. In many cases, however, a textual entailment recognizer is more desirable. Textual entailment recognition (Dagan et al., 2006; Bar-Haim et al., 2006; Giampiccolo et al., 2007) is the task of deciding whether or not the meaning of a hypothesis text H (e.g., a candidate sentence to be included in a summary) can be inferred from the meaning of another text T (e.g., a partial summary). For instance, (3.1) textually entails (3.2), and, hence, (3.2) should not be added to (3.1), if the latter is a partial summary, even though (3.2) is not a paraphrase of (3.1). On the other hand, (3.3) does not textually entail (3.4).

(3.1) The drugs that slow down Alzheimer’s disease work best the earlier you administer them.

(3.2) Alzheimer’s disease can be slowed down using drugs.

(3.3) Drew Walker, Tayside’s public health director, said: “It is important to stress that this is not a confirmed case of rabies.”

(3.4) A case of rabies was confirmed.

Note that although “paraphrasing” and “textual entailment” are sometimes used as synonyms, we use the former to refer to methods that generate or recognize semantically equivalent (or almost equivalent) phrases or patterns, whereas in textual entailment the expressions or patterns are not necessarily semantically equivalent; it suffices if one entails the other, even if the reverse direction does not hold. For example, “Y was written by X” textually entails “Y is the work of X”, but the reverse direction does not necessarily hold (e.g., if Y is a statue). In this chapter, we focus on paraphrase and textual entailment recognition. We propose three methods, namely INIT, INIT+WN, and INIT+WN+DEP, that employ string similarity measures, which are applied to several abstractions of a pair of input phrases

(e.g., the phrases themselves, their stems, POS tags). The scores returned by the similar-

ity measures are used as features in a Maximum Entropy (ME) classifier (Jaynes, 1957; Good, 1963), which learns to separate true paraphrase pairs from false ones. Two of our methods also exploit WordNet to detect synonyms, and one of them uses additional fea- tures to measure similarities of grammatical relations obtained by a dependency parser.2 We have also experimented with feature selection, and we have performed detailed error analysis to shed more light on the performance of our methods. We show experi- mentally that in paraphrase recognition INIT is competitive to the best known systems;

INIT+WN and INIT+WN+DEP perform even better, but they require more resources and the improvement is small. In textual entailment recognition, the three methods do not perform as well as other state of the art methods, though they are still competitive and they can be used as competitive baselines, given that they are simpler than many other systems. The remainder of this chapter is organized as follows. Section 3.2 describes the datasets we used in our experiments, and Section 3.3 the measures used to evaluate our methods. Section 3.4 describes our methods and presents their results. Section 3.5 compares our methods to previous state of the art methods. Section 3.6 concludes and summarizes the research contribution of this chapter.

3.2 Datasets

Experimenting with paraphrase recognizers requires datasets containing both positive

and negative input pairs. When using discriminative classifiers (e.g., SVMs), the neg- ative training pairs must ideally be near misses, otherwise they may be of little use (Schohn and Cohn, 2000; Tong and Koller, 2002). Near misses can also make the test data more challenging. The most widely used benchmark dataset for paraphrase recog-

2We use Stanford University’s ME classifier and parser; see http://nlp.stanford.edu/.

nition is the Microsoft Research (MSR) Paraphrase Corpus.3 It contains 5,801 pairs of sentences obtained from clusters of online news articles referring to the same events (Dolan et al., 2004; Dolan and Brockett, 2005). The pairs were initially filtered by heuristics, which require, for example, the word edit distance of the two sentences in each pair to be neither too small (to avoid nearly identical sentences) nor too large (to avoid too many negative pairs); and both sentences to be among the first three of arti- cles from the same cluster (articles referring to the same event), the rationale being that initial sentences often summarize the events. The candidate paraphrase pairs were then

filtered by an SVM-based paraphrase recognizer (Brockett and Dolan, 2005), trained on separate manually classified pairs obtained in a similar manner, which was biased to overidentify paraphrases. Finally, human judges annotated the remaining sentence

pairs as paraphrases or not. Each pair of the MSR corpus was examined by two judges, and a third judge resolved disagreements when they occurred; the average interannotator agreement was approximately 84%. After resolving disagreements, approximately 67% of the 5,801 pairs were judged to be paraphrases. The dataset is divided in two non-overlapping parts, for training (70% of all pairs) and testing (30%). The judges were actually told to decide whether the sentences of each pair were “semantically equivalent” or not, but they were provided additional guidelines and examples that made it clear that minor differences in the information provided by the two sentences were to

be tolerated. For example, (3.5)–(3.6) are annotated as paraphrases in the MSR cor- pus. Dolan et al. (2004) also provide the example of (3.7)–(3.8); the two sentences would presumably be annotated as paraphrases. The judges were also instructed to treat referring expressions (e.g., pronouns) and their full forms as equivalent.

(3.5) Those reports were denied by the interior minister, Prince Nayef.

(3.6) However, the Saudi interior minister, Prince Nayef, denied the reports.

3Currently, the easiest way to obtain the particularly long URL of the dataset is to perform a Web search.

(3.7) Two men who robbed a jeweller’s shop to raise funds for the Bali bombings were each jailed for %number% years by Indonesian courts today.

(3.8) An Indonesian court today sentenced two men to %number% years in prison for helping finance last year’s terrorist bombings in Bali by robbing a jewelry store.

Note that some of the heuristics that were used to construct the MSR corpus rule out pairs of sentences with very large edit distances or very small lexical overlap; for example, one heuristic requires that the two sentences have at least three common words. Zhang and Patrick (2005) and others have pointed out that the heuristics that were used to construct the corpus may have biased it towards particular types of paraphrases, excluding for example paraphrases that do not share any common words.

We have also experimented with a paraphrase recognition dataset we created from the Multiple-Translation Chinese (MTC) corpus.4 MTC is a corpus containing news articles in Mandarin Chinese; for each article, 11 English translations (by different translators) are also provided. We considered the translations of the same Chinese sentence as paraphrases. We obtained all the possible paraphrase pairs and we treated them as positive instances. We also added an equal number of randomly selected non-paraphrase pairs (negative instances), which contained sentences from different translations of the same articles, but which were not translations of the same sentence. In this way, we constructed a dataset containing 82,260 pairs of sentences. The dataset was then split in training (70%) and test (30%) parts, with an equal number of positive and negative pairs in each part.

For textual entailment recognition, the most widely used benchmarks are those of the RTE challenges (section 2.1). As an example, the RTE-3 corpus contains 1,600 ⟨T, H⟩ pairs (positive or negative). Four application scenarios where textual entailment recognition might be useful were considered: information extraction (IE), information retrieval (IR), question answering (QA), and summarization (SUM). There are 200 train-

4The corpus is available from the LDC, Catalogue Number LDC2002T01, ISBN 1-58563-217-1.

ing and 200 testing pairs for each scenario; Dagan et al. (2009) explain how they were

constructed.5 In the IE scenario, the H parts of the pairs corresponded to natural language glosses of templates that IE systems were required to fill in, whereas the T parts were texts from which the systems were expected to extract (or not) the information to fill in the templates. For example, given a template glossed as “X works for Y” and the text of (3.9), a positive ⟨T, H⟩ pair would be (3.9)–(3.10).

(3.9) T: J. Smith, who recently joined Banana Ltd. as CEO, gave a press conference yesterday.

(3.10) H: J. Smith works for Banana Ltd.

In the IR scenario, the H parts were natural language statements (e.g., “Alzheimer’s disease is treated using drugs”) that were given as queries to search engines, and the

Ts were texts from documents returned by the search engines. In the QA scenario, the T parts were text snippets returned by question answering systems, and the H parts were the corresponding questions turned into statements, much as in the example of

(2.18)–(2.20) of section 2.1. In the SUM scenario, the Ts were mostly sentences from multi-document summaries (or sentences from the documents to be summarized), and the Hs were simplified forms of the Ts. The pairs from the four scenarios were given to three human judges to decide if they were true or false entailment pairs; the average interannotator agreement was approximately 88%, with Kappa = 0.75 (Giampiccolo et al., 2007).6 Pairs for which there was disagreement were filtered out, and the re- maining pairs were equally split to development and test sets. For each scenario, 200 development and 200 test pairs were obtained in this manner.

The RTE-4 corpus was constructed in a similar way, but it contains only test pairs,

250 for each of the four scenarios. A further difference is that in RTE-4 the judges classified the pairs in three classes: true entailment pairs, false entailment pairs where H contradicts T (Harabagiu et al., 2006; de Marneffe et al., 2008), and false pairs where reading T does not lead to any conclusion about H; a similar pilot task was included in RTE-3 (Voorhees, 2008).7 The pairs of the latter two classes can be merged, if only two classes (true and false pairs) are desirable.

5Data from previous competitions in the four areas were used. Consult http://pascallin.ecs.soton.ac.uk/Challenges/RTE3/ for further details.

6The Kappa statistic (Cohen, 1960; Di Eugenio and Glass, 2004) in effect measures the degree to which agreement cannot be attributed to chance.

We note that RTE-3’s pilot task also required systems to justify their answers. Many of the participants, however, used technical or mathematical terminology in their ex- planations, which was not always appreciated by the human judges; also, the entail- ments were often obvious to the judges, to the extent that no justification was necessary (Voorhees, 2008).

3.3 Evaluation measures

In paraphrase and textual entailment recognition, four widely known evaluation mea- sures are used. These are: accuracy, precision, recall, and F-measure with equal weight on precision and recall; they are all defined below. TP (true positives) and FP (false positives) are the numbers of pairs that have correctly or incorrectly, respectively, been classified as positive (paraphrases or textual entailment pairs). TN (true negatives) and FN (false negatives) are the numbers of pairs that have correctly or incorrectly, respec- tively, been classified as negative (not paraphrases or not textual entailment pairs).

precision = TP / (TP + FP),   recall = TP / (TP + FN)

accuracy = (TP + TN) / (TP + TN + FP + FN),   F-measure = (2 · precision · recall) / (precision + recall)

Note that in the case of textual entailment, when the pairs are classified in 3 classes, only accuracy is meaningful and can be defined as:

accuracy = (# correctly classified pairs) / (# all pairs)

We also use:

error rate = 1 - accuracy
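For concreteness, a minimal sketch of these definitions in code follows; the confusion counts in the toy call are illustrative only (roughly what a classify-everything-positive baseline would produce on an MSR-sized test set), not figures taken from our experiments.

```python
def recognition_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, F-measure and error rate from the
    confusion counts defined above (positive = paraphrase / entailment pair)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "F-measure": f_measure, "error rate": 1 - accuracy}

# illustrative counts for a baseline that labels every test pair positive
print(recognition_metrics(tp=1147, fp=578, tn=0, fn=0))
```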

Average precision, borrowed from information retrieval evaluation, has also been used in the RTE challenges to evaluate a system’s ability to rank positive pairs above negative ones (Dagan et al., 2009). Bergmair (2009), however, argues against using it in RTE challenges and proposes alternative measures.

3.4 Our recognition methods

The main idea underlying our methods is that by capturing similarities at various shallow abstractions of the input (e.g., the original sentences, the stems of their words, their POS tags), we can recognize paraphrases and textual entailment reasonably well, provided that we learn to assign appropriate weights to the resulting features. Further improvements are possible by recognizing synonyms and by employing similarity measures that operate on the output of dependency grammar parsers.

We use a machine learning approach to combine the similarities (see section 2.2 and figure 2.3). Each pair of input language expressions ⟨P1, P2⟩ is represented by a feature vector ⟨f1, ..., fm⟩ containing the scores of different similarity measures applied to different abstractions of the pair, and some other features to be described later. A supervised machine learning algorithm trains a classifier on manually classified (as correct or incorrect) vectors corresponding to training input pairs. Once trained, the classifier can classify unseen pairs as correct or incorrect paraphrases or textual entailment pairs by examining their features.

3.4.1 Maximum Entropy classifier

In this section, we give a brief description of a Maximum Entropy (ME) classifier (Jaynes, 1957; Good, 1963), which is the classifier we used in the experiments of this chapter. We have also experimented with a Support Vector Machine (SVM) classifier (Vapnik, 1998; Cristianini and Shawe-Taylor, 2000; Joachims, 2002) but the results were very similar and are not reported. The SVM classifier was also much slower than the ME one.

Given a vector of m features ~F = ⟨f1, ..., fm⟩ for some instance Y and a set of classes C = {c1, ..., cn} that Y can be classified in, the probability that Y belongs in class c ∈ C is estimated as:

P(c | ~F) = exp(∑_{i=1}^{m} w_ci · fi) / ∑_{c'∈C} exp(∑_{i=1}^{m} w_c'i · fi)

where w_ci is the weight for feature fi when considering class c. The class ĉ that Y will be classified in is the class with the maximum probability:

ĉ = argmax_{c∈C} P(c | ~F)

The weights w_ci can be learnt, for example, by maximizing the conditional likelihood of the training data via gradient ascent. Let us consider a binary classification problem where the two classes are c+ and c−. Then:

P(c = c+ | ~F) = 1 / (1 + exp(∑_{i=1}^{m} w_c+i · fi)),   P(c = c− | ~F) = 1 − P(c = c+ | ~F)

During training, the classifier chooses the weights w_ci which maximize the probability that the training instances indeed belong in their correct class. This is achieved by maximizing the conditional likelihood of the training examples:

L(~w) = P(y^(1), ..., y^(m) | ~F^(1), ..., ~F^(m))   (3.11)

where y^(1), ..., y^(m) are the correct classes and ~F^(1), ..., ~F^(m) the feature vectors of the training instances. Assuming that the training instances are independent and come from the same population, the conditional likelihood is calculated by equation (3.12).

L(~w) = P(y^(1), ..., y^(m) | ~F^(1), ..., ~F^(m)) = ∏_{i=1}^{m} P(y^(i) | ~F^(i); ~w)   (3.12)

Instead of maximizing L(~w), it is often preferable to maximize the monotonically increasing function of equation (3.13).

l(~w) = log L(~w) = ∑_{i=1}^{m} log P(y^(i) | ~F^(i); ~w)   (3.13)

If y = 1 is the positive class and y = 0 is the negative one, then:

P(y^(i) | ~F^(i); ~w) = P(c+ | ~F^(i), ~w)^{y^(i)} · P(c− | ~F^(i), ~w)^{1−y^(i)}   (3.14)

Notice that for y^(i) = 1, the right-hand side of equation (3.14) becomes P(c+ | ~F^(i), ~w)^{y^(i)}; and for y^(i) = 0, it becomes P(c− | ~F^(i), ~w)^{1−y^(i)}. Thus, equation (3.13) can be rewritten as:

l(~w) = ∑_{i=1}^{m} [log P(c+ | ~F^(i), ~w)^{y^(i)} + log P(c− | ~F^(i), ~w)^{1−y^(i)}]
      = ∑_{i=1}^{m} [y^(i) · log P(c+ | ~F^(i), ~w) + (1 − y^(i)) · log P(c− | ~F^(i), ~w)]   (3.15)

We can then use gradient ascent to learn the best weights:

~w ← ~w + η · ∇_~w l(~w)

If both classes have the same weights (i.e., if w_c+i = w_c−i = wi), then we obtain the following weight update rule; η is a small positive constant:

w_l ← w_l + η · ∑_{i=1}^{m} [y^(i) − P(c+ | ~F^(i))] · f_l^(i)

For an introduction to ME modeling, see chapter 6 of Jurafsky and Martin (2008).
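As an illustration of the training procedure above, the following is a minimal sketch of a binary ME (logistic regression) model trained by batch gradient ascent on the conditional log-likelihood. The function names and toy data are ours, and we use the standard sigmoid parameterization rather than the exact notation of the equations above; the toy feature values are taken to lie in [−1, 1], as in our normalized features.

```python
import numpy as np

def train_binary_me(X, y, epochs=500, eta=0.1):
    """Learn one weight per feature by gradient ascent on l(w);
    X is an (instances x features) matrix, y holds 0/1 class labels."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p_pos = 1.0 / (1.0 + np.exp(-X.dot(w)))   # P(c+ | F; w) for every instance
        w += eta * X.T.dot(y - p_pos)             # w_l += eta * sum_i (y_i - P(c+|F_i)) * f_il
    return w

def classify(w, X):
    """Return 1 (positive class) when P(c+ | F; w) >= 0.5."""
    return (1.0 / (1.0 + np.exp(-X.dot(w))) >= 0.5).astype(int)

# toy usage: two normalized similarity features per pair; 1 = paraphrase, 0 = not
X = np.array([[0.9, 0.8], [0.7, 0.6], [-0.5, -0.4], [-0.6, -0.7]])
y = np.array([1, 1, 0, 0])
print(classify(train_binary_me(X, y), X))
```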

3.4.2 String similarity measures

We first describe the nine string similarity measures that we use.8 In each measure,

assume that we compare two strings s1 and s2.

Levenshtein distance: This is the minimum number of operations (edit distance)

needed to transform one string (in our case, s1) into the other one (s2), where an op- eration is an insertion, deletion, or substitution of a single character.

Jaro-Winkler distance: The Jaro-Winkler distance (Winkler, 1999) is a variation of

the Jaro distance (Jaro, 1995), which we describe first. The Jaro distance dj of s1 and s2 is defined as:

dj(s1, s2) = m / (3 · l1) + m / (3 · l2) + (m − t) / (3 · m),

where l1 and l2 are the lengths (in characters) of s1 and s2, respectively. The value m is the number of characters of s1 that match characters of s2. Two characters from s1 and s2, respectively, are considered to match if they are identical and the difference in their positions does not exceed max(l1, l2)/2 − 1. Finally, to compute t (‘transpositions’), we remove from s1 and s2 all characters that do not have matching characters in the other string, and we count the number of positions in the resulting two strings that do not contain the same character; t is half that number.

The Jaro-Winkler distance dw emphasizes prefix similarity between the two strings. It is defined as:

dw(s1, s2) = dj(s1, s2) + l · p · [1 − dj(s1, s2)],

where l is the length of the longest common prefix of s1 and s2, and p is a constant scaling factor that also controls the emphasis placed on prefix similarity. The imple- mentation we used considers prefixes up to 6 characters long, and sets p = 0.1.

8We use our own implementation of the string similarity measures.

Manhattan distance: Also known as City Block distance or L1, this is defined for

any two vectors ~x = ⟨x1, ..., xn⟩ and ~y = ⟨y1, ..., yn⟩ in an n-dimensional vector space as:

L1(~x, ~y) = ∑_{i=1}^{n} |xi − yi|.

In our case, n is the number of distinct tokens that occur in s1 and s2 (in any of the two); and xi, yi show how many times each one of these distinct words occurs in s1 and s2, respectively.

Euclidean distance: This is defined as follows:

L2(~x, ~y) = √( ∑_{i=1}^{n} (xi − yi)² ).

In our case, ~x and ~y correspond to s1 and s2, respectively, as in the previous measure.

Cosine similarity: The definition follows:

cos(~x, ~y) = (~x · ~y) / (‖~x‖ · ‖~y‖).

In our system, ~x and ~y are as above, except that they are binary, i.e., xi and yi are 1 or 0, depending on whether or not the corresponding token occurs in s1 or s2, respectively.

N-gram distance: This is the same as L1, but instead of tokens we use all the (distinct) character n-grams in s1 and s2; we used n = 3.

Matching coefficient: This is |X ∩Y|, where X and Y are the sets of (unique) words

(or tags) of s1 and s2, respectively; i.e., it counts how many common words s1 and s2 have.

Dice coefficient: This is defined as 2 · |X ∩ Y| / (|X| + |Y|); in our case, X and Y are as in the previous measure.

Jaccard coefficient: This is defined as |X ∩ Y| / |X ∪ Y|; again X and Y are as in the matching coefficient.
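To make the token-level versions of some of these measures concrete (section 3.4.3 explains how they are applied to token sequences), here is a small illustrative sketch of our own; it is not the exact implementation used in the experiments.

```python
def levenshtein(tokens1, tokens2):
    """Edit distance over tokens (insertion, deletion, substitution)."""
    m, n = len(tokens1), len(tokens2)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if tokens1[i - 1] == tokens2[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def matching(tokens1, tokens2):
    """Matching coefficient: number of common unique tokens."""
    return len(set(tokens1) & set(tokens2))

def dice(tokens1, tokens2):
    x, y = set(tokens1), set(tokens2)
    return 2 * len(x & y) / (len(x) + len(y))

def jaccard(tokens1, tokens2):
    x, y = set(tokens1), set(tokens2)
    return len(x & y) / len(x | y)

s1 = "the drugs slow down the disease".split()
s2 = "drugs can slow the disease down".split()
print(levenshtein(s1, s2), matching(s1, s2), dice(s1, s2), jaccard(s1, s2))
```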

3.4.3 Combining similarity measures

During training, our first method, INIT, is given a set {⟨P1,1, P1,2, y1⟩, ..., ⟨Pn,1, Pn,2, yn⟩}, where Pi,1 and Pi,2 are language expressions (e.g., sentences), yi = 1 (positive class) if the two expressions are paraphrases or constitute a true textual entailment pair, and yi = −1 (negative class) otherwise. When three classes are used (RTE-3 to RTE-5), yi = 1 for true entailment pairs, yi = −1 for false entailment pairs where H contradicts T (Harabagiu et al., 2006; de Marneffe et al., 2008), and yi = 0 for false pairs where reading T does not lead to any conclusion about H. Each pair of expressions ⟨Pi,1, Pi,2⟩ is converted to a single feature vector ~vi, whose values are scores returned by similarity measures that indicate how similar Pi,1 and Pi,2 are at various levels of abstraction. The vectors and the corresponding categories {⟨~v1, y1⟩, ..., ⟨~vn, yn⟩} are given as input to the ME classifier, which learns how to classify new vectors ~v, corresponding to unseen pairs of expressions ⟨P1, P2⟩.

For each pair of input language expressions ⟨P1, P2⟩, we form ten new pairs of strings ⟨p1^1, p2^1⟩, ..., ⟨p1^10, p2^10⟩ corresponding to ten different levels of abstraction of P1 and P2, and we apply the nine similarity measures to the ten new pairs, resulting in a total of 90 measurements. These measurements are then included as features in the vector ~v that corresponds to ⟨P1, P2⟩. The ⟨p1^i, p2^i⟩ pairs are:

⟨p1^1, p2^1⟩: two strings consisting of the original tokens of P1 and P2, respectively, with the original order of the tokens maintained;9

⟨p1^2, p2^2⟩: as in the previous case, but now the tokens are replaced by their stems;

9We use Stanford University’s tokenizer and POS-tagger, and Porter’s stemmer.

⟨p1^3, p2^3⟩: as in the previous case, but now the tokens are replaced by their part-of-speech (POS) tags;

⟨p1^4, p2^4⟩: as in the previous case, but now the tokens are replaced by their soundex codes;10

⟨p1^5, p2^5⟩: two strings consisting of only the nouns of P1 and P2, as identified by a POS-tagger, with the original order of the nouns maintained;

⟨p1^6, p2^6⟩: as in the previous case, but now with nouns replaced by their stems;

⟨p1^7, p2^7⟩: as in the previous case, but now with nouns replaced by their soundex codes;

⟨p1^8, p2^8⟩: two strings consisting of only the verbs of P1 and P2, as identified by a POS-tagger, with the original order of the verbs maintained;

⟨p1^9, p2^9⟩: as in the previous case, but now with verbs replaced by their stems;

⟨p1^10, p2^10⟩: as in the previous case, but now with verbs replaced by their soundex codes.
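The sketch below illustrates how the ten abstractions and the resulting 90 similarity features could be assembled. It assumes that the tokens, stems, POS tags, and soundex codes of a sentence are precomputed as parallel lists (the thesis uses Stanford’s tokenizer and POS tagger and Porter’s stemmer); the Penn-style NN/VB tag prefixes and the helper names are our own simplification, and measure functions such as those sketched in section 3.4.2 (dice, jaccard, token-level levenshtein) could be passed as `measures`.

```python
def abstractions(tokens, stems, tags, soundex):
    """Return the ten abstraction levels p^1 .. p^10 of one sentence,
    each as a list of strings, from precomputed parallel analyses."""
    nouns = [i for i, t in enumerate(tags) if t.startswith("NN")]
    verbs = [i for i, t in enumerate(tags) if t.startswith("VB")]
    return [
        tokens,                        # p^1: original tokens
        stems,                         # p^2: stems
        tags,                          # p^3: POS tags
        soundex,                       # p^4: soundex codes
        [tokens[i] for i in nouns],    # p^5: nouns only
        [stems[i] for i in nouns],     # p^6: noun stems
        [soundex[i] for i in nouns],   # p^7: noun soundex codes
        [tokens[i] for i in verbs],    # p^8: verbs only
        [stems[i] for i in verbs],     # p^9: verb stems
        [soundex[i] for i in verbs],   # p^10: verb soundex codes
    ]

def similarity_features(abst1, abst2, measures):
    """Apply every similarity measure to every abstraction pair:
    10 abstractions x 9 measures = 90 feature values."""
    return [f(a1, a2) for a1, a2 in zip(abst1, abst2) for f in measures]
```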

Note that the similarities are measured in terms of tokens, not characters. For instance, the edit distance of P1 and P2 is the minimum number of operations needed to transform P1 to P2, where an operation is an insertion, deletion or substitution of a single token. Moreover, we use high-level POS tags only, i.e., we do not consider the number of nouns, the voice of verbs etc.; this increases the similarity of positive ⟨p1^3, p2^3⟩ pairs.

A common problem is that the string similarity measures may be misled by differences in the lengths of P1 and P2. This is particularly true in textual entailment, where one of the input language expressions (T) is often much longer than the other one (H). If a part of T (e.g., part of its surface string) is very similar to H, this is an indication that H may be entailed by T. This is illustrated in (3.16)–(3.17), where H is included verbatim in T. Note, however, that the similarity between H and the entire T of this example is low, because of the different lengths of T and H.

10Soundex is an algorithm intended to map English names to alphanumeric codes, so that names with the same pronunciations receive the same codes, despite spelling differences; see http://en.wikipedia.org/wiki/Soundex.

(3.16) T: Charles de Gaulle died in 1970 at the age of eighty. He was thus fifty years old when, as an unknown officer recently promoted to the rank of brigadier general, he made his famous broadcast from London rejecting the capitulation of France to the Nazis after the debacle of May-June 1940.

(3.17) H: Charles de Gaulle died in 1970.

To address this problem, when we consider a pair of language expressions or abstractions ⟨p1, p2⟩, if p1 is longer than p2, we obtain all of the substrings p1′ of p1 that have the same length as p2. Then, for each p1′, we compute the nine values fj(p1′, p2), where fj (1 ≤ j ≤ 9) are the string similarity measures. Finally, we locate the p1′ with the best average similarity (over all similarity measures) to p2, namely p1′*:

p1′* = argmax_{p1′} ∑_{j=1}^{9} fj(p1′, p2)

and we keep the nine fj(p1′*, p2) values and their average as ten additional measurements. Similarly, if p2 is longer than p1, we keep the nine fj(p1, p2′*) values and their average. This process is applied only to pairs ⟨p1^1, p2^1⟩, ..., ⟨p1^4, p2^4⟩, where large length differences are more likely to appear, adding 40 more measurements (features) to the vector ~v of each ⟨P1, P2⟩ pair of input language expressions.

The measurements discussed above provide 130 numeric features.11 To those, we add two Boolean features indicating the existence or absence of negation in P1 or P2, respectively; negation is detected by looking for words like “not”, “won’t” etc. With these features we try to avoid false positives, i.e., wrongly classifying as paraphrases or textual entailments pairs of language expressions that are very similar but have opposite meanings because of negation (e.g., P1 = “X is the author of Y”, P2 = “X is not the author of Y”). Finally, we add a length ratio feature, defined as min(L_P1, L_P2) / max(L_P1, L_P2), where L_P1 and L_P2 are the lengths, in tokens, of P1 and P2. Hence, there is a total of 133 available features in INIT.

11All feature values are normalized in [−1,1].


Figure 3.1: INIT error rate learning curves on the MSR corpus.


Figure 3.2: INIT error rate learning curves on the RTE-3 corpus for 2 classes. CHAPTER 3. PARAPHRASE AND TE RECOGNITION 69


Figure 3.3: INIT error rate learning curves on the RTE-3 corpus for 3 classes.

In order to measure the performance of INIT, we conducted an experiment during

which INIT was trained on increasingly larger parts of the training set of the datasets we used (see section 3.2), and it was always evaluated on the entire corresponding test set.

Figures 3.1 through 3.3 plot the error rates of INIT computed both on the test set and the

encountered training set of the MSR and RTE 3 corpora. The error rate on the training instances a learner has encountered is typically lower than the error rate on the test set,

and the former can be seen as a lower bound of the latter. On the MSR corpus the curves of the two error rates have converged, which is a sign of high bias. Intuitively, this means that the classifier fails to learn the desired hypothesis, because this hypothesis is not in the search space. A possible solution to this problem is to use more and/or

different features.12 By contrast, on the RTE-3 corpus, the distance between the two curves remains relatively large, which is a sign of high variance, which means that INIT overfits the training data. A possible solution in this case is to use fewer features or more training data. A further observation is that INIT performs much worse on the RTE-3 dataset than on the MSR corpus. Moreover, the results deteriorate further when the recognizer needs to classify the pairs in three classes. However, the error rate on the training set indicates that there is a lot of room for improvement. Similar observations can also be made for the other RTE datasets. The interested reader may consult appendix A, where the results of all our methods on all datasets are listed.

12Another solution may be to use a more complex hypothesis model instead of the linear separators of the ME classifier. We experimented, however, with non-linear SVMs, which led to no improvements.

3.4.4 The need for synonyms and grammatical relations

A possible source of false negative errors (missed paraphrases) in INIT is that it fails to recognize positive pairs that involve synonyms. In the following pair of sentences, for example, “dispatched” is used as a synonym of “sent”; treating the two verbs as the same token during the calculation of the string similarity measures would correctly yield a higher similarity.13

(3.18) P1: Fewer than a dozen FBI agents were dispatched to secure and analyze evidence.

(3.19) P2: Fewer than a dozen FBI agents will be sent to Iraq to secure and analyze evidence of the bombing.

INIT+WN tries to address the problem just described by treating as identical (during the calculation of the similarity measures) words from P1 and P2 that are synonyms according to WordNet; otherwise the method is the same as INIT. Note, however, that the similarity is increased for both positive and negative pairs that contain synonyms. In the following example, borrowed from the MSR corpus,

INIT+WN has higher confidence that the two language expressions are paraphrases than

INIT, because of the two synonyms (“closed” and “ended”). Observe, however, that (3.20) and (3.21) are not paraphrases.

13The example comes from the MSR corpus.

(3.20) P1: The Nikkei average closed up 1.5 percent at 8,152.16, a one-month high.

(3.21) P2: The Nikkei average ended the morning up half percent at 8,071.00.

Recall that we also need more features because of the high bias problem shown in

figure 3.1. So far, INIT and INIT+WN operate on the surface level of the two language

expressions of each pair. INIT+WN+DEP adds features that operate on the grammatical relations (dependencies) a dependency grammar parser returns for P1 and P2. We use three measures to calculate similarity at the level of grammatical relations, namely P1 dependency recall (DR1), P2 dependency recall (DR2), and their F-measure (F_DR1,DR2), defined below:

DR1 = |common dependencies| / |P1 dependencies|

DR2 = |common dependencies| / |P2 dependencies|

F_DR1,DR2 = (2 · DR1 · DR2) / (DR1 + DR2)
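A minimal sketch of the three dependency measures follows; the representation of a dependency as a (relation, word1, word2) triple with high-level relation labels and no directionality, and the toy triples in the usage example, are our own simplification of the text above.

```python
def dependency_similarity(deps1, deps2):
    """DR1, DR2 and their F-measure from the dependency sets of P1 and P2."""
    common = deps1 & deps2
    dr1 = len(common) / len(deps1) if deps1 else 0.0
    dr2 = len(common) / len(deps2) if deps2 else 0.0
    f = 2 * dr1 * dr2 / (dr1 + dr2) if dr1 + dr2 > 0 else 0.0
    return dr1, dr2, f

# toy usage with a few (relation, head, dependent) triples
deps_p1 = {("mod", "unit", "local"), ("arg", "said", "Heizler"), ("arg", "carrying", "coach")}
deps_p2 = {("mod", "unit", "local"), ("arg", "said", "head"), ("arg", "failed", "driver")}
print(dependency_similarity(deps_p1, deps_p2))
```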

The following two examples illustrate the usefulness of dependency similarity mea- sures in detecting paraphrases. In the first example, (3.22) is not a paraphrase of (3.23) and the scores defined above are low. In the second example, where (3.24) and (3.25) have almost identical meanings, the scores are much higher; by contrast, string sim- ilarity scores would be low, because of the different order of the clauses in the two sentences. Figures 3.4 and 3.5 list the grammatical relations (dependencies) of the two examples with the common ones shown in bold.

(3.22) P1: Gyorgy Heizler, head of the local disaster unit, said the coach was carrying 38 pas- sengers.

(3.23) P2: The head of the local disaster unit, Gyorgy Heizler, said the coach driver had failed to heed red stop lights.

DR1 = 0.43, DR2 = 0.32, F_DR1,DR2 = 0.36

Grammatical relations of P1:
mod(Heizler-2, Gyorgy-1), arg(said-11, Heizler-2), mod(Heizler-2, head-4), mod(head-4, of-5), mod(unit-9, the-6), mod(unit-9, local-7), mod(unit-9, disaster-8), arg(of-5, unit-9), mod(coach-13, the-12), arg(carrying-15, coach-13), aux(carrying-15, was-14), arg(said-11, carrying-15), mod(passengers-17, 38-16), arg(carrying-15, passengers-17)

Grammatical relations of P2:
mod(head-2, The-1), arg(said-12, head-2), mod(head-2, of-3), mod(unit-7, the-4), mod(unit-7, local-5), mod(unit-7, disaster-6), arg(of-3, unit-7), mod(Heizler-10, Gyorgy-9), mod(unit-7, Heizler-10), mod(driver-15, the-13), mod(driver-15, coach-14), arg(failed-17, driver-15), aux(failed-17, had-16), arg(said-12, failed-17), aux(heed-19, to-18), arg(failed-17, heed-19), mod(lights-22, red-20), mod(lights-22, stop-21), arg(heed-19, lights-22)

Figure 3.4: Grammatical relations of example 3.22–3.23.

(3.24) P1: Amrozi accused his brother, whom he called “the witness”, of deliberately distorting his evidence.

(3.25) P2: Referring to him as only “the witness”, Amrozi accused his brother of deliberately distorting his evidence.

DR1 = 0.69, DR2 = 0.60, F_DR1,DR2 = 0.64

As with POS-tags, we use only the highest level of the tags of the grammatical relations (e.g., argument instead of direct object, and modifier instead of adjectival modifier), which we have found experimentally to lead to better results. For the same reason, we ignore the directionality of the dependency arcs. INIT+WN+DEP employs a total of 136 features.

Figure 3.6 illustrates the error rate learning curves of INIT+WN+DEP compared with the corresponding learning curves of INIT. We observe a slight improvement both in the train and test error rates, which means the use of WordNet for synonyms and the additional dependency features were indeed helpful, although only to a small

Grammatical relations of P1:
arg(accused-2, Amrozi-1), mod(brother-4, his-3), arg(accused-2, brother-4), arg(called-8, whom-6), arg(called-8, he-7), mod(brother-4, called-8), mod(witness-11, the-10), dep(called-8, witness-11), mod(brother-4, of-14), mod(distorting-16, deliberately-15), arg(of-14, distorting-16), mod(evidence-18, his-17), arg(distorting-16, evidence-18)

Grammatical relations of P2:
dep(accused-12, Referring-1), mod(Referring-1, to-2), arg(to-2, him-3), cc(him-3, as-4), dep(as-4, only-5), mod(witness-8, the-7), conj(him-3, witness-8), arg(accused-12, Amrozi-11), mod(brother-14, his-13), arg(accused-12, brother-14), mod(brother-14, of-15), mod(distorting-17, deliberately-16), arg(of-15, distorting-17), mod(evidence-19, his-18), arg(distorting-17, evidence-19)

Figure 3.5: Grammatical relations of example 3.24–3.25.


Figure 3.6: INIT+WN+DEP and INIT error rate learning curves on the MSR corpus. CHAPTER 3. PARAPHRASE AND TE RECOGNITION 74

extent. Nonetheless, INIT is still competitive despite using fewer resources.

3.4.5 Feature selection

Larger feature sets do not necessarily lead to improved classification performance. De- spite seeming useful, some features may in fact be too noisy or irrelevant, increasing the risk of overfitting the training data. Recall that this seemed to be a possible problem

for INIT on the RTE-3 dataset, since the learning curves of figures 3.2 and 3.3 indicated high variance. Some features may also be redundant, given other features; thus, feature selection methods that consider the value of each feature on its own (e.g., information gain) may lead to suboptimal feature sets. Finding the best subset of a set of available features is a search space problem for which several methods have been proposed (Guyon et al., 2006). We have experimented with a wrapper approach, whereby each feature subset is evaluated according to the predictive power of a classifier (treated as a black box) that uses the subset; in our experiments, the predictive power was measured as accuracy (section 3.3). More precisely, during feature selection, we used 10-fold cross validation on the training data to evaluate the predictive power of each feature subset. After feature selection, the classifier was trained on all the training data, and it was evaluated on separate test data.

With large feature sets, an exhaustive search over all subsets is intractable. Instead, we experimented with forward and backward hill-climbing and beam search (Guyon et al., 2006). Forward hill-climbing (FHC) starts with an empty feature set, to which it adds features, one at a time, by preferring to add at each step the feature that leads to the highest predictive power. Forward beam search (FBS) is similar, except that the search frontier contains the k best examined states (feature subsets) at each time; we used k = 10. For k = 1, beam search reduces to hill-climbing. On the other hand, backward hill-climbing (BHC) starts with a feature set containing all the available features and removes, one at a time, the feature whose removal leads to the highest predictive power. Backward beam search (BBS) is similar, with the exception that the search frontier contains the k best examined states (feature subsets) at each time.

Figure 3.7: BBS error rate learning curves on the RTE-3 corpus for 2 classes.
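As a sketch of the wrapper search, the following implements forward hill-climbing; `evaluate` stands in for the 10-fold cross-validated accuracy of the classifier on the training data, and the function and variable names are ours. Beam search differs only in keeping the k best subsets on the frontier at each step, and the backward variants start from the full feature set and remove features instead.

```python
def forward_hill_climbing(all_features, evaluate):
    """Greedy forward selection (FHC): repeatedly add the single feature that
    most improves evaluate(subset); stop when no addition improves it."""
    selected, best_score = [], evaluate([])
    improved = True
    while improved:
        improved, best_candidate = False, None
        for f in all_features:
            if f in selected:
                continue
            score = evaluate(selected + [f])
            if score > best_score:
                best_score, best_candidate = score, f
        if best_candidate is not None:
            selected.append(best_candidate)
            improved = True
    return selected, best_score
```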

Figure 3.7 illustrates the error rate learning curves of INIT with BBS feature selection, compared with the corresponding learning curves of INIT when classifying in two classes. We observe a slight improvement both in the train and test error rates, while keeping fewer features (125) than INIT. Figure 3.8 shows the error rate learning curves of INIT with FHC feature selection, compared with the corresponding learning curves of INIT when classifying in three classes. The train error rate slightly increases while the test error rate improves. Note that we use only 17 features to achieve these results, and the high variance problem becomes less intense. Perhaps a more sophisticated feature selection technique would further alleviate the problem. Figures 3.7 and 3.8 also show that further improvements are possible in textual entailment recognition with larger training datasets.


Figure 3.8: FHC error rate learning curves on the RTE-3 corpus for 3 classes.

3.4.6 Baselines

So far we have described the methods we used to deal with paraphrase and textual entailment recognition. But how well do our methods perform against a well-tuned baseline? The most obvious baseline would be a majority one, i.e., a classifier that would classify all pairs in the class (positive or negative) that is more frequent in the training data. In the datasets we used, the positive training pairs are at least as many as the negative ones (equally many in the RTE datasets, more in the MSR corpus). Hence, we use a baseline that classifies all the test pairs in the positive class. We call this baseline BASE1. Note that if, for example, 90% of the test pairs of a corpus are positive, BASE1 achieves 90% accuracy. Table 3.1 lists the results of BASE1 per corpus. BASE1 performs worse on the RTE datasets, compared to the MSR corpus, because the RTE datasets are balanced (equal positive and negative pairs), unlike the MSR corpus. Note that the high F-measure of BASE1 is largely due to its perfect recall, since its precision is low.

dataset | accuracy (%) (3 classes) | accuracy (%) (2 classes) | precision (%) (2 classes) | recall (%) (2 classes) | F-measure (%) (2 classes)
MSR | – | 66.49 | 66.49 | 100.00 | 79.87
RTE-1 | – | 50.00 | 50.00 | 100.00 | 66.67
RTE-2 | – | 50.00 | 50.00 | 100.00 | 66.67
RTE-3 | 50.00 | 51.25 | 51.25 | 100.00 | 67.77
RTE-4 | 50.00 | 50.00 | 50.00 | 100.00 | 66.67
RTE-5 | 50.00 | 50.00 | 50.00 | 100.00 | 66.67

Table 3.1: BASE1 results per corpus.

Another approach is to use as a baseline a single similarity measure, among those we use as features; the baseline classifies a test pair in the positive class if the corre- sponding similarity score exceeds a threshold t, tuned per similarity measure and corpus on the training part of the corpus. We tuned thresholds for each of the 130 string sim- ilarity features of INIT+WN and for the 3 additional dependency similarity features of

INIT+WN+DEP. The tuning of each feature was performed by searching for the threshold that achieves the best accuracy on the training data. More precisely, in a first step we searched the range [-1, 1] with a step of 0.1; recall that we normalize all similarities in [-1, 1]. In each step we calculated the accuracy achieved on the training data; the value that performed best was our initial threshold, t_init, for the second step. In the second step, we performed the same procedure in the range [t_init - 0.1, t_init + 0.1] with a step of 0.001. The search procedure just described applies only when the pairs are classified in two classes. In the datasets of RTE-3 to RTE-5, where the pairs can be additionally classified in three classes, we need to find two thresholds t1 and t2. Then a pair with similarity s between T and H is classified as follows:

If s ∈ [−1,t1) then H contradicts T.

If s ∈ [t1,t2) then T does not lead to any conclusion about H.

dataset | accuracy (%) (3 classes) | accuracy (%) (2 classes) | precision (%) (2 classes) | recall (%) (2 classes) | F-measure (%) (2 classes)
MSR | – | 74.32 | 76.39 | 88.84 | 82.14
RTE-1 | – | 55.75 | 54.32 | 72.25 | 62.02
RTE-2 | – | 58.75 | 59.56 | 54.50 | 56.92
RTE-3 | 61.63 | 62.75 | 61.72 | 71.95 | 66.44
RTE-4 | 55.40 | 57.30 | 56.70 | 61.80 | 59.14
RTE-5 | 58.83 | 62.33 | 61.42 | 66.33 | 63.78

Table 3.2: BASE2 results per corpus.

If s ∈ [t2,1] then ⟨T, H⟩ is an entailment pair.
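For the two-class case, the two-step threshold search can be sketched as follows; the helper name and the toy data are ours, not part of the thesis.

```python
import numpy as np

def tune_threshold(scores, labels):
    """BASE2-style tuning for one similarity feature: a coarse pass over
    [-1, 1] with step 0.1, then a fine pass with step 0.001 around the best
    coarse value; returns the threshold with the highest training accuracy."""
    def accuracy(t):
        return np.mean((scores >= t).astype(int) == labels)
    coarse = np.arange(-1.0, 1.0 + 1e-9, 0.1)
    t_init = max(coarse, key=accuracy)
    fine = np.arange(t_init - 0.1, t_init + 0.1 + 1e-9, 0.001)
    t_best = max(fine, key=accuracy)
    return t_best, accuracy(t_best)

# toy usage: normalized similarity scores and gold labels (1 = positive pair)
scores = np.array([0.8, 0.6, 0.1, -0.3, 0.7, -0.5])
labels = np.array([1, 1, 0, 0, 1, 0])
print(tune_threshold(scores, labels))
```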

Table 3.2 lists the results of the best baseline of this kind per corpus, i.e., the results of the single similarity measure, called BASE2, that achieved the best results after tuning its threshold(s). Note that BASE2 may use a different feature per corpus.

3.5 Comparing to other methods

In this section, we present previously published results on the corpora we used and we compare them to the results of our best methods, and those of the two baselines described previously. For each corpus we chose our best method to be the one that achieved the best accuracy in a 10-fold cross validation on the training part of the dataset. We also discuss how our methods relate to those that have achieved the best published results so far.

Table 3.3 lists all the previously published results of paraphrase recognition experiments on the MSR corpus we are aware of. It also presents the results of our best method and the scores of the two baselines we have already described and a rank based on the accuracy score. Our method ranks first among all the published results. An important observation is that BASE2 is a very competitive baseline outperforming several

method | rank | accuracy (%) | precision (%) | recall (%) | F-measure (%)
Malakasiotis (INIT+WN+DEP) | 1/9 | 76.17 | 79.25 | 86.92 | 82.91
Das & Smith (2009) | 2/9 | 76.06 | 79.57 | 86.05 | 82.68
Wan et al. (2006) | 3/9 | 75.63 | 77.00 | 90.00 | 83.00
Finch et al. (2005) | 4/9 | 74.96 | 76.58 | 89.80 | 82.66
BASE2 | 5/9 | 74.32 | 76.39 | 88.84 | 82.14
Qiu et al. (2006) | 6/9 | 72.00 | 72.50 | 93.40 | 81.60
Zhang & Patrick (2005) | 7/9 | 71.90 | 74.30 | 88.20 | 80.70
Corley & Mihalcea (2005) | 8/9 | 71.50 | 72.30 | 92.50 | 81.20
BASE1 | 9/9 | 66.49 | 66.49 | 100.00 | 79.87

Table 3.3: Paraphrase recognition results on the MSR corpus.

method | rank | accuracy (%) | precision (%) | recall (%) | F-measure (%)
Bayer et al. (2005) | 1/18 | 58.63 | 57.75 | 58.78 | 58.26
Glickman et al. (2005) | 2/18 | 58.60 | – | – | –
Herrera et al. (2005) | 3/18 | 56.60 | – | – | –
BASE2 | 9/18 | 55.75 | 54.32 | 72.25 | 62.02
Malakasiotis (BBS) | 10/18 | 53.13 | 53.06 | 54.25 | 53.65
BASE1 | 17/18 | 50.00 | 50.00 | 100.00 | 66.67

Table 3.4: Textual entailment recognition results on the RTE-1 corpus.

method | rank | accuracy (%) | precision (%) | recall (%) | F-measure (%)
Hickl et al. (2006) | 1/26 | 75.38 | – | – | –
Tatu et al. (2006) | 2/26 | 73.75 | – | – | 72.37
Zanzotto et al. (2006) | 3/26 | 63.88 | – | – | –
Malakasiotis (FBS) | 12/26 | 59.50 | 58.37 | 66.25 | 62.06
BASE2 | 15/26 | 58.75 | 59.56 | 54.50 | 56.92
BASE1 | 26/26 | 50.00 | 50.00 | 100.00 | 66.67

Table 3.5: Textual entailment recognition results on the RTE-2 corpus.

method | rank | accuracy (%) | precision (%) | recall (%) | F-measure (%)
Hickl & Bensley (2007) | 1/28 | 80.00 | – | – | –
Tatu & Moldovan (2007) | 2/28 | 72.25 | 67.41 | 88.78 | 76.63
Iftene & Balahur-Dobrescu (2007) | 3/28 | 69.13 | – | – | –
Malakasiotis (BBS) | 9/28 | 65.00 | 63.83 | 73.17 | 68.18
BASE2 | 12/28 | 62.75 | 61.72 | 71.95 | 66.44
BASE1 | 26/28 | 51.25 | 51.25 | 100.00 | 67.77

Table 3.6: Textual entailment recognition results (for two classes) on the RTE-3 corpus.

method | rank | accuracy (%)
Hickl & Bensley (2007) | 1/10 | 73.13
Tatu & Moldovan (2007) | 2/10 | 71.25
Malakasiotis (FHC) | 3/10 | 64.00
BASE2 | 4/10 | 61.63
Iftene & Balahur-Dobrescu (2007) | 5/10 | 59.13
BASE1 | 6/10 | 51.25

Table 3.7: Textual entailment recognition results (for three classes) on the RTE-3 corpus.

method | rank | accuracy (%) | precision (%) | recall (%) | F-measure (%)
Bensley & Hickl (2008) | 1/29 | 74.60 | – | – | –
Iftene (2008) | 2/29 | 72.10 | 65.45 | 93.20 | 76.90
Wang & Neumann (2008) | 3/29 | 70.60 | – | – | –
Siblini & Kosseim (2008) | 4/29 | 68.80 | – | – | –
Malakasiotis (BBS) | 10/29 | 59.70 | 58.26 | 68.40 | 62.93
BASE2 | 16/29 | 57.30 | 56.70 | 61.80 | 59.14
BASE1 | 29/29 | 50.00 | 50.00 | 100.00 | 66.67

Table 3.8: Textual entailment recognition results (for two classes) on the RTE-4 corpus. CHAPTER 3. PARAPHRASE AND TE RECOGNITION 81

method | rank | accuracy (%)
Iftene (2008) | 1/16 | 72.00
Siblini & Kosseim (2008) | 2/16 | 68.80
Wang & Neumann (2008) | 3/16 | 68.50
Malakasiotis (INIT+WN+DEP) | 5/16 | 56.00
BASE2 | 6/16 | 55.40
BASE1 | 10/16 | 50.00

Table 3.9: Textual entailment recognition results (for three classes) on the RTE-4 corpus.

method | rank | accuracy (%) | precision (%) | recall (%) | F-measure (%)
Iftene and Moruz (2009) | 1/22 | 73.50 | 86.67 | 68.60 | 76.58
Li et al. (2009) | 2/22 | 67.00 | – | – | –
Wang et al. (2009b) | 3/22 | 66.83 | – | – | –
Mehdad et al. (2009) | 4/22 | 66.17 | – | – | –
Malakasiotis (BHC) | 6/22 | 64.00 | 63.55 | 65.67 | 64.59
BASE2 | 10/22 | 62.33 | 61.42 | 66.33 | 63.78
BASE1 | 19/22 | 50.00 | 50.00 | 100.00 | 66.67

Table 3.10: Textual entailment recognition results (for two classes) on the RTE-5 corpus.

method | rank | accuracy (%)
Iftene and Moruz (2009) | 1/12 | 72.00
Wang et al. (2009a) | 2/12 | 68.50
Ferrández et al. (2009) | 3/12 | 68.80
Malakasiotis (INIT) | 4/12 | 59.50
BASE2 | 5/12 | 58.83
BASE1 | 9/12 | 50.00

Table 3.11: Textual entailment recognition results (for three classes) on the RTE-5 corpus.

published methods. Tables 3.4 to 3.11 list the best accuracy results (for two classes and three classes)

of the participants of the five RTE challenges, along with the results of our best method for each challenge. Moreover, they present the scores of the two baselines described previously and a rank (relative to the results of all participants), again based on accuracy; precision, recall, and F-measure scores are also shown, when available. Our methods do not rank as high on the RTE datasets as on the MSR corpus. However, they perform better than the median participant in most cases, especially in the three-class scenario. We also note that several of the participants our methods outperform are much more complicated. BASE2 is again a very competitive baseline.

3.6 Conclusions and contribution of this chapter

In this chapter we presented three methods (INIT, INIT+WN, INIT+WN+DEP) for para- phrase and textual entailment recognition. We experimented with the MSR corpus and

five RTE datasets for paraphrase and textual entailment recognition, respectively. INIT combines similarity measures that operate on several abstractions of the input language expressions. It also uses novel partial similarity features that improve its performance.

INIT+WN also exploits WordNet to capture synonymy relations, while INIT+WN+DEP adds features that operate on the syntactic level of the language expressions. Overall,

INIT+WN+DEP achieves the best accuracy results reported so far on the MSR corpus.

Despite its simplicity, INIT is still very competitive; hence, it can be used in languages for which reliable large-scale thesauri and parsers are difficult to obtain, Greek being an example.

On the RTE datasets, our methods achieve worse results compared to the results on the MSR corpus. INIT seemed to suffer from high variance on these datasets, and feature selection was a possible solution we considered. We addressed feature selection as a search problem, and used forward and backward hill climbing and beam search. As expected, feature selection led to similar or better results with fewer features, though the improvements were small. Finally, our methods performed reasonably well against several other more complicated RTE methods and, therefore, can be considered as a competitive baseline for future work.

Chapter 4

Paraphrase Generation1

4.1 Introduction

Paraphrase generation aims to produce paraphrases of a given input language expression and has received less attention than the other two categories of paraphrasing (recognition and extraction). Nonetheless, it is useful in several language processing tasks. In question answering, for example, paraphrase generators can be used to paraphrase the user’s queries (Duboue and Chu-Carroll, 2006; Riezler and Liu, 2010); and in machine translation, paraphrase generation can help improve the translations (Callison-Burch et al., 2006a; Marton et al., 2009; Mirkin et al., 2009b; Madnani et al., 2007), or it can be used when evaluating machine translation systems (Lepage and Denoual, 2005; Zhou et al., 2006a; Kauchak and Barzilay, 2006; Padó et al., 2009). Moreover, sentence compression (Knight and Marcu, 2002; McDonald, 2006; Cohn and Lapata, 2008; Clarke and Lapata, 2008a; Cohn and Lapata, 2009; Galanis and Androutsopoulos, 2010) can be seen as a special case of sentence paraphrasing, as suggested by Zhao et al. (2009a), with the additional constraint that the resulting sentence must be shorter than the original one and still grammatical. Most sentence compression work, however, allows less

1A summary of this chapter has been submitted for publication and is available upon request.

important information of the original sentence to be discarded. Hence, the resulting sentence is entailed by the original one, but it is not necessarily a paraphrase of it. By contrast, in this chapter we focus on paraphrase generation.

There are two main approaches to paraphrase generation. The first one treats paraphrase generation as a standard machine translation problem, with the peculiarity that the target language is the same as the source one. To bypass the lack of large monolingual parallel corpora, which are needed to train statistical machine translation (SMT) systems for paraphrasing, monolingual clusters of news articles referring to the same event (Quirk et al., 2004) or other similar monolingual comparable corpora can be used, though sentence alignment methods for parallel corpora may perform poorly on comparable corpora (Nelken and Shieber, 2006); alternatively, monolingual paraphrasing rules obtained via paraphrase extraction from multilingual parallel corpora can be used as phrase tables in SMT systems (Zhao et al., 2008; Zhao et al., 2009a); in both cases, paraphrases can then be generated by invoking the SMT system’s decoder (Koehn, 2009). A second paraphrase generation approach is to treat existing machine translation engines as black boxes, and translate each input sentence to a pivot language and then back to the original language (Duboue and Chu-Carroll, 2006). An extension of this approach uses multiple translation engines and pivot languages (Zhao et al., 2010).

The paraphrase generation approach proposed in this chapter uses an existing large collection of monolingual paraphrasing rules extracted from multilingual parallel corpora (Zhao et al., 2009b). Each rule is accompanied by one or more scores, intended to indicate the rule’s overall quality, without considering particular contexts where the rule can be applied. Instead of using the rules as a phrase table and invoking an SMT system’s decoder, we follow a generate-and-rank approach, which is increasingly common in several language processing tasks; see, for example, Collins and Koo (2005). Given an input sentence, we use the paraphrasing rules to generate a large number of candidate paraphrases. The candidates are then represented as feature vectors, and a ranker (or classifier) selects the best ones; we experimented with a Maximum Entropy classifier and a Support Vector Regression (SVR) ranker. The vector of each candidate paraphrase includes features indicating the overall quality of the rules that produced the candidate, the extent to which the rules preserve grammaticality and meaning in the particular context of the input sentence, and the degree to which the candidate’s surface form differs from that of the input; we call the latter factor diversity. The intuition is that a good paraphrase is grammatical and preserves the meaning of the original sentence, while also being as different as possible from the original sentence.

Experimental results indicate that including in the ranking (or classification) component features from a paraphrase recognizer leads to improved results. We also propose a new methodology to evaluate the ranking components of generate-and-rank paraphrase generators, which evaluates them across different combinations of weights for grammaticality, meaning preservation, and diversity. We have also created a paraphrasing dataset for evaluations of this kind.
Further experiments indicate that when paraphrasing rules apply to the input sentences, our overall method is competitive with a state of the art paraphrase generator that uses multiple translation engines and pivot languages (Zhao et al., 2010), instead of paraphrasing rules.

The remainder of this chapter is structured as follows. Section 4.2 explains how our method generates paraphrase candidates; section 4.3 introduces the dataset we have created, which is also used in subsequent sections; section 4.4 provides a quick introduction to SVMs and SVRs; section 4.5 discusses how candidate paraphrases are ranked; section 4.6 compares our overall system to a state of the art paraphrase generator; section 4.7 investigates if our method could benefit from more training data; and section 4.8 concludes and summarizes the research contribution of this chapter.

4.2 Generating candidate paraphrases

We use the approximately one million English paraphrasing rules of Zhao et al. (2009b). Roughly speaking, the rules were extracted from a parallel English-Chinese corpus,

based on the assumption that two English phrases e1 and e2 that are often aligned to the same Chinese phrase c are likely to be paraphrases and, hence, they can be treated as

a paraphrasing rule e1 ↔ e2.2 Zhao et al.’s method actually operates on slotted English phrases, obtained from parse trees, where slots correspond to part of speech (POS) tags.

Hence, rules like the following three may be obtained, where NNi indicates a noun slot and NNPi a proper name slot.

(4.1) a lot of NN1 ↔ plenty of NN1

(4.2) NNP1 area ↔ NNP1 region

(4.3) NNP1 wrote NNP2 ↔ NNP2 was written by NNP1

In the basic form of their method, called Model 1, Zhao et al. (2009b) use a log-linear ranker to assign scores to candidate English paraphrase pairs ⟨e1, e2⟩; the ranker uses the alignment probabilities P(c|e1) and P(e2|c) as features, along with features that assess the quality of the alignments ⟨c, e1⟩ and ⟨e2, c⟩. In an extension of their method, Model 2, Zhao et al. consider two English phrases e1 and e2 as paraphrases, if they are often aligned to two Chinese phrases c1 and c2, which are themselves paraphrases according to Model 1 (with English used as the pivot language). Again, a log-linear ranker assigns a score to each ⟨e1, e2⟩ pair, now with P(c1|e1), P(c2|c1), and P(e2|c1) as features, along with similar features for alignment quality. In a further extension, Model 3, all the candidate phrase pairs ⟨e1, e2⟩ are collectively treated as a monolingual parallel corpus. The phrases of the corpus are aligned, as when aligning a bilingual

2This pivot-based paraphrase extraction approach was first proposed by Bannard and Callison-Burch (2005). It underlies several other paraphrase extraction methods (Riezler et al., 2007; Callison-Burch, 2008; Kok and Brockett, 2010). See also the discussion in section 2.4.

parallel corpus, and additional features, based on the alignment, are added to the log-

linear ranker, which again assigns a score to each ⟨e1, e2⟩.

The resulting paraphrase rules e1 ↔ e2 typically contain short phrases (up to four or five words excluding slots) on each side; hence, they can be used to rewrite only parts of longer sentences. Given an input (source) sentence S, we generate candidate paraphrases by applying rules whose left or right hand side matches any part of S. For example, rule (4.1) matches the source sentence (4.4); hence, (4.4) can be rewritten as the candidate paraphrase (4.5).3

(4.4) S: He had a lot of [NN1 admiration] for his job.

(4.5) C: He had plenty of [NN1 admiration] for his job.

Several rules may apply to S; for example, they may rewrite different parts of S, or they may replace the same parts of S by different phrases. We allow all possible com- binations of applicable rules to apply to S, excluding combinations that include rules rewriting overlapping parts of S. A possible extension, which we have not explored, would be to recursively apply the same process to the resulting Cs. To avoid generating too many candidates (C), we only use the 20 rules (that apply to S) with the highest scores. Zhao et al. actually associate each rule with three scores. The first one, here-

after called r1, is the Model 1 score, and the other two, r2 and r3, are the forward and backward alignment probabilities of Model 3; see Zhao et al. (2009b) for details. We

use the average of the three scores, hereafter r4, when generating candidates. Unfortunately, Zhao et al.’s scores reflect the overall quality of each rule, without considering the context of the particular S where the rule may apply. Szpektor et al. (2008) point out that, for example, a rule like “X acquire Y” ↔ “X buy Y” may work well in many contexts, but not in “Children acquire language quickly”. Similarly, “X charged Y with” ↔ “X accused Y of” should not be applied to sentences about charging

3As in previous chapters, we use Stanford’s POS tagger and Maximum Entropy classifier; see http://nlp.stanford.edu/.

batteries. Szpektor et al. propose, roughly speaking, to associate each rule with a model of the contexts where the rule is applicable, as well as models of the expressions that typically fill its slots, in order to be able to assess the applicability of each rule in specific contexts. The rules that we use do not have associated models of this kind, but we follow Szpektor et al.’s idea of assessing the applicability of each rule in each particular context, when ranking candidates, as discussed below.

4.3 A dataset of sources and candidates

Our generate and rank method relies on existing large collections of paraphrasing rules to generate candidates. Our main contribution is in the ranking of the candidates. To be able to evaluate the performance of different rankers in the task we are concerned with, we first constructed an evaluation dataset that contains pairs ⟨S, C⟩ of source (input) sentences and candidate paraphrases, and we asked human judges to assess the degree to which the C of each pair was a good paraphrase of S.

We randomly selected 75 source (S) sentences from the AQUAINT corpus, such that at least one of the paraphrasing rules applied to each S.4 For each S, we generated candidate Cs using Zhao et al.’s rules, as discussed in section 4.2. This led to 1,935 ⟨S, C⟩ pairs, approximately 26 pairs for each S. The pairs were given to 13 judges, with each judge evaluating approximately 148 (different) pairs. We measure the diversity of each C from the corresponding S as their edit distance (Levenshtein, on tokens). In principle, grammaticality and meaning preservation could also be measured automatically, by using a language model (or parser) and a paraphrase recognizer, respectively. We used human judges, however, for these two factors in the evaluation dataset to avoid introducing errors in the data.
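As an aside, the token-level edit distance used as the diversity measure can be computed with a standard dynamic-programming implementation of Levenshtein distance; the short Python sketch below (ours, shown only for illustration) operates on whitespace-separated tokens.

```python
def token_edit_distance(source, candidate):
    """Levenshtein distance computed over tokens rather than characters;
    used here as an automatic measure of the diversity between S and C."""
    s, c = source.split(), candidate.split()
    # dp[i][j] = distance between the first i tokens of s and first j of c
    dp = [[0] * (len(c) + 1) for _ in range(len(s) + 1)]
    for i in range(len(s) + 1):
        dp[i][0] = i
    for j in range(len(c) + 1):
        dp[0][j] = j
    for i in range(1, len(s) + 1):
        for j in range(1, len(c) + 1):
            cost = 0 if s[i - 1] == c[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(s)][len(c)]

print(token_edit_distance("He had a lot of admiration for his job .",
                          "He had plenty of admiration for his job ."))  # 2
```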

4The corpus is available from the LDC (LDC2002T31).


Figure 4.1: Distribution of overall paraphrase quality scores in the evaluation dataset.

The judges were asked to provide grammaticality, meaning preservation, and overall paraphrase quality scores for each ⟨S, C⟩ pair, each score on a 1–4 scale (1 for totally unacceptable, 4 for perfect); guidelines (table 4.1) and examples were also provided. Figure 4.1 shows the distribution of the overall quality scores in the 1,935 ⟨S, C⟩ pairs of the evaluation dataset; the distributions of the grammaticality and meaning preservation scores are similar. Although we used the 20 applicable paraphrasing rules with the highest scores to generate the ⟨S, C⟩ pairs, less than half of the resulting candidate paraphrases (C) were considered good, and only approximately 20% perfect. Hence, the role of the ranking component is crucial. We also measured inter-annotator agreement by constructing, in the same way, 100 additional ⟨S, C⟩ pairs (other than the 1,935) and asking 3 of the 13 judges to evaluate all of them. We measured the mean absolute error (MAE), i.e., the mean absolute difference

Grammaticality (on a scale from 1 to 4) answers the question: “Provided that the source sentence is grammatically well-formed, how well do the changes made preserve its grammaticality?”.

1-rubbish: All of the changes made in the source sentence spoil its grammaticality.
2-poor: Most of the changes made in the source sentence spoil its grammaticality.
3-good: Most of the changes made in the source sentence preserve its grammaticality.
4-perfect: All of the changes made in the source sentence preserve its grammaticality.

Meaning preservation (on a scale from 1 to 4) answers the question: “How well do the changes made in the source sentence preserve its information?”.

1-rubbish: None of the changes made preserves the information of the source sentence.
2-poor: Most of the changes made do not preserve the information of the source sentence.
3-good: Most of the changes made preserve the information of the source sentence.
4-perfect: All of the changes made preserve the information of the source sentence.

Paraphrase quality (on a scale from 1 to 4). Based on your previous scores for grammaticality and meaning preservation, answer the question: “How acceptable is the generated sentence as a paraphrase of the source one, to the extent that it could replace it in a text?”.

1-entirely unacceptable: The generated sentence has many problems in grammaticality and meaning preservation, to the extent that it cannot replace the source sentence in a text.
2-not acceptable: The generated sentence has some problems and it most probably cannot replace the source sentence in a text.
3-acceptable: The generated sentence has some problems, but it could replace the source sentence in a text.
4-completely acceptable: The generated sentence is a perfect paraphrase and can replace the source sentence in a text.

Table 4.1: Evaluation guidelines.

             B: good   B: bad
A: good         30       10
A: bad          20       40

Table 4.2: Example calculation of the κ statistic.

in the decisions of the judges (averaged over all pairs of judges) and the mean (over all pairs of judges) κ statistic (Carletta, 1996). κ is calculated by the following formula:

κ = (P(a) − P(e)) / (1 − P(e))

where P(a) is the observed agreement rate of two judges (number of cases they agreed on, divided by the number of cases they considered), and P(e) is the estimated probability of the two judges agreeing by chance. For example, if two judges A, B examine 100 ⟨S, C⟩ pairs and they rate them as in table 4.2, then P(a) = (30 + 40)/100 = 0.7. Judge A rates a pair as good with probability (30 + 10)/100 = 0.4, whereas judge B with probability (30 + 20)/100 = 0.5. The probability that both of them would rate a pair as good by chance is 0.4 · 0.5 = 0.2, and the probability of both rating it as bad by chance is 0.6 · 0.5 = 0.3. Hence, the probability of agreeing by chance is P(e) = 0.2 + 0.3 = 0.5. In our case, the average κ statistic (averaged over all pairs of the 3 judges) of the overall scores was 0.64, which is in the range often taken to indicate substantial agreement (0.61–0.80). It is also close to 0.67, which is sometimes taken to be a cutoff for substantial agreement in computational linguistics. Agreement was higher for grammaticality (κ = 0.81), and lower (κ = 0.59) for meaning preservation. Perhaps more interestingly, several judges commented they had trouble deciding to what extent the overall quality score should reflect grammaticality or meaning preservation; the guidelines instructed them to base their overall quality scores on their grammaticality and meaning preservation scores.
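The worked example above can be reproduced with a few lines of code; the following Python sketch (provided only for illustration) computes κ from the counts of table 4.2.

```python
def cohen_kappa(table):
    """Cohen's kappa for two judges from a 2x2 contingency table given as
    {(a_label, b_label): count}; reproduces the worked example above."""
    total = sum(table.values())
    p_a = (table[("good", "good")] + table[("bad", "bad")]) / total
    a_good = (table[("good", "good")] + table[("good", "bad")]) / total
    b_good = (table[("good", "good")] + table[("bad", "good")]) / total
    p_e = a_good * b_good + (1 - a_good) * (1 - b_good)
    return (p_a - p_e) / (1 - p_e)

counts = {("good", "good"): 30, ("good", "bad"): 10,
          ("bad", "good"): 20, ("bad", "bad"): 40}
print(cohen_kappa(counts))  # (0.7 - 0.5) / (1 - 0.5) = 0.4
```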

                   mean abs. error   κ statistic
grammaticality          0.20            0.81
meaning preserv.        0.26            0.59
overall quality         0.22            0.64

Table 4.3: Inter-annotator agreement.

Some judges also wondered if it was fair to consider as perfect paraphrases that differed in only one or two words from the source sentences. These comments led us to ignore the overall quality scores in some of the experiments, and also consider (automatically computed) diversity, as discussed further below. In the same way, 1,500 more ⟨S, C⟩ pairs (other than the 1,935 and the 100, not involving previously seen Ss) were constructed, and they were evaluated. The 1,500 pairs were used as a training dataset in experiments discussed below. Both the 1,500 training and the 1,935 evaluation (test) pairs constitute our publicly available dataset.

4.4 Quick introduction to SVMs and SVRs

To rank the candidate paraphrases, we employ a Support Vector Regression (SVR) model.5 An SVR is very similar to a Support Vector Machine (Vapnik, 1998; Cristianini and Shawe-Taylor, 2000; Joachims, 2002), except that it is trained on examples of the form (x, y), where x ∈ Rⁿ and y ∈ R, and it learns a ranking function f : Rⁿ → R that is intended to return y values as close as possible to the correct ones, given feature vectors x. Before discussing SVRs further, let us first briefly discuss SVMs.

5We use the SVR implementation of LIBSVM, available from http://www.csie.ntu.edu.tw/~cjlin/libsvm/, with an RBF kernel; features are normalized in [−1, 1].

4.4.1 Binary classification with SVMs

In a binary classification problem, an SVM tries to find an optimal hyperplane that separates the training instances of the two categories by solving the following optimization problem:

min_{~w, b, ~ξ}  (1/2) ‖~w‖² + C ∑_{i=1}^{n} ξi     (4.6)

subject to:

yi(~w ·~φ(~xi) + b) ≥ 1 − ξi,

ξi ≥ 0,

i = 1,...,n

where ~xi is the feature vector of the i-th training example, ~w are the weights of the features, ξi is the error on the i-th training example, C is a constant defining the overall importance of the errors, yi is the true category of the i-th training example (+1 or −1), and ~φ(~x) is a transformation function that maps feature vectors to a vector space of a higher dimension. In effect, the SVM learns a separating hyperplane in the new vector space. The equation of the hyperplane is:

~w ·~φ(~xi) + b = 0 (4.7)

The constraints of the optimization problem above require all the training instances to fall on the correct side of the hyperplane and outside the zone (called “margin”) between the two following hyperplanes, which are parallel to the separating hyperplane:

~w ·~φ(~xi) + b = +1 (4.8)

~w ·~φ(~xi) + b = −1 (4.9)

When this is not achieved for a training instance ~xi, ξi in effect measures how far ~xi is from where it should be. The distance between the two hyperplanes that define the margin zone is 1/‖~w‖ + 1/‖~w‖ = 2/‖~w‖. The objective function (4.6) minimizes ‖~w‖, thus maximizing the margin; it also minimizes the ξi’s (the misplacements of the training instances). Solving the minimization problem, via generalized Lagrange multipliers and by solving the dual problem, we get weights of the form:

~w = ∑_{i=1}^{n} ai yi ~φ(~xi)     (4.10)

where ai ≠ 0 only for the support vectors, i.e., the training instances inside the margin or on the wrong side of the separating hyperplane. By (4.10) and (4.7) we obtain the following equation of the separating hyperplane:

∑_{i=1}^{n} ai yi ~φ(~xi) · ~φ(~x) + b = 0     (4.11)

The transformation function ~φ is only used in inner products. For several ~φ’s, we are able to calculate the inner products without first calculating the vectors ~φ(~x) in the

new vector space. A kernel K(~xi, ~xj) calculates the inner product ~φ(~xi) · ~φ(~xj) in the

new (typically higher-dimension) vector space without first computing ~φ(~xi) and ~φ(~xj). Consequently, the equation of the separating hyperplane becomes:

∑_{i=1}^{n} ai yi K(~xi, ~x) + b = 0     (4.12)

and the corresponding decision function for unseen instances is:

d(~x) = sign( ∑_{i=1}^{n} ai yi K(~xi, ~x) + b )     (4.13)

In effect, given an unseen instance, the decision function examines whether the instance falls in the positive or negative subspace (intuitively, “above” or “below” the separating hyperplane).

4.4.2 Ranking with SVRs

An SVR model solves an optimization problem similar to the one solved by SVMs:

min_{~w, b, ~ξ, ~ξ*}  (1/2) ‖~w‖² + C ∑_{i=1}^{n} ξi + C ∑_{i=1}^{n} ξi*     (4.14)

subject to:

~w · ~φ(~xi) + b − yi ≤ ε + ξi,
yi − (~w · ~φ(~xi) + b) ≤ ε + ξi*,

ξi ≥ 0,

ξi* ≥ 0,

i = 1,...,n

Intuitively, for each training instance ~xi, we want the SVR’s prediction ~w · ~φ(~xi) + b to be not farther than ε > 0 from the correct value yi. When this is not achieved, ξi and ξi* in effect measure how far the prediction is from yi; unlike SVMs, we use different slack variables ξi and ξi* for predictions above or below the correct value yi. Otherwise, the meaning of all the symbols is the same as in SVMs. The objective function (4.14) minimizes the errors ξi, ξi*, while also minimizing ‖~w‖ to avoid overfitting; intuitively, minimizing ‖~w‖ rewards solutions that rely on fewer features, i.e., solutions with many

wi values close or equal to zero.6 After solving the minimization problem, we get the following prediction (ranking) function for unseen instances:

f(~x) = ∑_{i=1}^{n} (ai − ai*) K(~xi, ~x) + b     (4.15)

where ai, ai* ≠ 0 only for the support vectors, which are now the training instances outside the ε-zone.
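For concreteness, the sketch below shows how such an ε-SVR with an RBF kernel might be trained and used to score candidates. It uses scikit-learn's SVR (a wrapper around LIBSVM) only as a stand-in for the LIBSVM setup of footnote 5; the feature matrices, target scores, and hyperparameter values are placeholders, not the ones of our experiments.

```python
# Minimal SVR ranking sketch, assuming features scaled to [-1, 1].
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

X_train = np.random.rand(200, 151)       # placeholder feature vectors
y_train = np.random.uniform(1, 4, 200)   # placeholder target scores

scaler = MinMaxScaler(feature_range=(-1, 1)).fit(X_train)
ranker = SVR(kernel="rbf", C=1.0, epsilon=0.1)
ranker.fit(scaler.transform(X_train), y_train)

X_test = np.random.rand(10, 151)
scores = ranker.predict(scaler.transform(X_test))
ranked = np.argsort(-scores)             # candidates ranked by predicted score
```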

4.5 Ranking candidate paraphrases

To allow the SVR to assess the degree to which a candidate C is grammatical, or at least as grammatical as the source S, we include in the feature vectors ~x the language model scores of S, C, and the difference between the two scores. To obtain the language model scores, we trained a 3-gram language model on approximately 6.5 million sentences of the AQUAINT corpus.7 To allow the SVR ranker to consider the overall quality of the rules that generated C from S, we also include as features the highest, lowest, and average r1, r2, r3, and r4 scores (section 4.2) of the rules that generated C, 12 features in total. The features discussed so far are similar to those employed by Zhao et al. (2009a) in the only comparable paraphrase generation method we are aware of that uses paraphrasing rules. That method, hereafter called ZHAO-RUL, uses the language model score of

C and scores presumably similar to r1, r2, and r3 in a log-linear model.8 The log-linear

model of ZHAO-RUL is used by a decoder to identify the transformations (applications

6A similar regularization term is often also included in the objective function of ME classifiers, discussed in section 3.4.1.
7We used SRILM; see http://www-speech.sri.com/~projects/srilm/.
8Application-specific features are also included, which can be used, for example, to favor paraphrases that are shorter than the input in sentence compression (Knight and Marcu, 2002; Clarke and Lapata, 2008b). Similar features could also be added to application-specific versions of our method.

of rules) that produce the (hopefully) best paraphrase. By contrast, we first generate a large number of candidates using the paraphrasing rules, and we then rank them using

an SVR. Unfortunately, we did not have access to an implementation of ZHAO-RUL to compare against, but in a following section we compare against another paraphrase

generation method proposed by Zhao et al. (2010), hereafter called ZHAO-ENG, which uses multiple machine translation engines and pivot languages, instead of paraphrasing

rules, and which Zhao et al. found to outperform ZHAO-RUL. Apart from information from a language model and the scores of the rules, we also wanted to include in the feature vectors information that would allow the ranker to assess the degree to which each C preserves the meaning of the corresponding S, and also the degree to which C is different from S. To achieve this, we optionally include in

the vectors of the SVR ranker the 136 features of our paraphrase recognizer described in chapter 3. Recall that the recognizer’s features are the scores of a wide range of similarity measures, operating at the surface level (string similarity measures), lexical semantic level (e.g., synonyms), and syntax level (parse tree similarity). Interestingly, the recognizer’s feature set includes both scores that can be seen as measuring the diversity between two sentences (e.g., edit distance), but also features (e.g., involving synonyms or parse tree similarity) that are intended to measure the degree to which two sentences convey the same meaning despite differences in their surface forms. Hence, our hope was that the recognizer’s features would allow the ranker to assess both the diversity between S and

C and the degree to which C preserves the meaning of S. We call SVR-REC the version

of our method that uses the SVR ranker with all the 151 features discussed above, and

SVR-BASE the version that uses the SVR ranker without the recognizer’s 136 features. As already noted, a good paraphrase is grammatical, it preserves the meaning of the source sentence, and it is as different as possible from the source. When the judges that were used to construct the dataset of section 4.3 provided overall quality scores, they CHAPTER 4. PARAPHRASE GENERATION 99

were instructed to consider only grammaticality and meaning preservation; hence, the overall quality scores of the dataset do not take into consideration diversity. As already mentioned, some of the judges also felt that it was not always clear to what extent the overall score should be based mostly on meaning preservation or grammaticality. We also suspect that the weights of the three factors (grammaticality, meaning preservation, diversity) should in practice be application-dependent. For example, when paraphrasing user queries to a search engine that turns them into bags of words, diversity and meaning preservation may be much more important than grammaticality; by contrast, when paraphrasing the sentences of a generated text to avoid repeating the same expressions, grammaticality is very important.

To address the above concerns and to evaluate SVR-REC against SVR-BASE, we performed experiments where the overall quality scores were ignored. Instead, the

correct score y(~xi) of each training or test instance ~xi (i.e., of each feature vector of an ⟨S, C⟩ pair) was taken to be a linear combination of grammaticality (g(~xi), provided by the judges), meaning preservation (m(~xi), also provided by the judges), and diversity

(d(~xi), automatically measured as edit distance on tokens), as shown in equation

(4.16), where λ3 = 1 − λ1 − λ2.

y(~xi) = λ1 · g(~xi) + λ2 · m(~xi) + λ3 · d(~xi) (4.16)
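In code, computing the target scores of equation (4.16) is straightforward; the sketch below is only an illustration, with made-up weights and scores.

```python
def target_score(g, m, d, lam1, lam2):
    """Combined target score of equation (4.16): grammaticality g and meaning
    preservation m come from the judges (1-4 scale), diversity d is the
    automatically computed edit distance; lambda3 = 1 - lambda1 - lambda2."""
    lam3 = 1.0 - lam1 - lam2
    return lam1 * g + lam2 * m + lam3 * d

# e.g., all weight on grammaticality and meaning preservation, none on diversity
print(target_score(g=4, m=3, d=0.2, lam1=0.5, lam2=0.5))  # 3.5
```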

We trained and evaluated SVR-REC and SVR-BASE on the training and evaluation parts, respectively, of the dataset of section 4.3, for different values of λ1 and λ2. We used the squared correlation coefficient ρ² of equation (4.17) as the evaluation measure; n is the number of test pairs, f(~xi) is the score returned by the SVR for the i-th pair, and y(~xi) is the correct score. The ρ² measure shows how well the scores returned by the SVR are correlated with the desired scores; the higher the ρ², the higher the agreement.


Figure 4.2: Comparison of SVR-REC and SVR-BASE.

ρ² = ( n ∑_{i=1}^{n} f(~xi) y(~xi) − ∑_{i=1}^{n} f(~xi) ∑_{i=1}^{n} y(~xi) )² / [ ( n ∑_{i=1}^{n} f(~xi)² − (∑_{i=1}^{n} f(~xi))² ) · ( n ∑_{i=1}^{n} y(~xi)² − (∑_{i=1}^{n} y(~xi))² ) ]     (4.17)
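Since (4.17) is simply the squared Pearson correlation between the predicted and the desired scores, it can be computed, for illustration, as follows (the example scores below are made up):

```python
import numpy as np

def squared_correlation(predicted, gold):
    """rho^2 of equation (4.17): the squared Pearson correlation between the
    scores returned by the SVR and the desired (gold) scores."""
    return float(np.corrcoef(predicted, gold)[0, 1] ** 2)

print(squared_correlation([1.2, 2.8, 3.1, 3.9], [1, 3, 3, 4]))
```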

Figure 4.2 shows the results of these experiments; recall that λ3 = 1 − λ1 − λ2. The diagram of Figure 4.2 can be thought of as a radar screen, where each line starting from the center represents a different experimental setting, i.e., a different combination of λ1 and λ2, and the distance of a method’s curve from the center is the method’s ρ² for that setting. The farther a point is from the center, the higher ρ² is; hence, methods whose curves are closer to the perimeter of the radar screen are better. Clearly, SVR-REC

(which includes the recognizer’s features) outperforms SVR-BASE (which relies only on the language model and the scores of the rules).

The two peaks of SVR-REC’s curve are cases where λ3 is very high (1 or 0.8), i.e., cases where y(~xi) is dominated by the diversity score; in these cases, SVR-REC is at a clear advantage, since it includes features for surface string similarity, which in effect measure diversity, unlike SVR-BASE. Even when λ1 is very high (1 or 0.8), i.e., when all or most of the weight is placed on grammaticality, SVR-REC outperforms SVR-BASE, which indicates that the extra recognition features of SVR-REC also contribute towards assessing grammaticality; in contrast, SVR-BASE relies exclusively on the language model for grammaticality. Unfortunately, when λ2 is very high (1 or 0.8), i.e., when all or most of the weight is placed on meaning preservation, there is no or very small difference between SVR-REC and SVR-BASE, suggesting that meaning preservation is not captured well by the extra recognition features of SVR-REC.

We believe that the dataset of section 4.3 and the evaluation methodology summarized by Figure 4.2 may also prove useful to other researchers, who may wish to evaluate other ranking components of generate and rank paraphrase generation methods against ours, for example with different ranking algorithms or features. Similar datasets can also be created using different collections of paraphrasing rules.9 The same evaluation

9See the pointers to other collections of rules provided by Androutsopoulos and Malakasiotis (2010).

methodology for ranking components could then be used with those datasets.

4.6 Comparing to the state of the art

A natural question is how well our overall generate-and-rank method compares against a state of the art paraphrase generator. As already mentioned, Zhao et al. (2010) recently

presented a method (we call it ZHAO-ENG) that outperforms their previous method

(Zhao et al., 2009a) that used paraphrasing rules (we call that method ZHAO-RUL).

Given an input sentence S, ZHAO-ENG produces candidate paraphrases by translating S to 6 pivot languages via 3 different commercial machine translation engines (treated as black boxes) and then back to the original language, again via 3 machine translation

engines (54 combinations in total). Roughly speaking, ZHAO-ENG then selects the paraphrases that are closest to all the other candidate paraphrases, with similarity measured by BLEU (Papineni et al., 2002), a measure used in machine translation to assess how similar a machine-generated translation is to a human-written one.10
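To make the selection idea concrete, the following sketch (our illustration, not Zhao et al.'s implementation) picks the candidate with the highest average sentence-level BLEU against all other candidates, using NLTK's BLEU implementation with smoothing; the example candidates are invented.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def select_central_candidate(candidates):
    """Return the candidate closest, on average, to all other candidates,
    with closeness measured by smoothed sentence-level BLEU."""
    smooth = SmoothingFunction().method1
    best, best_score = None, -1.0
    for i, cand in enumerate(candidates):
        others = [c.split() for j, c in enumerate(candidates) if j != i]
        score = sum(sentence_bleu([ref], cand.split(), smoothing_function=smooth)
                    for ref in others) / len(others)
        if score > best_score:
            best, best_score = cand, score
    return best

candidates = ["He had plenty of admiration for his job .",
              "He had a great deal of admiration for his job .",
              "He admired very much his job ."]
print(select_central_candidate(candidates))
```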

An obvious advantage of ZHAO-ENG is that it exploits the vast resources of existing commercial machine translation engines, which also allows it to always produce large

numbers of paraphrases. By contrast, our SVR-REC only manages to produce paraphrases when at least one paraphrasing rule applies to the source S; and although it uses approximately one million paraphrasing rules, it only manages to paraphrase approximately 60% of the sentences of the New York Times part of AQUAINT. Hence, in terms of ability to always paraphrase the input and produce many paraphrases, ZHAO-ENG is clearly better than SVR-REC. A further interesting question, however, is how good the best paraphrases generated by SVR-REC are, compared to the best paraphrases of

ZHAO-ENG, when both methods manage to paraphrase the input, i.e., when at least one

10We focus on the version of ZHAO-ENG known as “selection-based”, since Zhao et al. (2010) reported it performs overall better than an alternative decoding-based version.

paraphrasing rule applies to S.

To answer the latter question, we re-implemented ZHAO-ENG, using the same machine translation engines and languages. We also trained SVR-REC on the training part of the dataset of section 4.3. We then selected 300 random source sentences S from

AQUAINT that matched at least one of the paraphrasing rules, excluding sentences that had been used before. Then, for each one of the 300 S sentences, we kept the single best

candidate paraphrase C1 and C2, respectively, returned by SVR-REC and ZHAO-ENG.

The resulting ⟨S, C1⟩ and ⟨S, C2⟩ pairs were given to 10 human judges for evaluation. This time the judges evaluated the pairs only for grammaticality and meaning preservation, not overall quality; diversity was again computed as edit distance. Each pair was evaluated by one judge. Each judge was given an equal number of pairs from the two methods, and the same judge never rated two pairs involving the same S. Although we could have used source sentences S from the test part of the dataset of section 4.3,

for which the resulting pairs ⟨S, C1⟩ of SVR-REC and their grammaticality and meaning

preservation scores are known, we would still have to show the ⟨S, C2⟩ pairs of ZHAO-

ENG to human judges. We preferred to show all the pairs to human judges, which also

avoids any implicit bias against ZHAO-ENG that could have been caused by our previous

use of the test set of section 4.3. Since we had no way to make ZHAO-ENG sensitive

to λ1,λ2,λ3, we set λ1 = λ2 = 1/3, as the most neutral combination, when training

SVR-REC. Figures 4.3–4.5 and table 4.4 show the grammaticality, meaning preservation, and diversity scores of the two methods. All scores were normalized in [0,1], but the reader should keep in mind that diversity was computed as edit distance, whereas the other two scores were provided by human judges on a 1–4 scale. In all cases we used Analysis of

Variance (ANOVA) (Fisher, 1925), followed by post-hoc Tukey tests to check whether the differences in the scores of the two methods were statistically significant (p < 0.1) or not. The grammaticality of SVR-REC was slightly higher than ZHAO-ENG’s, and


Figure 4.3: Grammaticality of SVR-REC and ZHAO-ENG.


Figure 4.4: Meaning preserv. of SVR-REC and ZHAO-ENG.


Figure 4.5: Diversity of SVR-REC and ZHAO-ENG.

the difference was statistically significant. In meaning preservation, ZHAO-ENG was slightly better, but the difference was not statistically significant. The difference in diversity was larger and statistically significant.
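For readers who wish to reproduce this kind of significance testing, a minimal sketch using SciPy and statsmodels is shown below; the judge scores are invented and the libraries are stand-ins for whatever statistical package was actually used.

```python
# One-way ANOVA followed by a post-hoc Tukey HSD test (alpha = 0.1),
# here on made-up grammaticality scores for the two methods.
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

svr_rec  = [4, 4, 3, 4, 4, 3, 4, 4]   # hypothetical judge scores
zhao_eng = [3, 4, 3, 3, 4, 3, 3, 4]

print(f_oneway(svr_rec, zhao_eng))     # F statistic and p-value

scores = svr_rec + zhao_eng
groups = ["SVR-REC"] * len(svr_rec) + ["ZHAO-ENG"] * len(zhao_eng)
print(pairwise_tukeyhsd(scores, groups, alpha=0.1))
```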

To sum up, SVR-REC performed slightly better in grammaticality, the two methods performed equally well (from a statistical point of view) in meaning preservation, and ZHAO-ENG performed better in diversity. Overall, SVR-REC seems to perform

score (%)            SVR-REC   ZHAO-ENG
grammaticality        90.89      85.33
meaning preserv.      76.67      78.56
diversity              6.50      14.58
average               58.02      59.49

Table 4.4: Comparison of SVR-REC and ZHAO-ENG.

well against ZHAO-ENG, despite the vastly larger resources of ZHAO-ENG, provided of course that we limit ourselves to source sentences to which paraphrasing rules apply.

It would be interesting to investigate in future work if SVR-REC’s coverage (sentences it can paraphrase) can increase to ZHAO-ENG’s level by adding more paraphrasing rules, possibly by combining different existing collections of paraphrasing rules. It would also be interesting to combine the two methods, perhaps by using SVR-REC’s ranker to

filter ZHAO-ENG’s paraphrases.

4.7 Learning rate

To investigate whether or not the performance of SVR-REC could be improved by using more training data, we conducted a further experiment, where SVR-REC was trained on increasingly larger parts of the training set of section 4.3, and it was always evaluated on the entire test set of that section. For simplicity, we used only the judges’ overall quality scores in these experiments. Furthermore, to compare the SVR’s learning rate against a simpler learning model, we treated the problem as binary classification; overall quality scores of 1 and 2 were conflated to a negative category, and scores of 3 and 4 to a positive category. This allowed us to compare against a method (called ME-REC), which is exactly the same as SVR-REC, but uses a Maximum Entropy classifier (section 3.4.1) instead of an SVR. We call SVR-REC-BIN the SVR-REC that performs binary classification.

Figure 4.6 plots the error rate of SVR-REC-BIN and ME-REC, computed both on the test set and the encountered training subset. The error rate on the training instances a learner has encountered is typically lower than the error rate on the test set, and the former can be seen as a lower bound of the latter, as already noted. Figure 4.6 shows that both methods continue to improve as the training set increases, but ME-REC shows signs of having reached its lower bound, whereas SVR-REC-BIN could be improved by CHAPTER 4. PARAPHRASE GENERATION 107


using a larger training set, which is a sign that SVR-REC might also improve given more training data.

Figure 4.6: Learning curves of SVR-REC-BIN and ME-REC (error rate against the number of training instances used).

The baseline (BASE) of Figure 4.6 uses only a threshold on the average r4 (section

4.2) of the rules that turned S into C. If the average r4 is higher than the threshold, the ⟨S, C⟩ pair is classified in the positive class, otherwise in the negative one. The threshold was tuned by manually experimenting with a separate tuning dataset. Both

SVR-REC-BIN and ME-REC clearly outperform the baseline.
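The BASE baseline itself amounts to a single threshold comparison; a hypothetical sketch (the pair identifiers, scores, and threshold value are made up) is:

```python
def base_classify(pairs, threshold):
    """The BASE baseline: classify an <S, C> pair as positive iff the average
    r4 score of the rules that produced C from S exceeds a threshold (tuned
    on a separate dataset). 'pairs' maps pair ids to their average r4 score."""
    return {pid: ("positive" if avg_r4 > threshold else "negative")
            for pid, avg_r4 in pairs.items()}

print(base_classify({"pair-1": 0.62, "pair-2": 0.31}, threshold=0.5))
# {'pair-1': 'positive', 'pair-2': 'negative'}
```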

4.8 Conclusions and contribution of this chapter

In this chapter, we presented a novel approach to paraphrasing sentences that produces candidate paraphrases by applying paraphrasing rules and then ranks (or classifies) the candidates. We also proposed a new methodology to evaluate the ranking components of generate-and-rank paraphrase generators, which evaluates them across different combinations of weights for grammaticality, meaning preservation, and diversity. Moreover, we created a paraphrasing dataset for evaluations of this kind. By using the dataset and the proposed methodology, we showed that the performance of our method’s SVR ranker improves when, apart from features derived from a language model and the scores of the rules, features from a paraphrase recognizer are also included. We also showed that our overall method compares well against a state of the art paraphrase generator, when paraphrasing rules apply to the source sentences. Finally, we presented evidence suggesting that our method might benefit from additional training data.

Chapter 5

Applications of Paraphrase Generation

5.1 Introduction

Paraphrase generation can, at least in principle, be useful in several natural language processing tasks. In question answering, for example, paraphrase generators can be used to paraphrase the user’s queries (Duboue and Chu-Carroll, 2006; Riezler and Liu, 2010); and in natural language generation (Reiter and Dale, 2000), they can be used to avoid repeating the same expressions, when sentences are generated by using templates. In this chapter, we investigate how a paraphrase generator could be used in question answering, natural language generation, and also Web advertising. In all three applications, we use a multi-pivot paraphrase generator (multiple trans- lation engines and pivot languages, as in ZHAO-ENG, section 4.6) to generate candidate paraphrases (C) of an input language expression (S), along with our own SVR ranker

(SVR-REC, section 4.5) to rank the candidates.1 We then retain the best candidate according to the scores returned by the ranker. Recall that the quality of a paraphrase is measured in terms of grammaticality, meaning preservation, and diversity (section 4.3).

1We use a multi-pivot approach to generate candidate paraphrases in this chapter, because it always produces candidate paraphrases, unlike paraphrasing rules, as discussed in section 4.6.


The correct score y(~xi) of each training or test instance ~xi, i.e., the score that the SVR should ideally return for each feature vector of an ⟨S, C⟩ pair, is taken to be a linear combination of grammaticality (g(~xi)), meaning preservation (m(~xi)), and diversity (d(~xi)), as in equation (4.16), repeated below as equation (5.1), where λ3 = 1 − λ1 − λ2. Recall, also, that g(~xi) and m(~xi) are provided by human judges (in this chapter the author of the thesis), whereas d(~xi) is measured automatically (Levenshtein distance on tokens).

y(~xi) = λ1 · g(~xi) + λ2 · m(~xi) + λ3 · d(~xi) (5.1)

In the experiments of this chapter, we used two different λ settings:

1. EQ: λ1 = λ2 = λ3 = 1/3

2. LOW-DIV: λ1 = 0.45, λ2 = 0.45, λ3 = 0.1

EQ equally weighs grammaticality, meaning preservation, and diversity. On the other hand, LOW-DIV equally favors grammaticality and meaning preservation against diversity. Thus, we expect the paraphrases obtained by LOW-DIV to have higher grammaticality and meaning preservation, but lower diversity than the ones returned by EQ. As input language expressions we use (i) factoid questions, (ii) patterns obtained from an ontology that texts are generated from, and (iii) advertisements retrieved from Google AdWords for car rental queries.2

2See http://adwords.google.com.


Figure 5.1: A typical QA system architecture.

5.2 Paraphrasing for question answering

A question answering (QA) system attempts to automatically answer a question provided in natural language by looking for the answer in a collection of documents (or the Web). If the answer is a simple fact (e.g., asking for a date, person, location, etc.), the question is called a factoid question.

5.2.1 Architecture of QA systems

A typical QA system (Voorhees, 2001; Pasca, 2003; Harabagiu and Moldovan, 2003; Mollá and Vicedo, 2007) has three components: a) a question processing component, b) a passage retrieval component, and c) an answer processing component (figure 5.1).

Question processing: Question processing consists of two stages: query formulation and question classification. In query formulation, the main goal is to create a list of representative terms from the question. If we use the Web as the document collection, we can use an existing Web search engine, which typically also includes mechanisms to select the most representative terms from the question (e.g., by removing stop-words, using inverse document frequencies of terms, etc.). Similar mechanisms to select terms from questions are also provided by several information retrieval engines.3 Note that in the Web the answer may appear in many different forms; hence, there is a high chance of finding an answer phrased in a way very similar to the question (e.g., using the same words instead of synonyms). By contrast, when we search for the answer in a small set of documents, the answer may appear in the document collection only once and in a completely different form than the question. In both cases, but especially in the latter, a paraphrase generator that produces alternative forms of the original question may be useful. We treat the paraphrase generator as an additional stage of question processing, which produces paraphrases of the user’s questions. The original question and its paraphrases are then all processed by query formulation (and question classification, discussed below), either individually (multiple queries executed) or jointly (a single query, with the best terms of the original question and its paraphrases, is executed). In question classification, the question is classified according to its expected answer type. Knowing, for instance, that the answer type is a person, we can focus our search only on persons instead of looking at all the noun phrases appearing in the documents. Question classifiers may use hand-written rules, supervised machine learning, or both.
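As a rough illustration of the "joint" variant described above (one query built from the original question and its paraphrases), the following sketch pools the content terms of all the inputs; the stop-word list and the tokenization are simplifications introduced only for this example.

```python
# Illustrative stop-word list; a real system would use a fuller list and
# possibly stemming and term weighting.
STOP_WORDS = {"what", "which", "the", "a", "an", "of", "for", "is", "was",
              "does", "do", "in", "to", "by"}

def formulate_query(question, paraphrases):
    """Pool the content terms of the question and its paraphrases into a
    single bag-of-words query."""
    terms = set()
    for text in [question] + paraphrases:
        for token in text.lower().rstrip("?.").split():
            if token not in STOP_WORDS:
                terms.add(token)
    return sorted(terms)

print(formulate_query("What company manufactured the B-17 bomber?",
                      ["Which enterprise the B-17-Bomber manufactured?"]))
```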

Passage retrieval: The terms of query formulation are then used by a Web search engine or an information retrieval system to obtain a set of relevant documents, i.e., documents that may contain the answer. A passage (or sentence) retrieval step is typically also performed, during which passages (or sentences) that are likely to contain the answer are extracted from the relevant documents. The answer type of the previous stage can be used to filter out passages (or sentences) that do not contain the appropriate

3See, for example, http://lucene.apache.org.

candidate answers; for example, if the question requires a date as the answer, we can discard passages (or sentences) that do not contain dates. We may also prefer passages (or sentences) that contain many of the terms of the question (or its paraphrases). Typically, the candidate passages (or sentences) are ranked either by hand-crafted rules or by using supervised machine learning.

Answer processing: The final component of a QA system extracts a specific answer from the best passage(s) or sentence(s). A common technique to achieve this is to use the answer type of the question processing stage along with regular expression patterns per question type that extract answers. Again, the patterns can be hand-written or learned automatically (Ravichandran and Hovy, 2002). In practice, since users are often not satisfied with just an answer to the question, an entire passage with the answer highlighted may be returned.

5.2.2 Paraphrasing questions

Since we did not have access to a full QA system, we performed only an in vitro evaluation of paraphrase generation for QA. The goal of this evaluation was to assess the quality of the paraphrases we produce from questions in a benchmark QA dataset. As already noted, we used a multi-pivot approach to generate candidate paraphrases and kept the best candidate according to the score returned by our SVR ranker. We used 50 randomly selected factoid questions of the 2007 TREC QA track as input.4 Figure 5.2 plots the results for the two different settings (EQ, LOW-DIV). As expected, grammaticality and meaning preservation were higher for the LOW-DIV setting, while diversity was lower. The overall score of LOW-DIV was higher. Note, however, that the main goal when we paraphrase a question in QA is to obtain more terms during query formulation, i.e., obtaining higher diversity is more important, and we may be willing to

4See http://trec.nist.gov/data/qa/t2007_qadata.html.

Figure 5.2: Paraphrase generation results for QA.

accept lower grammaticality and meaning preservation to improve diversity.5 Hence, EQ is better than LOW-DIV for QA, and figure 5.2 shows that we can improve diversity by assigning lower weights to grammaticality and meaning preservation. ANOVA tests with post-hoc Tukey tests show that the differences in diversity between EQ and LOW-DIV of figure 5.2 are statistically significant (p < 0.1), whereas the differences in grammaticality and meaning preservation are not.

To show some examples, (5.4) and (5.7) (obtained by LOW-DIV) are perfect paraphrases of (5.2) and (5.5) respectively. However, they do not provide any new stemmed terms to be used when searching for the answer. On the contrary, (5.3) and (5.6) provide additional terms (shown in bold), despite being less grammatical than (5.4) and (5.7).

(5.2) Source: What company manufactured the B-17 bomber?

(5.3) Generated by EQ: Which enterprise the B-17-Bomber manufactured?

(5.4) Generated by LOW-DIV: What company manufactures the B-17 bomber?

(5.5) Source: Where does Australia rank in exports of wine?

(5.6) Generated by EQ: Where is classified in Australia in wine exports?

(5.7) Generated by LOW-DIV: Where does Australia rank in the export of wine?

5.3 Paraphrasing for natural language generation

A natural language generation (NLG) system (Reiter and Dale, 2000) generates natural language from symbolic knowledge representations, often from a formal ontology.

5.3.1 Architecture of NLG systems

A typical NLG system produces texts using a pipeline architecture in three sequential stages: a) document planning, b) micro-planning, and c) surface realization.

5It would also be interesting to explore assigning even lower weight to grammaticality than meaning preservation in QA, but we leave this for future work.

Document planning: During document planning, an NLG system selects the logical facts which will be conveyed to the user. When generating texts that describe particular entities (e.g., a museum exhibit or product on sale), the selected facts can be either directly or indirectly relevant to the described entity and they are ordered by consulting a partial order of types of facts. User models may also be consulted, for example to remove facts that the user presumably already knows, or to prefer facts that the user is likely to find interesting.

Micro-planning: During this stage, each fact to be expressed is converted to a sentence or abstract sentence specification. The sentences are then aggregated into longer ones, and appropriate referring expressions (e.g., pronouns) are generated. In many systems the conversion from facts to sentences (or sentence specifications) is based on templates that are associated with types of facts. For example, to express a fact of the form possibleUseOf(X,Y), we may use a template like “X was possibly used as Y”. When generating texts describing entities of an ontology, a template of this kind has to be provided for each property of the ontology (Androutsopoulos et al., 2007). Providing multiple templates for the same property allows the system to avoid repeating the same expressions. However, authoring multiple templates per property is time-consuming and, hence, in practice often only one template per property is provided. We wanted to investigate if a paraphrase generator could be used to automatically produce additional templates given a single manually crafted template per property.
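A minimal sketch of such template-based realization is shown below; the property names and templates are illustrative examples, not the actual INDIGO resources.

```python
# Hypothetical illustration: one hand-written template per ontology property,
# instantiated with the arguments of a fact.
TEMPLATES = {
    "possibleUseOf": "{X} was possibly used as {Y}.",
    "constructionOrderedBy": "{X}'s construction was ordered by {Y}.",
}

def express_fact(prop, x, y):
    """Turn a fact prop(x, y) into a sentence using the property's template."""
    return TEMPLATES[prop].format(X=x, Y=y)

print(express_fact("possibleUseOf", "The Triangular Shrine",
                   "a place to venerate the dead of the nearby cemetery"))
```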

Surface realization: In surface realization, the system produces the final form of the text, for example by handling gender or number agreement or other details that are sometimes left unspecified until this stage. In addition, syntactic or semantic markup can be added, for example markup for speech synthesizers. We do not consider surface realization here, since we are only concerned with sentence templates.


Figure 5.3: Paraphrase generation results for NLG.

5.3.2 Paraphrasing sentence templates

We used the 43 sentence templates of the Agora of Athens ontology used in project INDIGO as input.6 Figure 5.3 plots the results for the two different settings (EQ, LOW-DIV). Grammaticality and diversity were both higher than in the QA experiments (cf. figure 5.2). This may be due to the fact that the inputs are now statements, instead of questions, and hence they are closer to the texts the translation engines of the multi-pivot approach are trained on; hence, more and better quality translations are generated, giving rise to more diverse and more grammatical candidate paraphrases. We plan to explore this further in future work. Meaning preservation is approximately at the same level as in QA, which we attribute to the fact that the features of the SVR do not capture

6See http://www.ics.forth.gr/indigo/index.html.

meaning preservation as well as we would hope, as noted in section 4.5.

As in the QA experiments, lowering the weight of diversity (LOW-DIV) increases

the score of meaning preservation (compared to EQ), which is what one would expect, given that the paraphrases become lexically more similar to the input. Surprisingly, however, grammaticality is slightly higher in EQ, and overall the differences between

EQ and LOW-DIV are smaller than in the QA experiments, i.e., paraphrase generation

in the NLG experiments seems to be less sensitive to the weights of grammaticality,

meaning preservation, and diversity, compared to the QA experiments. ANOVA tests

show that the differences between EQ and LOW-DIV of figure 5.3 are not statistically significant (p < 0.1), which is a further indication that paraphrase generation in the

NLG experiments was less sensitive to the λi weights than we would have expected, and may also explain the unexpected grammaticality results. We hope to investigate this further in future work. Furthermore, the results indicate that fully automatic generation of sentence template paraphrases is not feasible (the overall score is approximately 70% on average), but the generated paraphrases could perhaps be shown to the template author for post-editing and final selection, which may be less tedious than manually authoring multiple sentence plans. Below are a few examples of generated template paraphrases, along with example instantiations of the templates:

(5.8) Source: X’s construction was ordered by Y.

(5.9) e.g., The Middle Stoa’s construction was ordered by Pharnaces I.

(5.10) Generated by EQ: Construction of the X was commanded by Y.

(5.11) e.g., Construction of the Middle Stoa was commanded by Pharnaces I.

(5.12) Source: X is also known as Y.

(5.13) e.g., The Temple of Hephaestus is also known as the Theseion.

(5.14) Generated by EQ: X is also called the Y.

(5.15) e.g., The Temple of Hephaestus is also called the Theseion. CHAPTER 5. APPLICATIONS OF PARAPHRASE GENERATION 119

(5.16) Source: X was built during Y.

(5.17) e.g., The Basilica was built during the Roman period.

(5.18) Generated by LOW-DIV: X was constructed in Y.

(5.19) e.g., The Basilica was constructed in the Roman period.

(5.20) Source: X was possibly used as Y.

(5.21) e.g., The Triangular Shrine was possibly used as a place to venerate the dead of the nearby cemetery.

(5.22) Generated by LOW-DIV: X was maybe used as Y.

(5.23) e.g., The Triangular Shrine was maybe used as a place to venerate the dead of the nearby cemetery.

5.4 Paraphrasing advertisements

Google AdWords is an on-line advertising mechanism provided by Google. When a user submits a query to Google’s search engine, a small list of advertisements relevant to the search appears on the right of the results page. Each advertisement is accompanied by a brief description (a short text). Figure 5.4 shows an example of Google advertisements, when searching for car rentals. Apparently, the advertisements to show are automatically chosen by considering, possibly among other factors, how similar the search query is to the brief descriptions (the short texts) of the advertisements. The companies that advertise their products are encouraged to provide multiple alternative descriptions (short texts) of their products, to make it easier to match user queries against them. The descriptions, however, are hand-written, and writing many alternative descriptions is tedious. A paraphrase generation system that would automatically generate paraphrases of initial manually written descriptions might, thus, be particularly useful. We used 41 advertisement descriptions (short texts) obtained by using three queries

Figure 5.4: Google advertisements when searching for car rentals.

Figure 5.5: Paraphrase generation results for Google ads.

(“car rental”, “car rental Portugal”, and “car hire”) as inputs.7 Figure 5.5 plots the results for the two different settings (EQ, LOW-DIV). As expected, the grammaticality and meaning preservation were higher for the LOW-DIV setting, while the diversity was lower. ANOVA tests show that the differences between EQ and LOW-DIV of figure 5.5, however, are not statistically significant (p < 0.1), as in the NLG experiments.

Overall, meaning preservation was approximately at the same level as in the QA and NLG experiments. Grammaticality and diversity were lower than in NLG and closer to the levels of the QA experiments, which we attribute to the fact that the inputs are again less similar to the statement sentences that the translation engines are trained on; hence, there are fewer and less grammatical candidate paraphrases. As in the NLG experiments, fully automatic generation of advertisement descriptions does not seem feasible, but again it may be possible to show the generated paraphrases to humans (advertisers) for post-editing and final selection. Below we show some examples of advertisement descriptions generated by both settings.

(5.24) Source: Cheapest Car Hire in Portugal. Book online in 3 simple steps!

(5.25) Generated by EQ: Economy car rental in Portugal. Book online in 3 easy steps!

(5.26) Source: Search the best rates from the top car hire suppliers. Why pay more?

(5.27) Generated by EQ: Advanced car rental search from the best supplier rates. Why pay more?

(5.28) Source: Car Hire in Portugal with Europcar! Save up to 25% when you Book Online.

(5.29) Generated by LOW-DIV: Portugal car rental with Europcar! Save up to 25% when you Book Online.

(5.30) Source: Cheap Car Rental in Greece. Compare rates and book online!

(5.31) Generated by LOW-DIV: Cheap Car Rental in Greece. Compare prices and book online!

7We thank Matina Thomaidou for providing the descriptions.

5.5 Conclusions and contribution of this chapter

In this chapter, we presented three in vitro studies of possible applications of paraphrase generation. We used a multi-pivot approach to generate candidate paraphrases

and our SVR ranker, described in chapter 4, to select the best of them. We also used two different settings, one weighing equally the grammaticality, meaning preservation,

and diversity of the candidates (EQ), and another equally favoring grammaticality and meaning preservation against diversity (LOW-DIV). More specifically, we first discussed how paraphrase generation can be used to paraphrase the questions in a QA system. We also showed experimentally that it is possible to opt for higher diversity of the question paraphrases (leading to more terms) at the expense of grammaticality and meaning preservation. In the second study, we used

our generator to paraphrase the sentence templates of an NLG system. In this case,

grammaticality and diversity were overall higher than in QA, possibly because the input templates were closer to the texts the multiple translation engines were trained on, leading to more, and more grammatical, candidate paraphrases. Meaning preservation was

at approximately the same level as in QA, and overall modifying the weights of grammaticality, meaning preservation, and diversity had less noticeable effects, compared to

QA. The experimental results also indicated that a human is still necessary to post-edit and filter the new sentence plans. Similar observations were made in the third study, where we paraphrased the short texts of Google’s advertisements. Again, a human is needed to post-edit and filter the resulting paraphrases. Grammaticality and diversity

were overall closer to the levels of QA, lower than in the NLG experiment, presumably because the advertisements, like questions, are less similar to the texts the translation engines are trained on.

We intend to integrate paraphrase generation in existing systems (e.g., IR, QA, or NLG systems) in the future, to perform in vivo evaluations. We also plan to use paraphrase generation in sentence compression. Finally, we note that the QA study of this chapter is among the very few attempts we are aware of to paraphrase the questions of a QA system using a general purpose paraphrase generator. Similar comments apply to the NLG study and, especially, the application of the paraphrase generator to advertisements, which we believe has not been attempted before.

Chapter 6

Conclusions

6.1 Summary and contribution of this thesis

The research area of this thesis was paraphrasing and textual entailment. After conducting an extensive survey of the already vast literature of the area, we focused on paraphrase and textual entailment recognition, as well as on paraphrase generation.

6.1.1 Extensive survey and classification of previous methods

Although the field of paraphrasing and textual entailment has only recently become a popular research topic, there is already a vast literature on the subject. There have been six workshops on paraphrasing and/or textual entailment (Sato and Nakagawa, 2001; Inui and Hermjakob, 2003; Dolan and Dagan, 2005; Drass and Yamamoto, 2005; Sekine et al., 2007; Callison-Burch et al., 2009). The Recognizing Textual Entailment

(RTE) challenges (Dagan et al., 2006; Bar-Haim et al., 2006; Giampiccolo et al., 2007; Giampiccolo et al., 2008), currently in their sixth year, provide additional significant thrust. A special issue on textual entailment was also recently published (Dagan et al., 2009), and there are hundreds of directly relevant papers in conference proceedings. Apart from distilling the key ideas of the existing literature, the survey of this thesis has

contributed a classification of relevant work along different dimensions. For example, previous methods were classified in this thesis not only by considering whether they pertain to paraphrasing or textual entailment, but also by considering whether they perform recognition, generation, or extraction, a distinction which was not clear in previously published work.

The main input to a paraphrase or textual entailment recognizer is a pair of language expressions (or templates), possibly in particular contexts. The output is a judgement, possibly probabilistic, indicating whether or not the members of the input pair are paraphrases or a correct textual entailment pair; the judgements must agree as much as possible with those of humans. On the other hand, the main input to a paraphrase or textual entailment generator is a single language expression (or template) at a time, possibly in a particular context. The output is a set of paraphrases of the input, or a set of language expressions that entail or are entailed by the input; the output set must be as large as possible, but including as few errors as possible. In contrast, no particular language expressions or templates are provided to a paraphrase or textual entailment extractor. The main input in this case is a corpus, for example a monolingual corpus of parallel or comparable texts, such as different English translations of the same French novel, or clusters of multiple monolingual news articles, with the articles in each cluster reporting the same event. The system outputs pairs of paraphrases (possibly templates), or pairs of language expressions (or templates) that constitute correct textual entailment pairs, based on the evidence of the corpus; the goal is again to produce as many output pairs as possible, with as few errors as possible.

6.1.2 Paraphrase and textual entailment recognition

We presented three new paraphrase and textual entailment recognition methods (INIT,

INIT+WN, INIT+WN+DEP), which we have experimentally evaluated on existing benchmark datasets (the MSR paraphrasing corpus and the RTE datasets).

The main idea of INIT is that by capturing similarities at various abstractions of the inputs (e.g., at the surface, syntactic, and semantic level) and by using state of the art supervised machine learning algorithms, we can recognize paraphrases and textual entailment reasonably well. Additionally, INIT uses novel partial similarity features that improve its performance. INIT+WN also exploits WordNet (Fellbaum, 1998) to capture synonymy relations, while INIT+WN+DEP adds features that operate on the syntactic level of the language expressions. Overall, INIT+WN+DEP achieved the best accuracy results reported so far on the widely used MSR paraphrasing corpus. Despite its simplicity, INIT was also very competitive; hence, it can be used in languages for which reliable large-scale thesauri and parsers are difficult to obtain, Greek being an example.
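As a rough illustration of this kind of recognizer (not the thesis implementation, which uses many more features; see Table A.1), the sketch below computes a handful of similarity features at two abstractions of the input pair, word tokens and character sequences, and trains an SVM on labelled pairs. scikit-learn is assumed to be available, and all names are illustrative.

```python
from difflib import SequenceMatcher

from sklearn.svm import SVC


def similarity_features(s1: str, s2: str) -> list:
    """A few similarities computed at two abstraction levels: tokens and characters."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    union = set(t1) | set(t2)
    jaccard = len(set(t1) & set(t2)) / len(union) if union else 1.0
    ratio_tokens = SequenceMatcher(None, t1, t2).ratio()
    ratio_chars = SequenceMatcher(None, s1.lower(), s2.lower()).ratio()
    len_ratio = min(len(t1), len(t2)) / max(len(t1), len(t2)) if t1 and t2 else 0.0
    return [jaccard, ratio_tokens, ratio_chars, len_ratio]


def train_recognizer(pairs, labels):
    """pairs: list of (sentence1, sentence2); labels: 1 = paraphrase / entailment."""
    X = [similarity_features(a, b) for a, b in pairs]
    clf = SVC(kernel="rbf", probability=True)
    clf.fit(X, labels)
    return clf


# Hypothetical usage:
# clf = train_recognizer(train_pairs, train_labels)
# score = clf.predict_proba([similarity_features(s1, s2)])[0][1]
```

In the same spirit, an INIT+WN-like variant would additionally count WordNet synonyms of the other sentence's tokens as matches, and an INIT+WN+DEP-like variant would add features defined over dependency trees of the two expressions.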

On the textual entailment datasets (from RTE), our methods achieved worse results, both compared to their results on the MSR corpus and compared to other previous textual entailment methods. Nevertheless, our methods perform reasonably well, despite being simpler than several other RTE methods; therefore, they can be considered competitive baselines for future work.

6.1.3 Paraphrase generation

We have also presented a novel approach to paraphrasing sentences that produces candidate paraphrases by applying existing paraphrasing rules (extracted from parallel corpora using a pivot-language approach) and then ranks (or classifies) the candidates using an SVR. Furthermore, we proposed a new methodology to evaluate the ranking components of generate-and-rank paraphrase generators, which evaluates them across different combinations of weights for grammaticality, meaning preservation, and diversity. A new dataset for evaluations of this kind was also constructed during the work of this thesis, and is publicly available.

By using the dataset and the proposed evaluation methodology, we have shown that the performance of our method's SVR ranker improves when, apart from features derived from a language model and the existing scores of the paraphrasing rules, features from our paraphrase recognizer are also included. We have also shown that our overall method compares well against a state of the art paraphrase generator, when paraphrasing rules apply to the source sentences, and we presented evidence suggesting that our method might benefit further from additional training data.
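The generate-and-rank scheme can be sketched as follows. This is illustrative only (the feature set and SVR configuration of chapter 4 are richer, and all names here are hypothetical): each candidate produced by a paraphrasing rule is described by a feature vector, for example a language-model score, the score of the rule that produced it, and recognizer-style similarity features between the source sentence and the candidate, and an SVR predicts a quality score used to sort the candidates.

```python
from sklearn.svm import SVR


class ParaphraseRanker:
    """Scores candidate paraphrases with an SVR and sorts them."""

    def __init__(self):
        self.model = SVR(kernel="rbf")

    def train(self, feature_vectors, quality_scores):
        # quality_scores could be human judgements combining grammaticality,
        # meaning preservation, and diversity with application-specific weights.
        self.model.fit(feature_vectors, quality_scores)

    def rank(self, candidates):
        # candidates: list of (candidate_sentence, feature_vector) pairs.
        scored = [(float(self.model.predict([feats])[0]), sent)
                  for sent, feats in candidates]
        return [sent for _, sent in sorted(scored, reverse=True)]
```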

6.1.4 Applications of paraphrase generation

Finally, we explored how paraphrase generation can be used by other systems. More specifically, we used our paraphrase generator to paraphrase: (i) questions to a question answering system, (ii) sentence templates used by a natural language generation system, and (iii) Web advertisements. The application experiments were small-scale and in vitro, as opposed to in vivo experiments that would evaluate the actual effect of paraphrase generation on question answering, natural language generation, or Web advertising, for example by measuring user satisfaction or (in advertising) generated income. Hence, these experiments should be seen as a small preliminary step towards future work. Nevertheless, the existing experiments showed promising results. Furthermore, they constitute one of the very few attempts we are aware of to use a general-purpose paraphrase generator in question answering and in natural language generation; and we believe that the application of the generator to Web advertisements has not been attempted before.
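As a purely illustrative sketch of how a paraphrase generator could be plugged in front of a question answering system (the objects `qa_system` and `paraphraser` below are hypothetical, not the systems of chapter 5), the question and a few of its paraphrases can be submitted separately and the returned answers merged:

```python
def answer_with_paraphrases(question, qa_system, paraphraser, top_k=3):
    """Send the original question and a few of its paraphrases to a QA system
    and merge the returned answers by the system's own confidence scores."""
    queries = [question] + list(paraphraser.generate(question))[:top_k]
    answers = []                       # list of (answer_string, confidence)
    for q in queries:
        answers.extend(qa_system.answer(q))
    return sorted(answers, key=lambda a: a[1], reverse=True)
```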

6.2 Future work

We have already pointed out possible directions for future work throughout this thesis. In section 4.6, for example, we noted that an obvious advantage of generating candidate paraphrases by using multiple translation engines and multiple pivot languages (as in ZHAO-ENG) compared to generating candidates by applying paraphrasing rules (as in our generator of chapter 4) is that the former approach always produces large numbers of paraphrases, as opposed to the latter, which only manages to produce paraphrases when at least one paraphrasing rule applies to the source language expression. It would be interesting to investigate in future work if the coverage of our paraphrase generator (when it uses paraphrasing rules, as in chapter 4), i.e., the number of sentences it can paraphrase, can be increased by adding more paraphrasing rules, possibly by combining different existing collections of paraphrasing rules. It would also be interesting to study further how the multi-pivot approach (multiple translation engines and pivot languages) to candidate generation can be combined with our SVR ranker, which selects the best candidates. We used a combination of this kind in the in vitro studies of paraphrase generation of chapter 5, but it would be worth evaluating this combination against the original methods (ZHAO-ENG and our generator of chapter 4) in larger scale experiments. Furthermore, we intend to integrate paraphrase generation in existing systems (e.g., IR, QA, or NLG systems) in the future, to perform in vivo evaluations, and investigate further the sensitivity of paraphrase generation to the weights of grammaticality, meaning preservation, and diversity for different applications. We also plan to use paraphrase generation in sentence compression.

Appendix A

Recognition results on all datasets

In this appendix, we list the results of our recognition methods on all the datasets we have experimented with. Accuracy, precision, and recall are accompanied by 95% confidence intervals. We have already discussed the results based on the error rate learning curves. Another observation is that in all datasets INIT has better recall than precision, which implies that it tends to over-classify pairs as paraphrases or textual entailment pairs, possibly because the sentences of each pair have at least some common words and refer to the same event. This tendency is more intense in INIT+WN. This is somewhat expected, since the similarity is increased for both positive and negative pairs that contain synonyms. Concerning the feature selection results, tables A.4, A.5, A.6 and A.7 show that starting from a set containing all the available features (BHC, BBS) is, in general, a better strategy compared to starting from an empty feature set (FHC, FBS respectively). Also, in most cases beam search (FBS, BBS) leads to better results compared to hill climbing with the same direction (FHC, BHC), which is not surprising, since beam search is less likely to get trapped in local maxima compared to hill climbing. A notable observation is that in most cases feature selection leads to similar or better results with fewer features, which is an indication of redundancy in the complete feature set.
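The ± values in the tables below are 95% confidence intervals; for a score p (as a proportion) measured on n test pairs they appear consistent with the usual normal-approximation interval p ± 1.96·sqrt(p(1−p)/n).

The four selection strategies differ only in the search direction (forward from an empty feature set, backward from the full set) and in the beam width (hill climbing keeps one candidate set per step, beam search keeps several). A rough wrapper-style sketch, assuming a scikit-learn classifier, cross-validated accuracy as the selection criterion, and a 2-D NumPy feature matrix, is given below; all names are illustrative and this is not the thesis implementation.

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def cv_accuracy(feature_subset, X, y):
    """Cross-validated accuracy of an SVM using only the selected columns of X."""
    if not feature_subset:
        return 0.0
    cols = sorted(feature_subset)
    return cross_val_score(SVC(), X[:, cols], y, cv=5).mean()


def select_features(X, y, forward=True, beam_width=1):
    """Greedy wrapper-style feature selection.

    forward=True,  beam_width=1 -> forward hill climbing  (FHC-like)
    forward=True,  beam_width>1 -> forward beam search    (FBS-like)
    forward=False, beam_width=1 -> backward hill climbing (BHC-like)
    forward=False, beam_width>1 -> backward beam search   (BBS-like)
    """
    n_features = X.shape[1]
    start = frozenset() if forward else frozenset(range(n_features))
    beam = [(cv_accuracy(start, X, y), start)]
    best_score, best_subset = beam[0]
    improved = True
    while improved:
        improved = False
        candidates = []
        for _, subset in beam:
            for f in range(n_features):
                new = subset | {f} if forward else subset - {f}
                if new == subset or not new:
                    continue  # feature already added/removed, or empty set
                candidates.append((cv_accuracy(new, X, y), new))
        if not candidates:
            break
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        beam = candidates[:beam_width]
        if beam[0][0] > best_score:          # keep going while the best set improves
            best_score, best_subset = beam[0]
            improved = True
    return sorted(best_subset), best_score
```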


dataset | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
MSR | – | 74.84 (± 2.05) | 78.36 (± 2.28) | 85.88 (± 2.02) | 81.95
RTE-1 | – | 51.88 (± 3.46) | 51.85 (± 4.87) | 52.50 (± 4.89) | 52.17
RTE-2 | – | 58.25 (± 3.42) | 57.14 (± 4.51) | 66.00 (± 4.64) | 61.25
RTE-3 | 60.63 (± 3.39) | 64.00 (± 3.33) | 62.87 (± 4.35) | 72.68 (± 4.31) | 67.42
RTE-4 | 54.10 (± 3.09) | 58.10 (± 3.06) | 56.97 (± 4.03) | 66.20 (± 4.15) | 61.24
RTE-5 | 59.50 (± 3.93) | 63.83 (± 3.84) | 63.52 (± 5.38) | 65.00 (± 5.40) | 64.25

Table A.1: INIT results per corpus. 133 features are used.

dataset | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
MSR | – | 75.77 (± 2.02) | 79.32 (± 2.25) | 85.96 (± 2.01) | 82.51
RTE-1 | – | 52.63 (± 3.46) | 52.48 (± 4.76) | 55.50 (± 4.87) | 53.95
RTE-2 | – | 58.88 (± 3.41) | 57.54 (± 4.46) | 67.75 (± 4.58) | 62.23
RTE-3 | 60.75 (± 3.38) | 64.63 (± 3.31) | 63.26 (± 4.32) | 73.90 (± 4.25) | 68.17
RTE-4 | 54.90 (± 3.08) | 59.00 (± 3.05) | 57.73 (± 4.01) | 67.20 (± 4.12) | 62.11
RTE-5 | 57.83 (± 3.95) | 58.83 (± 3.94) | 58.80 (± 5.56) | 59.00 (± 5.57) | 58.90

Table A.2: INIT+WN results per corpus. 133 features are used.

dataset | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
MSR | – | 76.17 (± 2.01) | 79.25 (± 2.24) | 86.92 (± 1.95) | 82.91
RTE-1 | – | 52.88 (± 3.46) | 52.77 (± 4.80) | 54.75 (± 4.88) | 53.74
RTE-2 | – | 57.88 (± 3.42) | 56.55 (± 4.43) | 68.00 (± 4.57) | 61.75
RTE-3 | 61.25 (± 3.38) | 64.38 (± 3.32) | 63.27 (± 4.35) | 72.68 (± 4.31) | 67.65
RTE-4 | 56.00 (± 3.08) | 59.60 (± 3.04) | 58.16 (± 3.99) | 68.40 (± 4.08) | 62.87
RTE-5 | 56.67 (± 3.97) | 60.00 (± 3.92) | 59.93 (± 5.53) | 60.33 (± 5.54) | 60.13

Table A.3: INIT+WN+DEP results per corpus. 136 features are used.

corpus | features (3 classes) | features (2 classes) | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
MSR | – | 5 | – | 72.75 (± 2.10) | 74.94 (± 2.31) | 88.67 (± 1.50) | 81.23
RTE-1 | – | 3 | – | 51.88 (± 3.46) | 52.03 (± 4.83) | 48.00 (± 3.45) | 49.93
RTE-2 | – | 3 | – | 57.50 (± 3.42) | 57.28 (± 4.45) | 59.00 (± 3.24) | 58.13
RTE-3 | 17 | 6 | 64.00 (± 3.33) | 63.13 (± 3.34) | 62.53 (± 4.43) | 70.00 (± 3.18) | 66.05
RTE-4 | 8 | 19 | 53.80 (± 3.09) | 57.90 (± 3.06) | 56.66 (± 3.99) | 67.20 (± 2.91) | 61.48
RTE-5 | 1 | 14 | 51.00 (± 4.00) | 59.00 (± 3.94) | 59.00 (± 5.57) | 59.00 (± 3.94) | 59.00

Table A.4: FHC results per corpus.

corpus | features (3 classes) | features (2 classes) | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
MSR | – | 13 | – | 74.55 (± 2.06) | 77.48 (± 2.28) | 87.01 (± 1.59) | 81.97
RTE-1 | – | 4 | – | 52.38 (± 3.46) | 52.59 (± 4.84) | 48.25 (± 3.45) | 50.33
RTE-2 | – | 10 | – | 59.50 (± 3.41) | 58.37 (± 4.44) | 66.25 (± 3.23) | 62.06
RTE-3 | 10 | 11 | 63.63 (± 3.33) | 62.38 (± 3.36) | 62.95 (± 4.61) | 64.63 (± 3.31) | 63.78
RTE-4 | 21 | 11 | 55.20 (± 3.08) | 57.60 (± 3.06) | 56.51 (± 4.02) | 66.00 (± 2.94) | 60.89
RTE-5 | 14 | 7 | 56.50 (± 3.97) | 60.00 (± 3.92) | 59.74 (± 5.48) | 61.33 (± 3.90) | 60.53

Table A.5: FBS results per corpus.

corpus | features (3 classes) | features (2 classes) | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
MSR | – | 135 | – | 76.00 (± 2.02) | 79.20 (± 2.25) | 86.66 (± 1.60) | 82.76
RTE-1 | – | 124 | – | 53.00 (± 3.46) | 52.93 (± 5.10) | 54.25 (± 3.46) | 53.58
RTE-2 | – | 135 | – | 58.13 (± 3.43) | 56.81 (± 4.78) | 67.75 (± 3.41) | 61.80
RTE-3 | 131 | 134 | 61.13 (± 3.38) | 64.38 (± 3.32) | 63.38 (± 4.37) | 72.20 (± 3.10) | 67.50
RTE-4 | 133 | 122 | 55.20 (± 3.08) | 58.60 (± 3.05) | 57.19 (± 3.97) | 68.40 (± 2.88) | 62.30
RTE-5 | 127 | 120 | 59.33 (± 3.93) | 64.00 (± 3.84) | 63.55 (± 5.36) | 65.67 (± 3.80) | 64.59

Table A.6: BHC results per corpus.

corpus | features (3 classes) | features (2 classes) | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
MSR | – | 133 | – | 76.00 (± 2.02) | 79.11 (± 2.25) | 86.84 (± 1.60) | 82.79
RTE-1 | – | 108 | – | 53.13 (± 3.46) | 53.06 (± 5.11) | 54.25 (± 3.46) | 53.65
RTE-2 | – | 118 | – | 58.75 (± 3.40) | 57.35 (± 4.53) | 68.25 (± 3.28) | 62.33
RTE-3 | 125 | 119 | 61.25 (± 3.38) | 65.00 (± 3.31) | 63.83 (± 4.34) | 73.17 (± 3.07) | 68.18
RTE-4 | 119 | 131 | 55.60 (± 3.08) | 59.70 (± 3.04) | 58.26 (± 3.99) | 68.40 (± 2.88) | 62.93
RTE-5 | 113 | 115 | 59.33 (± 3.93) | 62.83 (± 3.87) | 62.46 (± 5.40) | 64.33 (± 3.83) | 63.38

Table A.7: BBS results per corpus.

method | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
BASE1 | 66.49 (± 2.23) | 66.49 (± 2.23) | 100.00 (± 0.00) | 79.87
BASE2 | 74.32 (± 2.06) | 76.39 (± 2.28) | 88.84 (± 1.49) | 82.14
INIT | 74.84 (± 2.05) | 78.36 (± 2.28) | 85.88 (± 2.02) | 81.95
INIT+WN | 75.77 (± 2.02) | 79.32 (± 2.25) | 85.96 (± 2.01) | 82.51
INIT+WN+DEP | 76.17 (± 2.01) | 79.25 (± 2.24) | 86.92 (± 1.95) | 82.91

Table A.8: Results on MSR corpus.

method | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
BASE1 | 50.00 (± 3.46) | 50.00 (± 3.46) | 100.00 (± 0.00) | 66.67
BASE2 | 55.75 (± 3.44) | 54.32 (± 4.23) | 72.25 (± 3.10) | 62.02
INIT | 51.88 (± 3.46) | 51.85 (± 4.87) | 52.50 (± 4.89) | 52.17
INIT+WN | 52.63 (± 3.46) | 52.48 (± 4.76) | 55.50 (± 4.87) | 53.95
INIT+WN+DEP | 52.88 (± 3.46) | 52.77 (± 4.80) | 54.75 (± 4.88) | 53.74

Table A.9: Results on RTE-1 dataset.

method | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
BASE1 | 50.00 (± 3.46) | 50.00 (± 3.46) | 100.00 (± 0.00) | 66.67
BASE2 | 58.75 (± 3.41) | 59.56 (± 5.03) | 54.50 (± 3.45) | 56.92
INIT | 58.25 (± 3.42) | 57.14 (± 4.51) | 66.00 (± 4.64) | 61.25
INIT+WN | 58.88 (± 3.41) | 57.54 (± 4.46) | 67.75 (± 4.58) | 62.23
INIT+WN+DEP | 57.88 (± 3.42) | 56.55 (± 4.43) | 68.00 (± 4.57) | 61.75

Table A.10: Results on RTE-2 dataset.

method | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
BASE1 | 50.00 (± 3.46) | 51.25 (± 3.46) | 51.25 (± 3.46) | 100.00 (± 0.00) | 67.77
BASE2 | 61.63 (± 3.37) | 62.75 (± 3.35) | 61.72 (± 4.36) | 71.95 (± 3.11) | 66.44
INIT | 60.63 (± 3.39) | 64.00 (± 3.33) | 62.87 (± 4.35) | 72.68 (± 4.31) | 67.42
INIT+WN | 60.75 (± 3.38) | 64.63 (± 3.31) | 63.26 (± 4.32) | 73.90 (± 4.25) | 68.17
INIT+WN+DEP | 61.25 (± 3.38) | 64.38 (± 3.32) | 63.27 (± 4.35) | 72.68 (± 4.31) | 67.65

Table A.11: Results on RTE-3 dataset.

method | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
BASE1 | 50.00 (± 3.10) | 50.00 (± 3.10) | 50.00 (± 3.10) | 100.00 (± 0.00) | 66.67
BASE2 | 55.40 (± 3.08) | 57.30 (± 3.07) | 56.70 (± 4.16) | 61.80 (± 4.01) | 59.14
INIT | 54.10 (± 3.09) | 58.10 (± 3.06) | 56.97 (± 4.03) | 66.20 (± 4.15) | 61.24
INIT+WN | 54.90 (± 3.08) | 59.00 (± 3.05) | 57.73 (± 4.01) | 67.20 (± 4.12) | 62.11
INIT+WN+DEP | 56.00 (± 3.08) | 59.60 (± 3.04) | 58.16 (± 3.99) | 68.40 (± 4.08) | 62.87

Table A.12: Results on RTE-4 dataset.

method | accuracy (%, 3 classes) | accuracy (%, 2 classes) | precision (%, 2 classes) | recall (%, 2 classes) | F-measure (%, 2 classes)
BASE1 | 50.00 (± 4.00) | 50.00 (± 4.00) | 50.00 (± 4.00) | 100.00 (± 0.00) | 66.67
BASE2 | 58.83 (± 3.94) | 62.33 (± 3.88) | 61.42 (± 5.30) | 66.33 (± 3.78) | 63.78
INIT | 59.50 (± 3.93) | 63.83 (± 3.84) | 63.52 (± 5.38) | 65.00 (± 5.40) | 64.25
INIT+WN | 57.83 (± 3.95) | 58.83 (± 3.94) | 58.80 (± 5.56) | 59.00 (± 5.57) | 58.90
INIT+WN+DEP | 56.67 (± 3.97) | 60.00 (± 3.92) | 59.93 (± 5.53) | 60.33 (± 5.54) | 60.13

Table A.13: Results on RTE-5 dataset.

References

E. Alpaydin. 2004. Introduction to Machine Learning. MIT Press.

I. Androutsopoulos and P. Malakasiotis. 2010. A survey of paraphrasing and textual entailment methods. Journal of Artificial Intelligence Research, 38:135–187.

I. Androutsopoulos, G. D. Ritchie, and P. Thanisch. 1995. Natural language interfaces to databases - an introduction. Natural Language Engineering, 1(1):29–81.

I. Androutsopoulos, J. Oberlander, and V. Karkaletsis. 2007. Source authoring for multilingual generation of personalised object descriptions. Natural Language Engineering, 13(3):191–233.

R. Baeza-Yates and B. Ribeiro-Neto. 1999. Modern Information Retrieval. Addison Wesley.

C. F. Baker, C. J. Fillmore, and J. B. Lowe. 1998. The Berkeley FrameNet project. In Proc. of the 17th Int. Conf. on Comp. Linguistics, pages 86–90, Montreal, Quebec, Canada.

C. Bannard and C. Callison-Burch. 2005. Paraphrasing with bilingual parallel corpora. In Proc. of the 43rd Annual Meeting of ACL, pages 597–604, Ann Arbor, MI.

R. Bar-Haim, I. Dagan, B. Dolan, L. Ferro, D. Giampiccolo, B. Magnini, and I. Szpektor. 2006. The 2nd PASCAL recognising textual entailment challenge. In Proc. of the 2nd PASCAL Challenges Workshop on Recognising Textual Entailment, Venice, Italy.

R. Bar-Haim, I. Dagan, I. Greental, and E. Shnarch. 2007. Semantic inference at the lexical-syntactic level. In Proc. of the 22nd Conf. on Artificial Intelligence, pages 871–876, Vancouver, BC, Canada.

R. Bar-Haim, J. Berant, and I. Dagan. 2009. A compact forest for scalable inference over entailment and paraphrase rules. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 1056–1065, Singapore.

R. Barzilay and N. Elhadad. 2003. Sentence alignment for monolingual comparable corpora. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 25–32, Sapporo, Japan.

R. Barzilay and L. Lee. 2002. Bootstrapping lexical choice via multiple-sequence alignment. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 164–171, Philadelphia, PA.

R. Barzilay and L. Lee. 2003. Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. In Proc. of the Human Language Technology Conf. of NAACL, pages 16–23, Edmonton, Canada.

R. Barzilay and K. McKeown. 2001. Extracting paraphrases from a parallel corpus. In Proc. of the 39th Annual Meeting of ACL, pages 50–57, Toulouse, France.

R. Barzilay and K. R. McKeown. 2005. Sentence fusion for multidocument news summarization. Comp. Linguistics, 31(3):297–327.

J. Bateman and M. Zock. 2003. Natural language generation. In R. Mitkov, editor, The Oxford Handbook of Comp. Linguistics, chapter 15, pages 284–304. Oxford University Press.

S. Bayer, J. Burger, L. Ferro, J. Henderson, and E. Yeh. 2005. MITRE's submission to the EU PASCAL RTE challenge. In PASCAL Proc. of the First Challenge Workshop. Recognizing Textual Entailment, pages 41–44, Southampton, UK.

J. Bensley and A. Hickl. 2008. Workshop: Application of LCC's GROUNDHOG system for RTE-4. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

R. Bergmair. 2009. A proposal on evaluation measures for RTE. In Proc. of the ACL Workshop on Applied Textual Inference, pages 10–17, Singapore.

R. C. Berwick. 1991. Principles of principle-based parsing. In R. C. Berwick, S. P. Ab- ney, and C. Tenny, editors, Principle-Based Parsing: Computation and Psycholin- guistics, pages 1–37. Kluwer, Dordrecht, Netherlands.

R. Bhagat and D. Ravichandran. 2008. Large scale acquisition of paraphrases for learn- ing surface patterns. In Proc. of 46th Annual Meeting of ACL: Human Language Technologies, pages 674–682, Columbus, OH.

R. Bhagat, P. Pantel, and E. Hovy. 2007. LEDIR: An unsupervised algorithm for learn- ing directionality of inference rules. In Proc. of the Conf. on Empirical Methods in Natural Language Processing and the Conf. on Computational Natural Language Learning, pages 161–170, Prague, Czech Republic.

D. M. Bikel, R. L. Schwartz, and R. M. Weischedel. 1999. An algorithm that learns what's in a name. Machine Learning, 34(1–3):211–231.

A. Blum and T. Mitchell. 1998. Combining labeled and unlabeled data with co- training. In Proc. of the 11th Annual Conf. on Computational Learning Theory, pages 92–100, Madison, WI.

J. Bos and K. Markert. 2005. Recognising textual entailment with logical inference. In Proc. of the Conf. on Human Language Technology and Empirical Methods in Natural Language Processing, pages 628–635, Vancouver, BC, Canada.

E. Brill. 1992. A simple rule-based part of speech tagger. In Proc. of the 3rd Conf. on Applied Natural Language Processing, pages 152–155, Trento, Italy.

C. Brockett and W.B. Dolan. 2005. Support Vector Machines for paraphrase identifi- cation and corpus construction. In Proc. of the 3rd Int. Workshop on Paraphrasing, pages 1–8, Jeju island, Korea.

P. F. Brown, S. A. Della Pietra, V. J. Della Pietra, and R. L. Mercer. 1993. The mathe- matics of statistical machine translation: Parameter estimation. Comp. Linguistics, 19(2):263–311.

A. Budanitsky and G. Hirst. 2006. Evaluating WordNet-based measures of lexical semantic relatedness. Comp. Linguistics, 32(1):13–47.

A. Burchardt, N. Reiter, S. Thater, and A. Frank. 2007. A semantic approach to tex- tual entailment: System evaluation and task analysis. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pages 10–15, Prague, Czech Republic. ACL.

A. Burchardt, M. Pennacchiotti, S. Thater, and M. Pinkal. 2009. Assessing the impact of frame semantics on textual entailment. Natural Language Engineering, 15(4).

M.E. Califf and R.J. Mooney. 2003. Bottom-up relational learning of pattern matching rules for information extraction. Journal of Machine Learning Research, 4:177– 210.

C. Callison-Burch, P. Koehn, and M. Osborne. 2006a. Improved statistical machine translation using paraphrases. In Proc. of the Human Language Technology Conf. of the NAACL, pages 17–24, New York, NY.

C. Callison-Burch, M. Osborne, and P. Koehn. 2006b. Re-evaluating the role of BLEU in machine translation research. In Proc. of the 11th Conf. of EACL, pages 249–256, Trento, Italy.

C. Callison-Burch, I. Dagan, C. Manning, M. Pennacchiotti, and F. M. Zanzotto, ed- itors. 2009. Proc. of the ACL-IJCNLP Workshop on Applied Textual Inference. Singapore.

C. Callison-Burch. 2008. Syntactic constraints on paraphrases extracted from parallel corpora. In Proc. of the Conf. on Empirical Methods in Natural Language Process- ing, pages 196–205, Honolulu, HI, October.

J. Carletta. 1996. Assessing agreement on classification tasks: The kappa statistic. Computational Linguistics, 22:249–254.

R. Carnap. 1952. Meaning postulates. Philosophical Studies, 3(5).

E. Charniak. 2000. A maximum-entropy-inspired parser. In Proc. of the 1st Conf. of NAACL, pages 132–139, Seattle, WA.

J. Chevelu, T. Lavergne, Y. Lepage, and T. Moudenc. 2009. Introduction of a new para- phrase generation tool based on Monte-Carlo sampling. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Natural Language Processing of AFNLP, pages 249–252, Singapore.

J. Clarke and M. Lapata. 2008a. Global inference for sentence compression: An integer linear programming approach. Journal of Artificial Intelligence Research, 31(1):399–429.

J. Clarke and M. Lapata. 2008b. Global inference for sentence compression: An integer linear programming approach. Journal of Artificial Intelligence Research, 31(1):399–429.

J. Cohen. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20:37–46.

T. Cohn and M. Lapata. 2008. Sentence compression beyond word deletion. In Proc. of the 22nd Int. Conf. on Comp. Linguistics, Manchester, UK.

T. Cohn and M. Lapata. 2009. Sentence compression as tree transduction. Journal of Artificial Intelligence Research, 34(1):637–674.

T. Cohn, C. Callison-Burch, and M. Lapata. 2008. Constructing corpora for the devel- opment and evaluation of paraphrase systems. Comp. Linguistics, 34(4):597–614.

M. Collins and T. Koo. 2005. Discriminative reranking for natural language parsing. Computational Linguistics, 31(1):25–69.

M. Collins. 2003. Head-driven statistical models for natural language parsing. Com- put. Linguistics, 29(4):589–637.

C. Corley and R. Mihalcea. 2005. Measuring the semantic similarity of texts. In Proc. of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment, pages 13–18, Ann Arbor, MI.

N. Cristianini and J. Shawe-Taylor. 2000. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press.

I. Dagan, O. Glickman, and B. Magnini. 2006. The PASCAL recognising textual entailment challenge. In J. Quiñonero-Candela, I. Dagan, B. Magnini, and F. d'Alché Buc, editors, Machine Learning Challenges. Lecture Notes in Computer Science, volume 3944, pages 177–190. Springer-Verlag.

I. Dagan, B. Dolan, B. Magnini, and D. Roth. 2009. Recognizing textual entailment: Rational, evaluation and approaches. Natural Language Engineering, 15(4):i–xvii. Introduction to the special issue on Textual Entailment.

D. Das and N. A. Smith. 2009. Paraphrase identification as probabilistic quasi- synchronous recognition. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Natural Language Processing of AFNLP, pages 468–476, Singapore.

M.C. de Marneffe, A.N. Rafferty, and C.D. Manning. 2008. Finding contradictions in text. In Proc. of the 46th Annual Meeting of ACL: Human Language Technologies, pages 1039–1047, Columbus, OH.

L. Deléger and P. Zweigenbaum. 2009. Extracting lay paraphrases of specialized ex- pressions from monolingual comparable medical corpora. In Proc. of the 2nd Work- shop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora, pages 2–10, Singapore.

B. Di Eugenio and M. Glass. 2004. The Kappa statistic: A second look. Comp. Linguistics, 30(1):95–101.

W. B. Dolan and C. Brockett. 2005. Automatically constructing a corpus of sentential paraphrases. In Proc. of the 3rd Int. Workshop on Paraphrasing, pages 9–16, Jeju island, Korea.

B. Dolan and I. Dagan, editors. 2005. Proc. of the ACL workshop on Empirical Model- ing of Semantic Equivalence and Entailment. Ann Arbor, MI.

B. Dolan, C. Quirk, and C. Brockett. 2004. Unsupervised construction of large paraphrase corpora: Exploiting massively parallel news sources. In Proc. of the 20th Int. Conf. on Comp. Linguistics, pages 350–356, Geneva, Switzerland.

M. Drass and K. Yamamoto, editors. 2005. Proc. of the 3rd Int. Workshop on Para- phrasing. Jeju island, Korea.

P. A. Duboue and J. Chu-Carroll. 2006. Answering the question you wish they had asked: The impact of paraphrasing for question answering. In Proc. of the Human Language Technology Conf. of NAACL, pages 33–36, New York, NY.

F. Duclaye, F. Yvon, and O. Collin. 2003. Learning paraphrases to improve a question- answering system. In Proc. of the EACL Workshop on Natural Language Processing for Question Answering, pages 35–41, Budapest, Hungary.

R. Durbin, S. Eddy, A. Krogh, and G. Mitchison. 1998. Biological Sequence Analysis. Cambridge University Press.

N. Elhadad and K. Sutaria. 2007. Mining a lexicon of technical terms and lay equiva- lents. In Proc. of the Workshop on BioNLP, pages 49–56, Prague, Czech Republic.

C. Fellbaum. 1998. WordNet: An Electronic Lexical Database. MIT Press.

Ó. Ferrández, R. Muñoz, and M. Palomar. 2009. Alicante university at TAC 2009: Experiments in RTE. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

A. Finch, Y. S. Hwang, and E. Sumita. 2005. Using machine translation evaluation techniques to determine sentence-level semantic equivalence. In Proc. of the 3rd Int. Workshop on Paraphrasing, pages 17–24, Jeju Island, Korea.

R. A. Fisher. 1925. Statistical Methods for Research Workers. Oliver and Boyd.

Y. Freund and R. E. Schapire. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In Proc. of the 2nd European Conf. on Computational Learning Theory, pages 23–37, Barcelona, Spain.

J. Friedman, T. Hastie, and R. Tibshirani. 2000. Additive logistic regression: a statisti- cal view of boosting. Annals of Statistics, 28(2):337–374.

P. Fung and P. Cheung. 2004. Multi-level bootstrapping for extracting parallel sen- tences from a quasi-comparable corpus. In Proc. of the 20th Int. Conf. on Comp. Linguistics, pages 1051–1057, Geneva, Switzerland.

D. Galanis and I. Androutsopoulos. 2010. An extractive supervised two-stage method for sentence compression. In Proc. of the HLT Conf. of NAACL, Los Angeles, CA.

D. Galanis and P. Malakasiotis. 2008. AUEB at TAC 2008. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

W.A. Gale and K.W. Church. 1993. A program for aligning sentences in bilingual corpora. Comp. Linguistics, 19(1):75–102.

U. Germann, M. Jahr, K. Knight, D. Marcu, and K. Yamada. 2001. Fast decoding and optimal decoding for machine translation. In Proc. of the 39th Annual Meeting on ACL, pages 228–235, Toulouse, France.

D. Giampiccolo, B. Magnini, I. Dagan, and B. Dolan. 2007. The third PASCAL rec- ognizing textual entailment challenge. In Proc. of the ACL-Pascal Workshop on Textual Entailment and Paraphrasing, pages 1–9, Prague, Czech Republic.

D. Giampiccolo, H.T. Dang, B. Magnini, I. Dagan, and B. Dolan. 2008. The fourth PASCAL recognizing textual entailment challenge. In Proc. of the Text Analysis Conference, pages 1–9, Gaithersburg, MD.

O. Glickman and I. Dagan. 2004. Acquiring lexical paraphrases from a single corpus. In N. Nicolov, K. Bontcheva, G. Angelova, and R. Mitkov, editors, Recent Advances in Natural Language Processing III, pages 81–90. John Benjamins.

O. Glickman, I. Dagan, and M. Koppel. 2005. Web based probabilistic textual entailment. In PASCAL Proc. of the First Challenge Workshop. Recognizing Textual Entailment, pages 33–36, Southampton, UK.

I. J. Good. 1963. Maximum entropy for hypothesis formulation, especially for multidimensional contingency tables. Annals of Mathematical Statistics, 34:911–934.

R. Grishman. 2003. Information extraction. In R. Mitkov, editor, The Oxford Hand- book of Comp. Linguistics, chapter 30, pages 545–559. Oxford University Press.

I. M. Guyon, S. R. Gunn, M. Nikravesh, and L. Zadeh, editors. 2006. Feature extrac- tion, foundations and applications. Springer.

N. Habash and B. Dorr. 2003. A categorial variation database for english. In Proc. of the Human Language Technology Conf. of NAACL, pages 17–23, Edmonton, Canada.

A. D. Haghighi. 2005. Robust textual inference via graph matching. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 387–394, Vancouver, BC, Canada.

S. Harabagiu and A. Hickl. 2006. Methods for using textual entailment in open-domain question answering. In Proc. of the 21st Int. Conf. on Comp. Linguistics and the 44th Annual Meeting of ACL, pages 905–912, Sydney, Australia.

S. Harabagiu and D. Moldovan. 2003. Question answering. In R. Mitkov, editor, The Oxford Handbook of Comp. Linguistics, chapter 31, pages 560–582. Oxford University Press.

S. M. Harabagiu, S. J. Maiorano, and M. A. Pasca. 2003. Open-domain textual question answering techniques. Natural Language Engineering, 9(3):231–267.

S. Harabagiu, A. Hickl, and F. Lacatusu. 2006. Negation, contrast and contradiction in . In Proc. of the 21st National Conf. on Artificial Intelligence, pages 755–762, Boston, MA.

S. Harmeling. 2009. Inferring textual entailment with a probabilistically sound calcu- lus. Natural Language Engineering, 15(4):459–477.

Z. Harris. 1964. Distributional Structure. In J.J. Katz and J.A. Fodor, editors, The Philosphy of Linguistics, pages 33–49. Oxford University Press.

C. Hashimoto, K. Torisawa, K. Kuroda, S. De Saeger, M. Murata, and J. Kazama. 2009. Large-scale verb entailment acquisition from the Web. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 1172–1181, Singapore.

M.A. Hearst. 1998. Automated discovery of Wordnet relations. In C. Fellbaum, editor, WordNet: An Electronic Lexical Database. MIT Press.

A. Herbelot. 2009. Finding word substitutions using a distributional similarity baseline and immediate context overlap. In Proc. of the Student Research Workshop of the 12th Conf. of EACL, pages 28–36, Athens, Greece.

J. Herrera, A. Peñas, and F. Verdejo. 2005. Textual entailment recognition based on dependency analysis and WordNet. In PASCAL Proc. of the First Challenge Workshop. Recognizing Textual Entailment, pages 21–24, Southampton, UK.

A. Hickl and J. Bensley. 2007. A discourse commitment-based framework for recognizing textual entailment. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pages 171–176, Prague, Czech Republic.

A. Hickl, J. Williams, J. Bensley, K. Roberts, B. Rink, and Y. Shi. 2006. Recognizing textual entailment with LCC's GROUNDHOG system. In Proc. of the 2nd PASCAL Challenges Workshop on Recognising Textual Entailment, pages 80–85, Venice, Italy.

A. Hickl. 2008. Using discourse commitments to recognize textual entailment. In Proc. of the 22nd Int. Conf. on Comp. Linguistics, pages 337–344, Manchester, UK.

J. Hobbs. 1986. Resolving pronoun references. In Readings in Natural Language Processing, pages 339–352. Morgan Kaufmann.

E. Hovy. 2003. Text summarization. In R. Mitkov, editor, The Oxford Handbook of Comp. Linguistics, chapter 32, pages 583–598. Oxford University Press.

S. Huffman. 1995. Learning information extraction patterns from examples. In Proc. of the IJCAI Workshop on New Approaches to Learning for Natural Language Processing, pages 127–142, Montreal, Quebec, Canada.

A. Ibrahim, B. Katz, and J. Lin. 2003. Extracting structural paraphrases from aligned monolingual corpora. In Proc. of the ACL Workshop on Paraphrasing, pages 57–64, Sapporo, Japan.

A. Iftene and A. Balahur-Dobrescu. 2007. Hypothesis transformation and semantic variability rules used in recognizing textual entailment. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pages 125–130, Prague, Czech Republic.

A. Iftene and M.-A. Moruz. 2009. UAIC participation at RTE5. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

A. Iftene. 2008. UAIC participation at RTE4. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

K. Inui and U. Hermjakob, editors. 2003. Proc. of the 2nd Int. Workshop on Paraphrasing: Paraphrase Acquisition and Applications. Sapporo, Japan.

C. Jacquemin and D. Bourigault. 2003. Term extraction and automatic indexing. In R. Mitkov, editor, The Oxford Handbook of Comp. Linguistics, chapter 33, pages 599–615. Oxford University Press.

M. A. Jaro. 1995. Probabilistic linkage of large public health data file. Statistics in Medicine, 14:491–498.

E. T. Jaynes. 1957. Information theory and statistical mechanics. Physical Review, 106:620–630.

T. Joachims. 2002. Learning to Classify Text Using Support Vector Machines: Meth- ods, Theory, Algorithms. Kluwer.

D. Jurafsky and J. H. Martin. 2008. Speech and Language Processing. Prentice Hall, 2nd edition.

D. Kauchak and R. Barzilay. 2006. Paraphrasing for automatic evaluation. In Proc. of the Human Language Technology Conf. of NAACL, pages 455–462, New York, NY.

D. Klein and C. D. Manning. 2003. Accurate unlexicalized parsing. In Proc. of the 41st Annual Meeting of ACL, pages 423–430, Sapporo, Japan.

K. Knight and D. Marcu. 2002. Summarization beyond sentence extraction: A probabilistic approach to sentence compression. Artificial Intelligence, 139(1):91–107.

P. Koehn, F. J. Och, and D. Marcu. 2003. Statistical phrase-based translation. In Proc. of the Human Language Technology Conf. of NAACL, pages 48–54, Edmonton, Canada. ACL.

P. Koehn. 2004. Pharaoh: a beam search decoder for phrase-based statistical machine translation models. In Proc. of the 6th Conf. of the Association for Machine Trans- lation in the Americas, pages 115–124, Washington, DC.

P. Koehn. 2009. Statistical Machine Translation. Cambridge University Press.

U.S. Kohomban and W.S. Lee. 2005. Learning semantic classes for word sense disam- biguation. In Proc. of the 43rd Annual Meeting of ACL, pages 34–41, Ann Arbor, MI.

S. Kok and C. Brockett. 2010. Hitting the right paraphrases in good time. In Proc. of HLT-NAACL, pages 145–153, Los Angeles, CA.

M. Kouylekov and B. Magnini. 2005. Recognizing textual entailment with tree edit distance algorithms. In Proc. of the PASCAL Recognising Textual Entailment Chal- lenge.

S. Kubler, R. McDonald, and J. Nivre. 2009. Dependency Parsing. Synthesis Lectures on Human Language Technologies. Morgan and Claypool Publishers.

S. Lappin and H.J. Leass. 1994. An algorithm for pronominal anaphora resolution. Comp. Linguistics, 20(4):535–561.

C. Leacock, G. Miller, and M. Chodorow. 1998. Using corpus statistics and WordNet relations for sense identification. Comp. Linguistics, 24(1):147–165.

Y. Lepage and E. Denoual. 2005. Automatic generation of paraphrases to be used as translation references in objective evaluation measures of machine translation. In Proc. of the 3rd Int. Workshop on Paraphrasing, pages 57–64, Jeju Island, Korea.

V. Levenshtein. 1966. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10:707–710.

F. Li, Z. Zheng, F. Bu, Y. Tang, X. Zhu, and M. Huang. 2009. THU QUANTA at TAC 2009 KBP and RTE track. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

D. Lin and P. Pantel. 2001. Discovery of inference rules for question answering. Nat- ural Language Engineering, 7:343–360.

D. Lin. 1994. PRINCIPAR: an efficient, broad-coverage, principle-based parser. In Proc. of the 15th Conf. on Comp. Linguistics, pages 482–488, Kyoto, Japan. ACL.

D. Lin. 1998a. Automatic retrieval and clustering of similar words. In Proc. of the the 36th Annual Meeting of ACL and 17th Int. Conf. on Comp. Linguistics, pages 768–774, Montreal, Quebec, Canada.

D. Lin. 1998b. An information-theoretic definition of similarity. In Proc. of the 15th Int. Conf. on Machine Learning, pages 296–304, Madison, WI.

B. Lonneker-Rodman and C.F. Baker. 2009. The FrameNet model and its applications. Natural Language Engineering, 15(3):414–453.

N. Madnani and B.J. Dorr. 2010. Generating phrasal and sentential paraphrases: A survey of data-driven methods. Computational Linguistics, 36(3):341–387.

N. Madnani, F. Ayan, P. Resnik, and B. J. Dorr. 2007. Using paraphrases for parameter tuning in statistical machine translation. In Proc. of 2nd Workshop on Statistical Machine Translation, pages 120–127, Prague, Czech Republic.

P. Malakasiotis and I. Androutsopoulos. 2007. Learning textual entailment using SVMs and string similarity measures. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pages 42–47, Prague, June. ACL.

P. Malakasiotis. 2009a. Paraphrase recognition using machine learning to combine similarity measures. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Natural Language Processing of AFNLP, Singapore.

P. Malakasiotis. 2009b. AUEB at TAC 2009. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

I. Mani. 2001. Automatic Summarization. John Benjamins.

C. D. Manning and H. Schuetze. 1999. Foundations of Statistical Natural Language Processing. MIT press.

C. D. Manning. 2008. Introduction to Information Retrieval. Cambridge University Press.

Y. Marton, C. Callison-Burch, and P. Resnik. 2009. Improved statistical machine trans- lation using monolingually-derived paraphrases. In Proc. of Conf. on Empirical Methods in Nat. Lang. Processing, pages 381–390, Singapore.

R. McDonald. 2006. Discriminative sentence compression with soft syntactic con- straints. In Proc. of the 11th Conf. of EACL, Trento, Italy.

K. McKeown. 1983. Paraphrasing questions using given and new information. Comp. Linguistics, 9(1).

Y. Mehdad, A. Moschitti, and F.M. Zanzotto. 2009. SemKer: Syntactic/semantic kernels for recognizing textual entailment. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

Y. Mehdad. 2009. Automatic cost estimation for tree edit distance using particle swarm optimization. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Natural Language Processing of AFNLP, pages 289–292, Singapore.

D. Melamed. 1999. Bitext maps and alignment via pattern recognition. Comp. Lin- guistics, 25(1):107–130.

I. Melcuk. 1987. Dependency Syntax: Theory and Practice. State University of New York Press.

A. Meyers, C. Macleod, R. Yangarber, R. Grishman, L. Barrett, and R. Reeves. 1998. Using NOMLEX to produce nominalization patterns for information extraction. In Proc. of the COLING-ACL workshop on the Computational Treatment of Nominals, Montreal, Quebec, Canada.

M. Mintz, S. Bills, R. Snow, and D. Jurafsky. 2009. Distant supervision for relation extraction without labeled data. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Natural Language Processing of AFNLP, pages 1003–1011, Singapore.

S. Mirkin, I. Dagan, and E. Shnarch. 2009a. Evaluating the inferential utility of lexical-semantic resources. In Proc. of the 12th Conf. of EACL, pages 558–566, Athens, Greece.

S. Mirkin, L. Specia, N. Cancedda, I. Dagan, M. Dymetman, and I. Szpektor. 2009b. Source-language entailment modeling for translating unknown terms. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Natural Language Processing of AFNLP, pages 791–799, Singapore.

T. Mitchell. 1997. Machine Learning. Mc-Graw Hill.

R. Mitkov. 2003. Anaphora resolution. In R. Mitkov, editor, The Oxford Handbook of Comp. Linguistics, chapter 14, pages 266–283. Oxford University Press.

M.F. Moens. 2006. Information Extraction: Algorithms and Prospects in a Retrieval Context. Springer.

D. Moldovan and V. Rus. 2001. Logic form transformation of WordNet and its appli- cability to question answering. In Proc. of the 39th Annual Meeting of ACL, pages 402–409, Toulouse, France.

D. Mollá and J.L. Vicedo. 2007. Question answering in restricted domains: An overview. Comp. Linguistics, 33(1):41–61.

D. Mollá, R. Schwitter, F. Rinaldi, J. Dowdall, and M. Hess. 2003. Anaphora reso- lution in EXTRANS. In Proc. of the Int. Symposium on Reference Resolution and Its Applications to Question Answering and Summarization, pages 23–25, Venice, Italy.

R. C. Moore. 2001. Towards a simple and accurate statistical approach to learning translation relationships among words. In Proc. of the ACL Workshop on Data- Driven Machine Translation, Toulouse, France.

D. S. Munteanu and D. Marcu. 2006. Improving machine translation performance by exploiting non-parallel corpora. Comp. Linguistics, 31(4):477–504.

I. Muslea. 1999. Extraction patterns for information extraction tasks: a survey. In Proc. of the AAAI Workshop on Machine Learning for Information Extraction, Orlando, FL.

R. Navigli. 2008. A structural approach to the automatic adjudication of word sense disagreements. Natural Language Engineering, 14(4):547–573.

R. Nelken and S. M. Shieber. 2006. Towards robust context-sensitive sentence align- ment for monolingual corpora. In Proc. of the 11th Conf. of EACL, pages 161–168, Trento, Italy.

R.D. Nielsen, W. Ward, and J.H. Martin. 2009. Recognizing entailment in intelligent tutoring systems. Natural Language Engineering, 15(4):479–501.

J. Nivre, J. Hall, J. Nilsson, A. Chanev, G. Eryigit, S. Kuebler, S. Marinov, and E. Marsi. 2007. MALTPARSER: a language-independent system for data-driven dependency parsing. Nat. Lang. Engineering, 13(2):95–135.

F. J. Och and H. Ney. 2003. A systematic comparison of various statistical alignment models. Comp. Linguistics, 29(1):19–51.

S. Padó, M. Galley, D. Jurafsky, and C. D. Manning. 2009. Robust machine translation evaluation with entailment features. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Natural Language Processing of AFNLP, pages 297– 305, Singapore.

B. Pang, K. Knight, and D. Marcu. 2003. Syntax-based alignment of multiple transla- tions: extracting paraphrases and generating new sentences. In Proc. of the Human Lang. Techn. Conf. of NAACL, pages 102–109, Edmonton, Canada.

K. Papineni, S. Roukos, T. Ward, and W. J. Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proc. of the 40th Annual Meeting on ACL, pages 311–318, Philadelphia, PA.

M. Pasca and P. Dienes. 2005. Aligning needles in a haystack: Paraphrase acquisition across the Web. In Proc. of the 2nd Int. Joint Conf. on Natural Language Process- ing, pages 119–130, Jeju Island, Korea.

M. Pasca. 2003. Open-domain question answering from large text collections. Center for the Study of Language and Information, 2nd edition.

D. Perez and E. Alfonseca. 2005. Application of the BLEU algorithm for recognizing textual entailments. In Proc. of the PASCAL Challenges Workshop on Recognising Textual Entailment, Southampton, UK.

M. F. Porter. 1997. An algorithm for suffix stripping. In K. Sparck Jones and P. Willet, editors, Readings in Information Retrieval, pages 313–316. Morgan Kaufmann.

R. Power and D. Scott. 2005. Automatic generation of large-scale paraphrases. In Proc. of the 3rd Int. Workshop on Paraphrasing, pages 73–79, Jesu Island, Korea.

L. Qiu, M. Y. Kan, and T.S. Chua. 2006. Paraphrase recognition via dissimilarity significance classification. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 18–26, Sydney, Australia.

J. R. Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann.

C. Quirk, C. Brockett, and W. B. Dolan. 2004. Monolingual machine translation for paraphrase generation. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 142–149, Barcelona, Spain.

D. Ravichandran and E. Hovy. 2002. Learning surface text patterns for a question answering system. In Proc. of the 40th Annual Meeting on ACL, pages 41–47, Philadelphia, PA.

D. Ravichandran, A. Ittycheriah, and S. Roukos. 2003. Automatic derivation of surface text patterns for a maximum entropy based question answering system. In Proc. of the Human Language Technology Conf. of NAACL, pages 85–87, Edmonton, Canada.

E. Reiter and R. Dale. 2000. Building Natural Language Generation Systems. Cam- bridge University Press.

P. Resnik. 1999. Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of Artifi- cial Intelligence Research, 11:95–130.

S. Riezler and Y. Liu. 2010. Query rewriting using monolingual statistical machine translation. Computational Linguistics, 36(3):569–582.

S. Riezler, A. Vasserman, I. Tsochantaridis, V. Mittal, and Y. Liu. 2007. Statistical machine translation for query expansion in answer retrieval. In Proc. of the 45th ACL, pages 464–471, Prague, Czech Republic.

E. Riloff and R. Jones. 1999. Learning dictionaries for information extraction by multi- level bootstrapping. In Proc. of the 16th National Conf. on Artificial Intelligence, pages 474–479, Orlando, FL.

E. Riloff. 1996a. Automatically generating extraction patterns from untagged text. In Proc. of the 13th National Conf. on Artificial Intelligence, pages 1044–1049, Portland, OR.

E. Riloff. 1996b. An empirical study of automated dictionary construction for infor- mation extraction in three domains. Artificial Intelligence, 85(1–2):101–134.

F. Rinaldi, J. Dowdall, K. Kaljurand, M. Hess, and D. Molla. 2003. Exploiting paraphrases in a question answering system. In Proc. of the 2nd Int. Workshop on Paraphrasing, pages 25–32, Sapporo, Japan.

S. Sato and H. Nakagawa, editors. 2001. Proc. of the Workshop on Automatic Para- phrasing. Tokyo, Japan.

G. Schohn and D. Cohn. 2000. Less is more: active learning with Support Vector Machines. In Proc. of the 17th Int. Conf. on Machine Learning, pages 839–846, Stanford, CA.

K. Kipper Schuler. 2005. VerbNet: A Broad-Coverage, Comprehensive Verb Lexicon. Ph.D. thesis, Univ. of Pennsylvania.

S. Sekine and E. Ranchhod, editors. 2009. Named Entities – Recognition, Classifica- tion and Use. John Benjamins.

S. Sekine, K. Inui, I. Dagan, B. Dolan, D. Giampiccolo, and B. Magnini, editors. 2007. Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphras- ing. Prague, Czech Republic.

S. Selkow. 1977. The tree-to-tree editing problem. Information Processing Letters, 6(6):184–186, December.

Y. Shinyama and S. Sekine. 2003. Paraphrase acquisition for information extraction. In Proc. of the ACL Workshop on Paraphrasing, Sapporo, Japan.

R. Siblini and L. Kosseim. 2008. Using ontology alignment for the TAC RTE challenge. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

D. D. Sleator and D. Temperley. 1993. Parsing English with a link grammar. In Proc. of the 3rd Int. Workshop on Parsing Technologies, pages 277–292, Tilburg, Netherlands and Durbuy, Belgium.

S. Soderland, D. Fisher, J. Aseltine, and W. G. Lehnert. 1995. CRYSTAL: Inducing a conceptual dictionary. In Proc. of the 14th Int. Joint Conf. on Artificial Intelligence, pages 1314–1319, Montreal, Quebec, Canada.

S. Soderland. 1999. Learning information extraction rules for semi-structured and free text. Machine Learning, 34(1–3):233–272.

M. Stevenson and Y. Wilks. 2003. Word sense disambiguation. In R. Mitkov, editor, The Oxford Handbook of Comp. Linguistics, chapter 13, pages 249–265. Oxford University Press.

A. Stolcke. 2002. SRILM – an extensible language modeling toolkit. In Proc. of the 7th Int. Conf. on Spoken Language Processing, pages 901–904, Denver, CO.

I. Szpektor and I. Dagan. 2008. Learning entailment rules for unary templates. In Proc. of the 22nd Int. Conf. on Comp. Linguistics, pages 849–856, Manchester, UK.

I. Szpektor, H. Tanev, I. Dagan, and B. Coppola. 2004. Scaling Web-based acquisition of entailment relations. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, Barcelona, Spain.

I. Szpektor, I. Dagan, R. Bar-Haim, and J. Goldberger. 2008. Contextual preferences. In Proc. of 46th Annual Meeting of ACL: Human Language Technologies, pages 683–691, Columbus, OH.

K-C. Tai. 1979. The tree-to-tree correction problem. Journal of the ACM, 26(3):422–433.

M. Tatu and D. Moldovan. 2005. A semantic approach to recognizing textual en- tailment. In Proc. of the Conf. on Human Language Technology and Empirical Methods in Nat. Lang. Processing, pages 371–378, Vancouver, Canada.

M. Tatu and D. Moldovan. 2007. COGEX at RTE 3. In Proc. of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, pages 22–27, Prague, Czech Republic.

M. Tatu, B. Iles, J. Slavick, A. Novischi, and D. Moldovan. 2006. COGEX at the second recognizing textual entailment challenge. In Proc. of the 2nd PASCAL Challenges Workshop on Recognising Textual Entailment, pages 104–109, Venice, Italy.

N. Tomuro. 2003. Interrogative reformulation patterns and acquisition of question paraphrases. In Proc. of the second Int. workshop on Paraphrasing, pages 33–40, Sapporo, Japan.

S. Tong and D. Koller. 2002. Support Vector Machine active learning with applications to text classification. Machine Learning Research, 2:45–66.

K. Toutanova, D. Klein, C. D. Manning, and Y. Singer. 2003. Feature-rich part-of- speech tagging with a cyclic dependency network. In Proc. of the Human Language Technology Conf. of NAACL, pages 173–180, Edmonton, Canada.

G. Tsatsaronis. 2009. Word Sense Disambiguation and Text Relatedness Based on Word Thesauri. Ph.D. thesis, Department of Informatics, Athens University of Economics and Business.

V. Vapnik. 1998. Statistical learning theory. John Wiley.

Z. Vendler. 1967. Verbs and Times. In Linguistics in Philosophy, chapter 4, pages 97–121. Cornell University Press.

E.M. Voorhees. 2001. The TREC QA track. Natural Language Engineering, 7(4):361–378.

E.M. Voorhees. 2008. Contradictions and justifications: Extensions to the textual en- tailment task. In Proc. of 46th Annual Meeting of ACL: Human Language Tech- nologies, pages 63–71, Columbus, OH.

S. Wan, M. Dras, R. Dale, and C. Paris. 2006. Using dependency-based features to take the “para-farce” out of paraphrase. In Proc. of the Australasian Language Technology Workshop, pages 131–138, Sydney, Australia.

R. Wang and G. Neumann. 2008. A divide-and-conquer strategy for recognizing textual entailment. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

R. Wang, Y. Zhang, and G. Neumann. 2009a. A joint syntactic-semantic representa- tion for recognizing textual relatedness. In Proc. of the Text Analysis Conference, Gaithersburg, MD.

X. Wang, D. Lo, J. Jiang, L. Zhang, and H. Mei. 2009b. Extracting paraphrases of technical terms from noisy parallel software corpora. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Natural Language Processing of AFNLP, pages 197–200, Singapore.

W. E. Winkler. 1999. The state of record linkage and current research problems. Sta- tistical Research Report RR99/04, US Bureau of the Census, Washington, DC.

I. H. Witten and E. Frank. 2005. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.

D. Wu. 2000. Alignment. In R. Dale, H. Moisl, and H. Somers, editors, Handbook of Natural Language Processing, pages 415–458. Marcel Dekker.

S. Wubben, A. van den Bosch, E. Krahmer, and E. Marsi. 2009. Clustering and matching headlines for automatic paraphrase acquisition. In Proc. of the 12th European Workshop on Natural Language Generation, pages 122–125, Athens, Greece.

F. Xu, H. Uszkoreit, and H. Li. 2007. A seed-driven bottom-up machine learning framework for extracting relations of various complexity. In Proc. of the 45th Annual Meeting of the Association of Comp. Linguistics, pages 584–591, Prague, Czech Republic.

X. Yang, J. Su, and C. L. Tan. 2008. A twin-candidate model for learning-based anaphora resolution. Comp. Linguistics, 34(3):327–356.

D. Yarowsky. 2000. Word-sense disambiguation. In R. Dale, H. Moisl, and H. Somers, editors, Handbook of Natural Language Processing, pages 629–654. Marcel Dekker.

A. Zaenen, L. Karttunen, and Richard Crouch. 2005. Local textual inference: Can it be defined or circumscribed? In Proc. of the ACL workshop on Empirical Modeling of Semantic Equivalence and Entailment, pages 31–36, Ann Arbor, MI.

F.M. Zanzotto, A. Moschitti, M. Pennacchiotti, and M.T. Pazienza. 2006. Learning tex- tual entailment from examples. In Proc. of the 2nd PASCAL Challenges Workshop on Recognising Textual Entailment, pages 50–55, Venice, Italy.

F. M. Zanzotto, M. Pennacchiotti, and A. Moschitti. 2009. A machine-learning approach to textual entailment recognition. Natural Language Engineering, 15(4):551–582.

Y. Zhang and J. Patrick. 2005. Paraphrase identification by text canonicalization. In Proc. of the Australasian Language Technology Workshop, pages 160–166, Sydney, Australia.

K. Zhang and D. Shasha. 1989. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal of Computing, 18(6):1245–1262.

Y. Zhang and K. Yamamoto. 2005. Paraphrasing spoken Chinese using a paraphrase corpus. Nat. Lang. Engineering, 11(4):417–434.

S. Zhao, H. Wang, T. Liu, and S. Li. 2008. Pivot approach for extracting paraphrase patterns from bilingual corpora. In Proc. of the 46th Annual Meeting of ACL: Hu- man Language Technologies, pages 780–788, Columbus, OH.

S. Zhao, X. Lan, T. Liu, and S. Li. 2009a. Application-driven statistical paraphrase generation. In Proc. of the 47th Annual Meeting of ACL and the 4th Int. Joint Conf. on Nat. Lang. Processing of AFNLP, pages 834–842, Singapore.

S. Zhao, H. Wang, T. Liu, and S. Li. 2009b. Extracting paraphrase patterns from bilingual parallel corpora. Natural Language Engineering, 15(4):503–526.

S. Zhao, H. Wang, X. Lan, and T. Liu. 2010. Leveraging multiple MT engines for paraphrase generation. In Proc. of the 23rd COLING, pages 1326–1334, Beijing, China.

L. Zhou, C.-Y. Lin, and E. Hovy. 2006a. Re-evaluating machine translation results with paraphrase support. In Proc. of the Conf. on Empirical Methods in Natural Language Processing, pages 77–84.

L. Zhou, C-Y. Lin, D. St. Munteanu, and E. Hovy. 2006b. PARAEVAL: Using para- phrases to evaluate summaries automatically. In Proc. of the Human Language Technology Conf. of NAACL, pages 447–454, New York, NY.