Automatic Detection of Contradictions in Texts
Total Page:16
File Type:pdf, Size:1020Kb
Automatic Detection of Contradictions in Texts Inaugural-Dissertation zur Erlangung des Doktorgrades der Philosophie des Fachbereiches 05 – Sprache, Literatur, Kultur der Justus-Liebig-Universität Gießen vorgelegt von Natali Karlova-Bourbonus, M.A. Aus Frankfurt am Main 2018 Vorsitzender: Herr Prof. Dr. Thomas Möbius Erstgutachter: Herr Prof. Dr. Henning Lobin Zweitgutachter: Herr Prof. Dr. Helmut Feilke For my little prince Philipp Acknowledgments Five years are now passed since I began my doctoral thesis on contradictions in news texts and at this point, I would like to thank everybody who helped me in finalizing it. First of all, I would like to express my special thanks of gratitude to my Doktorvater, Prof. Dr. Henning Lobin, who gave me a priceless opportunity to collect new experience in teach- ing and researching and to learn how exciting natural language processing can be. Dear Henning, thank you for supporting me the whole, often thorny, way to the completed mon- ograph from the first half-baked ideas till the last dot on the paper. I would also like to express my special thanks of gratitude to my second supervisor, Prof. Dr. Helmut Feilke. Participation in his GCSC-Kolloquium in the year 2014 led to a long- desired breakthrough in my research. I am grateful for his sharing of precious ideas with me, which laid the groundwork for my research. While reading the present monograph you will come across the statement several times that finding contradictions is a challenging task for a human. I would like to thank all the students at the University of Giessen who did not let this challenge discourage them and who, in the years 2015 and 2016, took part on my surveys conducted within the framework of the present study. Their patience, curiosity, and ability to explore latent things in the texts are admirable. Last but not least, I would like to thank the most important people in my life – my family. My parents, who always let me go my way, who never judged me for my decisions, and patiently and gratuitous supported me in seemingly inextricable moments. My husband Nico, who inspired and motivated me to make possible what, in the beginning, seemed to be impossi- ble. And my brother Denis for the exciting conversations on the processing of natural lan- guages. Table of Contents I Table of Contents Page List of Abbreviations ...................................................................................................... V List of Figures .............................................................................................................. VIII List of Tables.................................................................................................................. IX 1 Introduction ............................................................................................................. 11 1.1 Statement of Problem and Motivation .............................................................. 11 1.2 Subject of the Study ......................................................................................... 16 1.3 Research Questions and Objectives ................................................................ 17 1.4 Structure of the Thesis ..................................................................................... 18 1.5 How to Read This Thesis ................................................................................. 20 2 State of the Art ........................................................................................................ 21 2.1 Methods and Systems ..................................................................................... 21 2.2 Corpora of Contradictions ................................................................................ 32 2.2.1 FraCas Inference Data Suite ................................................................ 32 2.2.2 RTE Datasets and Their Modifications .................................................. 34 2.2.3 Stanford Corpus of Real-Life Contradictions ......................................... 37 2.2.4 SNLI Corpus ......................................................................................... 38 2.3 Summary ........................................................................................................ 39 3 Contradiction in Logic and Language ................................................................... 41 3.1 Contradiction as Concept of Logic ................................................................... 41 3.1.1 Contradiction in Traditional (Aristotelian) Logic ..................................... 41 3.1.2 Contradiction and Contrariety ............................................................... 43 3.1.3 Contradiction and Quantification ........................................................... 43 3.1.4 Related Terms ...................................................................................... 45 3.2 Negation in Natural Languages ........................................................................ 46 3.2.1 Typology of Negation ............................................................................ 46 3.2.2 Negation and Word Order ..................................................................... 52 3.2.3 Scope of Negation ................................................................................ 52 3.2.4 Double and Multiple Negation ............................................................... 55 3.3 Problematic Issues about Contradiction ........................................................... 59 3.3.1 Contradiction and Presupposition ......................................................... 59 3.3.2 Contradiction and Modality ................................................................... 61 3.3.3 Contradiction and Vagueness ............................................................... 64 3.3.4 The Concept of Fake Contradiction ...................................................... 66 Table of Contents II 3.4 Classification of Contradictions ........................................................................ 67 3.4.1 Classification of Svintsov ...................................................................... 67 3.4.2 Classification of Mučnik ........................................................................ 69 3.4.3 Typology according to Contradictory Element ....................................... 71 3.4.3.1 In Educational Psychology ...................................................... 71 3.4.3.2 In Computational Linguistics ................................................... 73 3.5 Causes and Functions of Contradictions .......................................................... 76 3.5.1 Causes ................................................................................................. 76 3.5.2 Functions .............................................................................................. 79 3.6 Summary ........................................................................................................ 81 4 The Characteristics of News Texts ........................................................................ 83 4.1 Introductory Notions ......................................................................................... 83 4.1.1 News as Text Genre ............................................................................. 83 4.1.2 Online Newspaper vs. Printed Newspaper............................................ 86 4.1.3 Values of Selection and Production of News Articles ............................ 88 4.2 Structure and Elements of News Articles ......................................................... 91 4.3 Language of News Article ................................................................................ 95 4.3.1 Reported Speech.................................................................................. 95 4.3.1.1 News as an “embedded talk” .................................................. 95 4.3.1.2 Types of Reported Speech ..................................................... 96 4.3.1.3 Reporting Expressions ............................................................ 98 4.3.2 News Actors Labeling ........................................................................... 99 4.3.3 Event Categories .................................................................................. 99 4.3.4 Time and Place Mentions ................................................................... 100 4.3.5 The Use of Numbers and Figures ....................................................... 101 4.3.6 Other Characteristics .......................................................................... 102 4.4 Summary ...................................................................................................... 102 5 Typology Construction: Types of Contradictions in News Texts ...................... 104 5.1 Compilation of News Text Contradiction Corpus ............................................ 104 5.1.1 Data Collection ................................................................................... 104 5.1.1.1 Collection of News Texts ....................................................... 104 5.1.1.2 Survey 1: Finding the Contradictions..................................... 106 5.1.1.3 Results and Evaluation ......................................................... 107 5.1.2 Data Validation and Filtering ............................................................... 108 5.1.2.1 Survey 2: Contradiction or Not .............................................. 108 5.1.2.2 Results and Evaluation