Generating Informative Conclusions for Argumentative Texts
Shahbaz Syed†, Khalid Al-Khatib†, Milad Alshomary‡, Henning Wachsmuth‡, Martin Potthast†
†Leipzig University, ‡Paderborn University

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 3482–3493, August 1–6, 2021. ©2021 Association for Computational Linguistics

Abstract

The purpose of an argumentative text is to support a certain conclusion. Yet conclusions are often omitted, with readers expected to infer them. While appropriate when reading an individual text, this rhetorical device limits accessibility when browsing many texts (e.g., on a search engine or on social media). In these scenarios, an explicit conclusion makes for a good candidate summary of an argumentative text. This is especially true if the conclusion is informative, emphasizing specific concepts from the text. With this paper we introduce the task of generating informative conclusions: First, Webis-ConcluGen-21 is compiled, a large-scale corpus of 136,996 samples of argumentative texts and their conclusions. Second, two paradigms for conclusion generation are investigated; one extractive, the other abstractive in nature. The latter exploits argumentative knowledge, which augments the data via control codes, and finetunes the BART model on several subsets of the corpus. Third, insights are provided into the suitability of our corpus for the task, the differences between the two generation paradigms, the trade-off between informativeness and conciseness, and the impact of encoding argumentative knowledge. The corpus, code, and the trained models are publicly available.¹

1 Introduction

A conclusion of an argument is a statement that conveys a stance towards a specific target (Bar-Haim et al., 2017; Alshomary et al., 2020b). Drawing conclusions is an integral part of argumentation, but often various conclusions may be drawn from a set of premises. Consider the following argumentative text on caffeine adapted from the web:²

“Caffeine stimulates the nervous system, signaling fat cells to break down body fat. It also increases epinephrine (adrenaline) levels, a fight-or-flight hormone preparing the body for physical exertion. With free body fat acids as fuel, on average, 12% higher performance is attainable.”

Consider further these alternative conclusions:

1. Caffeine is good.
2. Caffeine improves physical performance.

The first conclusion conveys a pro stance towards the target, caffeine. The second conveys a pro stance towards caffeine, too, but it also emphasizes a specific concept (“physical performance”). The former conclusion is generic, only indicating the stance, while the latter is informative; a distinction also made in text summarization (Section 3).³

Argumentative texts include short arguments, such as forum posts and reviews, as well as long-form texts, such as essays, blogs, and editorials. Most of these typically have an intended conclusion of which the authors seek to persuade their readers.⁴ While the conclusion may already be implied in a given text, authors often choose not to explicitly provide one, either for rhetorical reasons (Habernal and Gurevych, 2015; Al-Khatib et al., 2016), or to encourage critical thinking (Martin et al., 2003). However, when browsing many argumentative texts (e.g., via a search engine or on a social media timeline), having an explicit conclusion helps human readers (and by extension also machines) to quickly process the texts.

In this paper, we introduce the task of generating informative conclusions for argumentative texts, and take the first steps with four key contributions: (1) Adaptation of the notion of informativeness from text summarization as a desired property of a conclusion besides stating a target and the stance towards it. (2) Compilation of Webis-ConcluGen-21, a corpus of 136,996 pairs of argumentative texts and associated conclusions, creating the first large-scale ground truth for conclusion generation. (3) Modeling conclusion generation as an end-to-end task by finetuning a pretrained sequence-to-sequence model, and augmenting the corpus with three types of argumentative knowledge: topic, target, and aspect. (4) Extensive quantitative and qualitative (crowdsourced) evaluation of both the quality of our dataset and the effectiveness of two paradigms for conclusion generation, namely extractive and abstractive approaches.

We present three key findings: (a) Finetuning pretrained language models on our dataset shows strong in-domain performance compared to the extractive approach. (b) Qualitative evaluation shows that the extractive approach generates more informative conclusions, demonstrating a trade-off between conciseness and informativeness. (c) Encoding argumentative knowledge guides the finetuning towards generating argumentative sentences; however, more sophisticated encoding techniques than just using the conventional control codes are needed to generate informative conclusions.

¹ https://github.com/webis-de/ACL-21
² https://www.healthline.com/nutrition/top-13-evidence-based-health-benefits-of-coffee
³ Other works on argumentation use the term specificity to express a similar idea (Durmus et al., 2019; Ke et al., 2019).
⁴ An exception is an argumentative text dedicated to deliberation, which merely surveys the argument landscape on a given topic without trying to influence the reader’s opinion.

2 Related Work

Our work complements and builds on that of Alshomary et al. (2020b), who introduced a conceptual model for conclusion generation, outlining a three-step process: inferring the conclusion’s target from the argument’s premises, inferring the author’s stance towards this target, and generating the conclusion based on these two pieces of information. But Alshomary et al. focused only on the first step of target inference, whereas we model conclusion generation as an end-to-end task.

Conclusion generation can be viewed as a complementary task to summarizing argumentative texts. Previous approaches to the summarization of such texts have been primarily extractive. Egan et al. (2016) proposed summarizing online discussions via “point” extraction, where a point is a verb and its syntactic arguments. Similarly, Bar-Haim et al. (2020) compiled the ArgKP corpus (which we also sample from in Section 4) comprising arguments for a given topic mapped to key points, composing a summary from a large collection of relevant arguments. Wang and Ling (2016) proposed a data-driven approach using sequence-to-sequence models (Sutskever et al., 2014; Bahdanau et al., 2015) for summarizing movie reviews and debate portal arguments from idebate.org. Several argument mining approaches have also been applied to identify the main claim from arguments (Petasis and Karkaletsis, 2016; Daxenberger et al., 2017). Recently, Alshomary et al. (2020a) proposed a graph-based model using PageRank (Page et al., 1999) that extracts the argument’s conclusion and the main supporting reason as an extractive snippet. This model is the core of our extractive summarization approach (Section 5).

A key difference between conclusion generation and general text summarization is the constraint that a conclusion must have a clear stance towards a certain topic. A similar constraint applies to high-quality summaries of long-form argumentative texts such as editorials (Syed et al., 2020), where the persuasiveness of the editorial should be preserved alongside its thesis. Therefore, existing summarization corpora (although large-scale) are unsuitable for studying conclusion generation. A majority of them contain only non-argumentative texts (e.g., news reports), which are better suited to general-purpose summarization (Kryscinski et al., 2019). Moreover, intrinsic evaluation of summarization corpora has revealed a lower-quality and/or inconsistent ground truth, rendering them partially unfit for their intended purpose (Bommasani and Cardie, 2020). To fill this gap, we compile Webis-ConcluGen-21, a large-scale corpus of argumentative texts and their conclusions on diverse topics.

Pretrained language models have significantly advanced the state of the art in neural text summarization (Liu and Lapata, 2019; Zhang et al., 2019a; Rothe et al., 2020; Huang et al., 2020). However, they have been applied to the domain of argumentation only recently, specifically for argument generation. Gretz et al. (2020) proposed a pipeline based on GPT-2 (Radford et al., 2019) for generating coherent claims for a given debate topic. A more controlled approach for argument generation was developed by Schiller et al. (2020), which performs argument generation with fine-grained control of topic, aspect (core reasoning), and stance. Conclusion generation can be viewed as supplementing argument generation. Ideally, given a conclusion, an argument can be generated constrained by the conclusion’s target and stance. To the best of our knowledge, studies investigating pretrained language models for end-to-end conclusion generation do not exist. Besides providing a suitable corpus, we analyze the impact of encoding argumentative knowledge in pretrained language models and assess the popular method of control codes (Keskar et al., 2019; Cachola et al., 2020) for encoding

towards a topic (e.g., “Caffeine is good.”), informative conclusions also discuss specific concepts from (or implied by) the argumentative text (e.g., “Caffeine improves physical performance.”). Concepts