Generating Counter Narratives against Online Hate Speech: Data and Strategies Serra Sinem Tekiroglu˘ 1, Yi-Ling Chung1,2, and Marco Guerini1 1Fondazione Bruno Kessler, Via Sommarive 18, Povo, Trento, Italy
[email protected],
[email protected],
[email protected] 2University of Trento, Italy Abstract the conversations (Bielefeldt et al., 2011; Jurgens et al., 2019). In this line of action, some Non- Recently research has started focusing on avoiding undesired effects that come with Govermental Organizations (NGOs) train operators content moderation, such as censorship and to intervene in online hateful conversations by writ- overblocking, when dealing with hatred on- ing counter-narratives. A Counter-Narrative (CN) line. The core idea is to directly intervene in is a non-aggressive response that offers feedback the discussion with textual responses that are through fact-bound arguments and is considered as meant to counter the hate content and prevent the most effective approach to withstand hate mes- it from further spreading. Accordingly, au- sages (Benesch, 2014; Schieb and Preuss, 2016). tomation strategies, such as natural language generation, are beginning to be investigated. To be effective, a CN should follow guidelines simi- 1 Still, they suffer from the lack of sufficient lar to those in ‘Get the Trolls Out’ project , in order amount of quality data and tend to produce to avoid escalating the hatred in the discussion. generic/repetitive responses. Being aware of Still, manual intervention against hate speech the aforementioned limitations, we present a is not scalable. Therefore, data-driven NLG ap- study on how to collect responses to hate ef- proaches are beginning to be investigated to assist fectively, employing large scale unsupervised language models such as GPT-2 for the gen- NGO operators in writing CNs.