arXiv:2011.03588v1 [cs.CL] 6 Nov 2020
Hostility Detection Dataset in Hindi

Mohit Bhardwaj†, Md Shad Akhtar†, Asif Ekbal‡, Amitava Das⋆, Tanmoy Chakraborty†
†IIIT Delhi, India. ‡IIT Patna, India. ⋆Wipro Research, India.
fmohit19014,tanmoy,[email protected], [email protected], [email protected]

Abstract

In this paper, we present a novel hostility detection dataset in the Hindi language. We collect and manually annotate ∼8200 online posts. The annotated dataset covers four hostility dimensions: fake news, hate speech, offensive, and defamation posts, along with a non-hostile label. The hostile posts are also considered for multi-label tags due to a significant overlap among the hostile classes. We release this dataset as part of the CONSTRAINT-2021 shared task on hostile post detection.

1 Introduction

The COVID-19 pandemic has changed our lives forever, both online and offline. As the physical world went into lockdown, the online world came closer than ever. Since people are confined to their homes, they are spending far more time on social media, chat rooms, communication apps, and gaming servers, which can have serious implications for the mental health of an individual. According to a recent survey¹, there has been a 900% increase in hate speech towards China and its people on Twitter, a 200% increase in traffic on sites that promote hate speech against Asians, a 70% increase in hate speech among teens and kids online, and a 40% increase in toxicity levels in the gaming community. Similar trends have been observed in non-English languages, and since the percentage of non-English tweets in India² has risen up to 50%, early detection of hostile texts in low-resource languages like Hindi is of paramount importance.

A significant number of online social media users post harmful content without realising that they are crossing the line defined by the freedom of speech. As evident as it sounds, fake news, hate speech, offensive remarks, etc., are extremely harmful for any civilised society, and call for adequate and appropriate guidelines to prevent or curb such activities. The foremost task in neutralising such activities is hostile post detection, and many works have been carried out to address the issue in English (Waseem and Hovy 2016; Waseem et al. 2017; Nobata et al. 2016).

Despite Hindi being the third most spoken language in the world, and a significant presence of Hindi content on social media platforms, to our surprise, we were not able to find any significant dataset on fake news or hate speech detection in Hindi. A survey of the literature suggests a few works related to hostile post detection in Hindi, such as (Kar et al. 2020; Jha et al. 2020; Safi Samghabadi et al. 2020); however, there are two basic issues with these works - either the number of samples in the dataset is not adequate or they cater to a specific dimension of hostility only. In this paper, we present our manually annotated dataset for hostile post detection in Hindi. We collect ∼8200 online social media posts and annotate them as hostile and non-hostile posts. Furthermore, we identify four hostility dimensions for each hostile post: fake, defamation, hate, and offensive. Though some of these hostile dimensions sound similar at the abstract level (e.g., hate and offensive), their definitions are different, and we define them below following (Mathur et al. 2018a) and (Davidson et al. 2017).

• Fake News: A claim or information that is verified to be not true. We have included tweets belonging to the click-bait and satire/parody categories as fake news as well.

• Hate Speech: A post targeting a specific group of people based on their ethnicity, religious beliefs, geographical belonging, race, etc., with malicious intentions of spreading hate or encouraging violence.

• Offensive: A post containing profanity, impolite, rude, or vulgar language to insult a targeted individual or group.

• Defamation: Misinformation regarding an individual or group, which is destroying their reputation publicly.

• Non-Hostile: A post with no hostility.

The dataset development is part of the CONSTRAINT-2021 shared task (con 2021). The CONSTRAINT-2021 workshop emphasizes hostility detection on three major points, i.e., low-resource regional languages, detection in emergency situations, and early detection. Currently, the train and validation sets are available with the shared task³, and we will release the test set at the end of the workshop.

Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
¹ https://l1ght.com/Toxicity during coronavirus Report-L1ght.pdf
² https://bit.ly/3mnXoKM
³ https://constraint-shared-task-2021.github.io/

Figure 1: Word clouds. (a) Hostile Word Cloud. (b) Non-Hostile Word Cloud.

2 Related work

Waseem & Hovy (Waseem and Hovy 2016) considered annotations for hate speech, but they did not consider other dimensions of hostile text like offensive or bullying. In another work, Waseem et al. (Waseem et al. 2017) discuss the user agreement and consensus in annotating bullying, harassment, offensive, and hate speech. They showed that annotators can identify the victim of bullying quite convincingly, whereas there is very low consensus in annotations of harassment, offensive, and hate speech. This may be partially because hostility can be generalized, directed, implicit, or explicit. Wijesiriwardene et al. (Wijesiriwardene et al. 2020) provide a dataset of toxicity (harassment, offensive language, hate speech) on Twitter in English.

An example of implicit hostility in Hindi is to call someone ‘meetha’, which literally means sweet in Hindi; however, the intended meaning in a hostile post could be ‘fa$$ot’ - a derogatory term towards the LGBT community. Among other notable works on hostility detection, Davidson et al. (Davidson et al. 2017) studied hate speech detection for English. They argued that some words may reflect hate in one region; however, the same word can be used as a frequent slang term. For example, in English, the term ‘dog’ does not reveal any hate or offense, but it (ku##a) is commonly used as a derogatory term in Hindi.

Considering the severity of the problem, some effort has been made for non-English languages as well, such as Arabic (Haddad et al. 2020), Bengali (Hossain et al. 2020), Hindi (Jha et al. 2020), etc. Samghabadi et al. (Safi Samghabadi et al. 2020) addressed the problem of aggression and misogyny detection in English, Hindi, and Bengali, whereas Jha et al. (Jha et al. 2020) worked on keyword-based (swear words) offensive text detection in Hindi. There are also a few attempts at Hindi-English code-mixed hate speech (Bohra et al. 2018) and offensive post (Mathur et al. 2018b) detection. Recently, Kar et al. (Kar et al. 2020) developed a multi-lingual COVID-19 rumour detection dataset in English, Hindi, and Bangla; however, the dataset is significantly small and is limited to COVID-related text only.

3 Data Development

During the development of the dataset, we observe that some of the posts have overlap among the hostility dimensions; therefore, we adopt the idea of multi-label tagging for each post. Figure 2 shows the class-wise overlaps among hostile dimensions in the form of a Venn diagram⁴. Although it reveals the relationships among the classes for the majority of cases, it is inadequate to show the intersection between the fake and hate classes in 2D.

Figure 2: Venn Diagram of Multi-Label Hindi hostility dataset. Notations: [F - Fake], [O - Offensive], [H - Hate], [D - Defamation], [NH - Non-hostile].

Some of the examples from the dataset are presented in Table 1. For example, the first sentence is a factually verified fake news, and lies in the offensive category due to the use of vulgar language, but it has an implicit hatred against a religious minority, which might even lead to violence between two communities in the worst scenarios. Similarly, in the fourth example, derogatory and vulgar language is used towards a famous advocate alongside spreading misinformation to defame him.

Table 1: A few annotated examples from the dataset.

Data Collection

We collect ∼8200 hostile and non-hostile texts from various social media platforms like Twitter, Facebook, WhatsApp, etc. We follow different strategies to collect data for each category.

• For fake news collection, we refer to some of India's leading fact-checking websites like BoomLive⁵, Dainik Bhaskar⁶, etc., and read numerous articles in Hindi. This process helps us identify the topics of the fake news. Subsequently, we compile a topic-wise keyword list for each fake news. Next, we curate online social media platforms such as Twitter, Instagram, etc., for the collection of posts.

• For hate speech collection, at first, we target the tweets encouraging violence against minorities based on their race, religious beliefs, etc. Following this process, we analyse the timelines of users with significant hate-related posts. Additionally, we analyse users who liked or commented in support of the hate speech and scan their timelines for additional hate-related posts.

• For offensive posts, we employ the list of top swear words used in the Hindi language as determined by Jha et al. (Jha et al. 2020). For each swear word, we query the Twitter API⁷ to extract (offensive) tweets.

⁴ https://www.meta-chart.com/venn
⁵ https://hindi.boomlive.in/fake-news
⁶ https://www.bhaskar.com/no-fake-news/

        Hostile posts                                Non-hostile
        Fake    Hate    Offense   Defame   Total∗
Train   1144    792     742       564      2678      3050