Emotion Classification in a Resource Constrained Language Using Transformer-based Approach

Avishek Das, Omar Sharif, Mohammed Moshiul Hoque and Iqbal H. Sarker
Department of Computer Science and Engineering
Chittagong University of Engineering & Technology
Chittagong-4349, Bangladesh
[email protected], {omar.sharif,moshiul_240,iqbal}@cuet.ac.bd

Abstract

Although research on emotion classification has progressed significantly in high-resource languages, it is still in its infancy for resource-constrained languages like Bengali. Moreover, the unavailability of necessary language processing tools and the deficiency of benchmark corpora make the emotion classification task in Bengali more challenging and complicated. This work proposes a transformer-based technique to classify Bengali text into one of six basic emotions: anger, fear, disgust, sadness, joy, and surprise. A Bengali emotion corpus consisting of 6243 texts is developed for the classification task. Experimentation is carried out using various machine learning (LR, RF, MNB, SVM), deep neural network (CNN, BiLSTM, CNN+BiLSTM) and transformer (Bangla-BERT, m-BERT, XLM-R) based approaches. Experimental outcomes indicate that XLM-R outdoes all other techniques by achieving the highest weighted f1-score of 69.73% on the test data. The dataset is publicly available at https://github.com/omar-sharif03/NAACL-SRW-2021.

1 Introduction

Classification of emotion in text signifies the task of automatically attributing an emotion category to a textual document selected from a set of predetermined categories. With the growing number of users on virtual platforms generating online content at a fast pace, interpreting emotion or sentiment in online content is vital for consumers, enterprises, business leaders, and other parties concerned. Ekman (Ekman, 1993) defined six basic emotions: happiness, fear, anger, sadness, surprise, and disgust based on facial features. These primary types of emotions can also be extracted from text expressions (Alswaidan and Menai, 2020).

The availability of vast amounts of online data and the advancement of computational processes have accelerated the development of emotion classification research in high-resource languages such as English, Arabic, Chinese, and French (Plaza del Arco et al., 2020). However, there is no notable progress in low-resource languages such as Bengali, Tamil and Turkish. The proliferation of Internet and digital technology usage produces enormous textual data in the Bengali language. The analysis of these massive amounts of data to extract underlying emotions is a challenging research issue in the realm of Bengali language processing (BLP). The complexity arises due to various limitations, such as the lack of BLP tools, the scarcity of benchmark corpora, complicated language structure, and limited resources. By considering the constraints of emotion classification in the Bengali language, this work aims to contribute the following:

• Develop a Bengali emotion corpus consisting of 6243 text documents with manual annotation to classify each text into one of six emotion classes: anger, disgust, fear, joy, sadness, surprise.

• Investigate the performance of various ML, DNN and transformer-based approaches on the corpus.

• Propose a benchmark system to classify emotion in Bengali text with experimental validation on the corpus.
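As an illustration of the transformer-based setup named in the abstract (not the authors' exact configuration), the following minimal Python sketch loads an XLM-R checkpoint with a six-class classification head via the Hugging Face transformers library; the checkpoint name, sequence length and sample sentence are assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# "xlm-roberta-base" is an assumed checkpoint; the paper names XLM-R but not the exact variant.
MODEL_NAME = "xlm-roberta-base"
LABELS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS)
)

# The classification head is randomly initialised here; it would need fine-tuning
# on the BEmoC training split before its predictions are meaningful.
sample = "একটি নমুনা বাংলা বাক্য"  # placeholder Bengali text
inputs = tokenizer(sample, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
print(LABELS[int(logits.argmax(dim=-1))])
```

The same loading pattern applies to the other transformer baselines (Bangla-BERT, m-BERT) by swapping the checkpoint name.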
2 Related Work

Substantial research activities have been carried out on emotion analysis in high-resource languages like English, Arabic, and Chinese (Alswaidan and Menai, 2020). Multi-label, multi-target emotion detection on Arabic tweets was accomplished using decision trees, random forest, and KNN, where random forest provided the highest f1-score of 82.6% (Alzu'bi et al., 2019). Lai et al. (2020) proposed a graph convolutional network architecture for emotion classification from Chinese microblogs, and their system achieved an F-measure of 82.32%. Recently, a few works have employed transformer-based models (i.e., BERT) to analyse emotion in texts. Huang et al. (2019) and Al-Omari et al. (2020) used a pre-trained BERT for embedding on top of an LSTM/BiLSTM to obtain improved f1-scores of 76.66% and 74.78%, respectively.

Although emotion analysis in limited-resource languages like Bengali is at a preliminary stage, a few studies have already been conducted using ML and DNN methods. Irtiza Tripto and Eunus Ali (2018) proposed an LSTM-based approach to classify multi-label emotions from Bengali and English sentences. This system considered only YouTube comments and achieved 59.23% accuracy. Another work on emotion classification in Bengali text was carried out by Azmin and Dhar (2019) concerning three emotion labels (i.e., happiness, sadness and anger). They used Multinomial Naïve Bayes, which outperformed other algorithms with an accuracy of 78.6%. Pal and Karn (2020) developed a logistic regression-based technique to classify four emotions (joy, anger, sorrow, suspense) in Bengali text and achieved 73% accuracy. Das and Bandyopadhyay (2009) conducted a study to identify emotions in Bengali blog texts. Their scheme attained 56.45% accuracy using a conditional random field. A recent work used SVM to classify six raw emotions on 1200 Bengali texts and obtained 73% accuracy (Ruposh and Hoque, 2019).

3 BEmoC: Bengali Emotion Corpus

Due to the unavailability of a standard corpus, we developed a corpus (hereafter called 'BEmoC') to classify emotion in Bengali text. The development procedure follows the guidelines stated in (Dash and Ramamoorthy, 2019).

3.1 Data Collection and Preprocessing

Five human crawlers were assigned to accumulate data from various online/offline sources. They manually collected 6700 text documents over three months (September 10, 2020 to December 11, 2020). The crawlers accumulated data selectively, i.e., when a crawler found a text that supported the definition of any of the six emotion classes according to Ekman (1993), the content was collected; otherwise it was ignored. The raw accumulated data needed the following preprocessing before annotation:

• Removal of non-Bengali words, punctuation, emoticons and duplicate data.

• Discarding of texts shorter than three words to get an unerring emotional context.

After preprocessing, the corpus holds 6523 text documents. The processed texts are eligible for manual annotation. The details of the preprocessing modules can be found in the linked repository¹.

¹https://github.com/omar-sharif03/NAACL-SRW-2021/tree/main/Code%20Snippets
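A minimal sketch of the two preprocessing rules listed above, assuming a hypothetical CSV layout with a single 'text' column; the authors' actual preprocessing modules are in the linked repository.

```python
import re
import pandas as pd

# Hypothetical file and column names; adjust to the real corpus layout.
df = pd.read_csv("raw_texts.csv")  # assumed: one raw document per row in a 'text' column

NON_BENGALI = re.compile(r"[^\u0980-\u09FF\s]")  # anything outside the Bengali Unicode block

def clean(text: str) -> str:
    # Drop non-Bengali words, punctuation and emoticons, then normalise whitespace.
    text = NON_BENGALI.sub(" ", str(text))
    return re.sub(r"\s+", " ", text).strip()

df["text"] = df["text"].apply(clean)
df = df.drop_duplicates(subset="text")              # remove duplicate data
df = df[df["text"].str.split().str.len() >= 3]      # discard texts shorter than three words
df.reset_index(drop=True).to_csv("preprocessed_texts.csv", index=False)
```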
3.2 Data Annotation and Quality

Five postgraduate students working on BLP were assigned for the initial annotation. To choose the initial label, a majority voting technique was applied (Magatti et al., 2009). The initial labels were scrutinized by an expert with several years of research expertise in BLP. The expert corrected the labelling if any initial annotation had been done inappropriately, and discarded 163 texts with neutral emotion and 117 texts with mixed emotions for the intelligibility of this research. To minimize bias during annotation, the expert finalized the labels through discussions and deliberations with the annotators (Sharif and Hoque, 2021). We evaluated the inter-annotator agreement to ensure the quality of the annotation using coding reliability (Krippendorff, 2011) and Cohen's kappa (Cohen, 1960) scores. An inter-coder reliability of 93.1% with a Cohen's kappa score of 0.91 reflects the quality of the corpus.

3.3 Data Statistics

BEmoC contains a total of 6243 text documents after the preprocessing and annotation process. The amount of data included in BEmoC varies with the source. For example, among online sources, Facebook contributed the highest amount (2796 texts), whereas YouTube (610 texts), blogs (483 texts), and news portals (270 texts) contributed smaller amounts. Offline sources contributed a total of 2084 texts, including storybooks (680 texts), novels (668 texts), and conversations (736 texts). The data is partitioned into a train set (4994 texts), a validation set (624 texts) and a test set (625 texts) to evaluate the models. Table 1 presents the amount of data in each class according to the train-validation-test split.

Class     Train  Validation  Test
Anger     621    67          71
Disgust   1233   155         165
Fear      700    89          83
Joy       908    120         114
Sadness   942    129         119
Surprise  590    64          73

Table 1: Number of instances in the train, validation, and test sets

Since the classifier models learn from the training set instances to obtain more insights, we further analyzed this set. Table 2 shows several statistics of the training set.

Class     Total words  Unique words  Avg. words per text
Anger     14914        5852          24.02
Disgust   27192        7212          22.35
Fear      14766        5072          21.09
Joy       20885        7346          23.40
Sadness   22727        7398          24.13
Surprise  13833        5675          23.45
Total     114317       38555         -

Table 2: Statistics of the train set of BEmoC

The sadness class contains the most unique words (7398), whereas the fear class contains the least (5072).

[Figure 1: Corpus distribution concerning number of texts vs. length]

As Figure 1 indicates, the classes seem to have an almost similar number of texts across all length distributions.

For quantitative analysis, the Jaccard similarity among the classes has been computed. We used the 200 most frequent words from each emotion class, and the similarity values are reported in Table 3. The Anger-Disgust and Joy-Surprise pairs hold the highest similarities of 0.58 and 0.51, respectively. These scores indicate that more than 50% of the frequent words are common in these pairs of classes. On the other hand, the Joy-Fear pair has the least similarity index, which clarifies that this pair's frequent words are more distinct than those of the other classes. These similarity issues can substantially affect the emotion classification task. Some sample instances of BEmoC are shown in Table 9 (Appendix B).

      C1    C2    C3    C4    C5    C6
C1    1.00  0.58  0.40  0.43  0.45  0.47
C2    -     1.00  0.41  0.45  0.47  0.44
C3    -     -     1.00  0.37  0.45  0.46
C4    -     -     -     1.00  0.47  0.51
C5    -     -     -     -     1.00  0.48
C6    -     -     -     -     -     1.00

Table 3: Jaccard similarity between pairs of classes (C1: Anger, C2: Disgust, C3: Fear, C4: Joy, C5: Sadness, C6: Surprise)
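The pairwise similarity analysis described above can be reproduced, under assumed 'text' and 'label' column names, with a short sketch that builds each class's 200 most frequent words and computes the Jaccard index for every class pair:

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical file and column names; adjust to the actual corpus layout.
df = pd.read_csv("preprocessed_texts.csv")

def top_words(texts, k=200):
    """Return the set of the k most frequent words over a collection of texts."""
    counts = Counter(word for text in texts for word in str(text).split())
    return {word for word, _ in counts.most_common(k)}

frequent = {label: top_words(group["text"]) for label, group in df.groupby("label")}

# Jaccard similarity = |intersection| / |union| of the two frequent-word sets.
for c1, c2 in combinations(sorted(frequent), 2):
    jaccard = len(frequent[c1] & frequent[c2]) / len(frequent[c1] | frequent[c2])
    print(f"{c1}-{c2}: {jaccard:.2f}")
```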