
WME 3.0: An Enhanced and Validated Lexicon of Medical Concepts

Anupam Mondal[1], Dipankar Das[1], Erik Cambria[2], Sivaji Bandyopadhyay[1]
[1] Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
[2] School of Computer Science and Engineering, Nanyang Technological University, Singapore
[email protected], [email protected], [email protected], [email protected]

Abstract

Information extraction in the medical domain is laborious and time-consuming due to the insufficient number of domain-specific lexicons and the lack of involvement of domain experts such as doctors and medical practitioners. Thus, in the present work, we are motivated to design a new lexicon, WME 3.0 (WordNet of Medical Events), which contains over 10,000 medical concepts along with their part of speech, gloss (descriptive explanations), polarity score, sentiment, similar sentiment words, category, affinity score, and gravity score features. In addition, manual annotators help to validate the overall as well as the individual category level of the medical concepts of WME 3.0 using Cohen's Kappa agreement metric. The agreement scores indicate almost correct identification of the medical concepts and their assigned features in WME 3.0.

1 Introduction

In the clinical domain, the representation of a lexical resource is treated as a crucial and contributory task because it must handle several challenges. These challenges are the identification of medical concepts, their categories and relations, the disambiguation of polarities, and the recognition of semantics, while the scarcity of structured clinical texts doubles them. In the last few years, several researchers have been involved in developing domain-specific lexicons such as Medical WordNet and UMLS (Unified Medical Language System) to cope with such challenges. These lexicons help to bridge the gap between medical experts, such as doctors or medical practitioners, and non-experts, such as patients (Cambria et al., 2010a; Cambria et al., 2010b).

However, medical text is in general unstructured, since doctors do not like to fill forms and prefer free-form notes of their observations. Hence, designing a lexicon is difficult due to the lack of any prior knowledge of medical terms and contexts. Therefore, we are motivated to enhance a medical lexicon, namely WordNet of Medical Events (WME 2.0), which helps to identify medical concepts and their features. In order to enrich this lexicon, we have employed various well-known resources like the conventional WordNet, SentiWordNet (Esuli and Sebastiani, 2006), SenticNet (Cambria et al., 2016), Bing Liu's subjective list (Liu, 2012), and Taboada's adjective list (Taboada et al., 2011), as well as a preprocessed English medical dictionary[1], on top of the WME 1.0 and WME 2.0 lexicons (Mondal et al., 2015; Mondal et al., 2016). WME 1.0 contains 6,415 medical concepts and their glosses, POS, polarity scores, and sentiment. Thereafter, Mondal et al. (2016) enhanced WME 1.0 by adding a few more features, namely affinity score, gravity score, and SSW, to the medical concepts and presented it as WME 2.0. The affinity and gravity scores capture the hidden links between a pair of medical concepts and between a concept and its various sources of glosses, respectively. The SSW of a medical concept refers to its similar sentiment words, which share a common sentiment property.

In the current research, we have focused on enriching WME 2.0 with more medical concepts and on including an additional feature, i.e., the medical category. In order to develop such an updated version of WME, namely WME 3.0, we have taken the help of WME 1.0 and WME 2.0. We have also noticed that the previous versions of WME are unable to extract knowledge-based information such as the category of the medical concepts, and that their coverage is also lower.

Therefore, we have enhanced the number of medical concepts as well as added the category feature on top of WME 2.0. The current version, WME 3.0, contains 10,186 medical concepts and their category, POS, gloss, sentiment, polarity score, SSW, affinity score, and gravity score. For example, the WME 3.0 lexicon presents the properties of the medical concept amnesia as category (disease), POS (noun), gloss (loss of memory sometimes including the memory of personal identity due to brain injury, shock, fatigue, repression, or illness or sometimes induced by anesthesia.), sentiment (negative), polarity score (-0.375), SSW (memory loss, blackout, fugue, stupor), affinity score (0.429), and gravity score (0.170).

Moreover, to enhance and validate the lexicon with the newly added medical concepts and categories, we summarize our contributions as follows.

(a) Enriching the number of medical concepts in the existing lexicon, WME 2.0: In order to meet this goal, we have employed a preprocessed English medical dictionary[2] and various well-defined lexicons such as SentiWordNet, SenticNet, and MedicineNet. They helped to enhance the number of medical concepts of the proposed lexicon.

(b) Overall validation of the current lexicon: To resolve this issue, we have taken the help of two manual annotators who are medical practitioners. The annotators provided agreement judgments that were processed using Cohen's Kappa, and the obtained κ score assists in validating the overall lexicon as well as the individual features of WME 3.0 (Viera et al., 2005).

(c) Evaluating the individual features of the medical concepts: In order to extract the subjective and knowledge-based features, we have applied our evaluation scripts on the mentioned resources. The scripts assist in identifying the affinity and gravity scores as feature values for the concepts. Also, the resources are used to assign the SSW as semantics and the glosses for the concepts. On the other hand, a supervised classifier helps to add the category feature to the proposed lexicon.

The remainder of the paper is organized as follows: Section 2 presents the related work on building a medical lexicon; Sections 3 and 4 describe the previous versions of WME, i.e., WME 1.0 and WME 2.0, and the development steps of WME 3.0; Section 5 discusses the validation process of the proposed lexicon; finally, Section 6 presents the concluding remarks and future scope of the research.

2 Background

Biomedical information extraction is treated as one of the most challenging research tasks, as it deals with available medical corpora that are either unstructured or semi-structured. Hence, a domain-specific lexicon becomes an essential component to convert an unstructured corpus into a structured one (Borthwick et al., 1998). It also helps in extracting the subjective and conceptual information related to medical concepts from the corpus. Various researchers have tried to build ontologies and lexicons such as UMLS, SNOMED-CT (Systematized Nomenclature of Medicine-Clinical Terms), MWN (Medical WordNet), SentiHealth, and WordNet of Medical Events (WME 1.0 and WME 2.0) in the healthcare domain (Miller and Fellbaum, 1998; Smith and Fellbaum, 2004; Asghar et al., 2016; Asghar et al., 2014). UMLS helps to enhance access to the biomedical literature by facilitating the development of computer systems that understand biomedical language (Bodenreider, 2004). SNOMED-CT is a standardized, multilingual vocabulary that contains clinical terminologies and assists in exchanging electronic healthcare information among physicians (Donnelly, 2006).

Furthermore, Fellbaum and Smith (2004) proposed Medical WordNet (MWN) with two sub-networks, Medical FactNet (MFN) and Medical BeliefNet (MBN), for justifying consumer health. MWN follows the formal architecture of the Princeton WordNet (Fellbaum, 1998). MFN aids in extracting and understanding generic medical information for non-expert groups, whereas MBN identifies the fraction of beliefs about medical phenomena (Smith and Fellbaum, 2004). Their primary motivation was to develop a network for a medical information retrieval system with a visualization effect. The SentiHealth lexicon was developed to identify the sentiment of medical concepts (Asghar et al., 2016; Asghar et al., 2014). The WME 1.0 and WME 2.0 lexicons were designed to extract medical concepts and their related linguistic and sentiment features from the corpus (Mondal et al., 2015; Mondal et al., 2016).

These ontologies and lexicons assist in identifying medical concepts and their sentiments from the corpus but are unable to provide the complete knowledge-based information of the concepts. Hence, in the current work, we are motivated to design a full-fledged lexicon in healthcare which provides the linguistic, sentiment, and knowledge-based features together for the medical concepts.

3 Attempts for WordNet of Medical Events

In healthcare, a domain-specific lexicon is required for identifying the conceptual and knowledge-based information, such as the category, gloss, semantics, and sentiment of the medical concepts, from clinical corpora (Cambria, 2016). We have borrowed this knowledge from a domain-specific lexicon, namely WordNet of Medical Events (WME), in its two different versions, WME 1.0 and WME 2.0. These versions are distinguished according to the versatility and variety of their medical concepts and features.

3.1 WME 1.0

WME 1.0 contains 6,415 medical concepts and their linguistic features, such as gloss, parts of speech (POS), sentiment, and polarity score (Mondal et al., 2015). The gloss and POS represent the descriptive definition and linguistic nature of the medical concepts, whereas the sentiment and polarity score refer to the classes positive, negative, and neutral and their corresponding strength (+1) and weakness (-1). The resource was prepared by employing the trial and training datasets of SemEval-2015 Task-6[3], which initially contained only 2,479 medical concepts. Thereafter, the extracted concepts were updated using WordNet and the preprocessed English medical dictionary mentioned earlier for enriching the number of concepts and identifying their glosses and POS. Sentiment and polarity scores were added afterwards using sentiment lexicons such as SentiWordNet[4], SenticNet[5], Bing Liu's subjective list[6], and Taboada's adjective list[7] (Cambria et al., 2016; Taboada et al., 2011; Esuli and Sebastiani, 2006). For example, the medical concept abnormality appears in WME 1.0 with its gloss, POS as noun, negative sentiment, and a polarity score of -0.25.

3.2 WME 2.0

The next version of WME, i.e., WME 2.0, extracts more semantic features of the medical concepts (Mondal et al., 2016), which are added to the existing features of WME 1.0. While WME 2.0 was updated with the affinity score, gravity score, and SSW, the number of concepts remained the same; only the features of each concept were extended (Mondal et al., 2016).

The affinity score indicates the strength of the link between a medical concept and its corresponding SSWs by assigning a probability score. The SSWs of a medical concept are the similar sentiment words shared through their common sentiment property. An affinity score of 0 indicates no relation, whereas 1 suggests a strong relationship between a pair of concepts. On the other hand, the gravity score helps to extract the sentiment relevance between a concept and its glosses. It ranges from -1 to 1 including 0, where -1 suggests no relation, 0 describes a neutral situation of either the concept or the gloss without sentiment, and 1 indicates a strong relation, either positive or negative. It is used to establish the knowledge-based relevance between a concept and its gloss. In order to extract these features, the authors used WordNet, SentiWordNet, SenticNet, and a preprocessed English medical dictionary. Figure 1 shows the presentation of the WME 2.0 lexicon for the medical concept abnormality.

Figure 1: An example of the assigned features of the medical concept abnormality under the WME 2.0 lexicon.

In the present research, we have enriched the number of medical concepts, added the category feature to the WME 2.0 lexicon, and present the enhanced version as WME 3.0. The following section discusses the steps of building WME 3.0.

4 Development of WME 3.0

The large number of daily produced medical corpora and their adaptable natures make it difficult to build a full-fledged medical lexicon in the healthcare domain. In order to resolve this issue, we have proposed a new version of WordNet of Medical Events, namely WME 3.0. It is observed that WME 3.0 helps to extract more medical concepts and features from an unstructured corpus with respect to the previous version of WME, i.e., WME 2.0.

[1] http://alexabe.pbworks.com/f/Dictionary+of+Medical+Terms+4th+Ed.-+(Malestrom).pdf
[2] http://alexabe.pbworks.com/f/Dictionary+of+Medical+Terms+4th+Ed.-+(Malestrom).pdf
[3] http://alt.qcri.org/semeval2015/task6/
[4] http://sentiwordnet.isti.cnr.it/
[5] http://sentic.net/downloads/
[6] https://www.cs.uic.edu/
[7] http://neuro.imm.dtu.dk/wiki/
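Concretely, one entry of the lexicon carrying the features discussed above can be pictured as a plain record. The sketch below is only an illustration (the class and field names are ours, not the lexicon's released format); it stores the amnesia example from Section 1 and checks the score ranges stated in Section 3.2.

```python
# Illustrative layout of a single WME 3.0 entry; field names are ours,
# not the lexicon's actual storage format.
from dataclasses import dataclass
from typing import List


@dataclass
class WMEEntry:
    concept: str
    category: str       # disease / drug / symptom / human anatomy / MMT
    pos: str
    gloss: str
    sentiment: str      # positive / negative / neutral
    polarity: float     # polarity score in [-1, 1]
    ssw: List[str]      # similar sentiment words
    affinity: float     # concept-to-SSW link strength in [0, 1]
    gravity: float      # concept-to-gloss sentiment relevance in [-1, 1]

    def check_ranges(self) -> None:
        # Score ranges as defined for the WME 2.0/3.0 features (Section 3).
        assert 0.0 <= self.affinity <= 1.0
        assert -1.0 <= self.gravity <= 1.0
        assert -1.0 <= self.polarity <= 1.0


amnesia = WMEEntry(
    concept="amnesia", category="disease", pos="noun",
    gloss="loss of memory sometimes including the memory of personal "
          "identity due to brain injury, shock, fatigue, repression, "
          "or illness or sometimes induced by anesthesia.",
    sentiment="negative", polarity=-0.375,
    ssw=["memory loss", "blackout", "fugue", "stupor"],
    affinity=0.429, gravity=0.170,
)
amnesia.check_ranges()
```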

Another 3,771 medical concepts and an additional category feature were newly added to WME 3.0. Finally, WME 3.0 contains 10,186 medical concepts and their POS, categories, affinity scores, gravity scores, polarity scores, sentiments, and SSWs. To identify the additional medical concepts, we have employed the conventional WordNet[8] and the MedicineNet[9] resource. Thereafter, we have written a script to extract new medical concepts that are semantically related (i.e., by common POS as well as sentiment) to the medical concepts of WME 2.0. Besides, SentiWordNet, SenticNet, Bing Liu's subjective list, Taboada's adjective list, and the previously mentioned preprocessed medical dictionary helped to assign all features except category to the 3,771 newly added medical concepts.

[8] https://wordnet.princeton.edu/
[9] http://www.medicinenet.com/script/main/hp.asp

Thereafter, after examining the nature of the medical concepts, we considered four different types of categories for this research, namely diseases, drugs, symptoms, and human anatomy. In WME 3.0, all concepts are tagged with either one of these four categories or the MMT category. MMT represents the miscellaneous medical terms, which covers the uncategorized and unrecognized medical concepts. In order to assign a category to the medical concepts, we have applied a well-known machine learning classifier, Naïve Bayes, on top of the WME 3.0 driven features. The classifier learns from 2,000 manually annotated medical concepts and their categories. Thereafter, the remaining 8,186 medical concepts of WME 3.0 were processed by the classifier, which predicted their categories (Mondal et al., 2017a).

For example, the medical concept ranitidine is assigned the category drug in the WME 3.0 lexicon. Table 1 illustrates a comparative analysis and progress report on WME 1.0, WME 2.0, and WME 3.0 with respect to the coverage of medical concepts, n-gram counts, and other features such as POS, sentiment, polarity score, affinity score, gravity score, and category.

Features                                  WME 1.0   WME 2.0   WME 3.0
No. of Concepts                             6415      6415     10186
n-grams          Uni-gram                   2956      2956      3722
                 Bi-gram                    2837      2837      3866
                 Tri-gram                    622       622      1762
POS              Noun                       4248      4248      7677
                 Verb                       2056      2056      2352
                 Adjective                   111       111       157
Sentiment and    Positive (>= 1)            2800      2800      3227
Polarity score   Negative (< 1)             3615      3615      6959
Affinity score   0 to 0.5                      -      4325      7177
                 0.5 to 1                      -      2090      3009
Gravity score    less than zero                -      2320      3783
                 equal to zero                 -       732      1961
                 greater than zero             -      3363      4442
Category         Disease                       -         -      3243
                 Drug                          -         -      3390
                 Symptom                       -         -      1409
                 Human Anatomy                 -         -       227
                 MMT                           -         -      1917

Table 1: Comparative statistics for the various features of the medical concepts present in WME 1.0, WME 2.0, and WME 3.0.

We have also noticed that the proposed WME 3.0 primarily contains POS as noun, sentiment as negative, category as disease and drug, and n-gram features as uni-grams and bi-grams. These observations help to understand the characteristics of the lexicon and assist in designing various applications, viz. medical annotation and concept network systems. The ability of the lexicon to identify four different types of categories, along with each medical concept's related gloss, from a medical corpus marks the difference between WME 3.0 and already established very large scale semantic networks such as UMLS. Also, the lexicon-driven medical concepts and their features assist in emulating human thought for the recommendation of medical advice, serving as a potential foundation of a higher-order cognitive model in natural language processing (Cambria and Hussain, 2015; Cambria et al., 2011). Finally, the evaluation of WME 3.0, both overall and at the individual feature level, is discussed in the following section.

5 Evaluation

In order to validate our proposed WME 3.0 lexicon, we have conducted the following result analysis. The results show the agreement between two manual annotators to explain the acceptance of the overall lexicon as well as its individual features.
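Returning to the category assignment described in Section 4: the paper applies a Naïve Bayes classifier trained on 2,000 manually annotated concepts, but does not spell out the feature encoding. The toy sketch below therefore assumes simple bag-of-words features over gloss words, with an invented miniature training set; it only illustrates the multinomial Naïve Bayes mechanics, not the authors' actual model.

```python
# Toy multinomial Naive Bayes over gloss words; a sketch of the category
# assignment step, NOT the classifier actually used for WME 3.0.
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """examples: list of (gloss_words, category) pairs. Returns a model."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, cat in examples:
        class_counts[cat] += 1
        word_counts[cat].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def predict_nb(model, words):
    """Pick the category maximizing log prior + Laplace-smoothed likelihood."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_lp = None, float("-inf")
    for cat, n in class_counts.items():
        lp = math.log(n / total)  # class prior
        denom = sum(word_counts[cat].values()) + len(vocab)
        for w in words:           # add-one smoothed word likelihoods
            lp += math.log((word_counts[cat][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = cat, lp
    return best


# Hypothetical rows standing in for the 2,000 annotated training concepts.
train = [
    (["loss", "of", "memory", "brain", "injury"], "disease"),
    (["chronic", "inflammation", "of", "the", "joints"], "disease"),
    (["compound", "used", "to", "reduce", "stomach", "acid"], "drug"),
    (["tablet", "taken", "to", "relieve", "pain"], "drug"),
]
model = train_nb(train)
print(predict_nb(model, ["reduces", "acid", "in", "the", "stomach"]))  # → drug
```

In the paper's setting, the same predict step, trained on the 2,000 annotated concepts, labels the remaining 8,186 concepts with one of the four categories or MMT.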

The agreement has been calculated using Cohen's Kappa coefficient κ, which is defined in Equation 1 (Viera et al., 2005):

    κ = (Pr_a − Pr_e) / (1 − Pr_e),        (1)

where Pr_a is the observed proportion of full agreement between the two annotators, and Pr_e is the proportion of agreement expected by chance, i.e., the random agreement between the annotators.

5.1 Overall Validation of WME 3.0

WME 3.0 has been validated by two manual annotators, both of whom are medical practitioners. The annotators verified the medical concepts together with their category, POS, gloss, affinity score, gravity score, polarity score, SSW, and sentiment features, recorded as the number of yes (agreed) and no (disagreed) judgments. Table 2 shows the values provided by both annotators in terms of agreement-based counts. These counts produced a κ score of 0.79 using Equation 1, which indicates a significantly approved result for the WME 3.0 lexicon.

    No. of Concepts: 10186        Annotator-1
                                  Yes      No
    Annotator-2    Yes           8629     189
                   No             285    1083

Table 2: An agreement analysis between the two annotators to validate the medical concepts and all their features under WME 3.0.

5.2 Individual Feature based Validation of WME 3.0

On the other hand, the same annotators also assisted in validating the individual features of WME 3.0 with respect to the medical concepts. Hence, we split the proposed lexicon into five parts, where each part contains the medical concepts and one of their corresponding primary features, viz. category, POS, gloss, SSW, and sentiment. We have not considered the remaining three features, namely the affinity, gravity, and polarity scores, because these features were derived from the five primary features. Thereafter, the annotators validated the five parts by counting the number of yes (agreed) and no (disagreed) judgments individually. The provided agreement counts were processed with Equation 1 and yield κ scores of 0.89, 0.91, 0.88, 0.82, and 0.92 for category, POS, gloss, SSW, and sentiment, respectively. These κ scores prove the usefulness and quality of the individual features of the medical concepts in WME 3.0. Table 3 shows the agreement statistics between the two annotators for validating the features of the WME 3.0 lexicon, and Table 4 reports the per-category agreement counts.

    No. of Concepts: 10186        Annotator-1
    Feature      Annotator-2      Yes       No     κ score
    Category     Yes             8778       93      0.89
                 No               161     1154
    POS          Yes             9229       52      0.91
                 No                92      813
    Gloss        Yes             8805       97      0.88
                 No               172     1112
    SSW          Yes             8767      137      0.82
                 No               256     1026
    Sentiment    Yes             8727       67      0.92
                 No               124     1268

Table 3: An agreement analysis between the two annotators to validate the category, POS, gloss, SSW, and sentiment features of the medical concepts of WME 3.0.

                                               Annotator-1
    Category (No. of Concepts)  Annotator-2    Yes      No    κ score
    Disease (3243)              Yes           2794      31     0.89
                                No              51     367
    Symptom (1409)              Yes           1214      14     0.87
                                No              26     155
    Drug (3390)                 Yes           2922      34     0.88
                                No              53     381
    Human anatomy (227)         Yes            196       2     0.90
                                No               3      26
    MMT (1917)                  Yes           1652      12     0.91
                                No              28     225

Table 4: An agreement analysis between the two annotators to validate the individual categories of WME 3.0.
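Equation 1 can be checked directly against the Table 2 counts. The small sketch below (the function name and argument order are ours) reproduces the reported overall score of 0.79:

```python
# Cohen's kappa (Equation 1) computed from a 2x2 agreement table.
def cohens_kappa(yy: int, yn: int, ny: int, nn: int) -> float:
    """yy: both annotators said yes; yn: annotator-2 yes / annotator-1 no;
    ny: annotator-2 no / annotator-1 yes; nn: both said no."""
    n = yy + yn + ny + nn
    pr_a = (yy + nn) / n  # observed agreement, Pr_a
    # chance agreement Pr_e from each annotator's yes/no marginals
    pr_e = ((yy + yn) / n) * ((yy + ny) / n) + ((ny + nn) / n) * ((yn + nn) / n)
    return (pr_a - pr_e) / (1 - pr_e)


# Table 2 counts for the overall validation of WME 3.0
print(round(cohens_kappa(8629, 189, 285, 1083), 2))  # → 0.79
```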

We have analyzed the agreement scores for the features of WME 3.0. It is found that all the features of the medical concepts are quite correctly labeled in the lexicon, as presented in Table 3. We have also observed that disagreements occurred due to conceptual mismatches between the two annotators or the place of usage of a few medical concepts for each of the features. For example, the medical concept blood clot is tagged with either the symptom or the disease category. In the case of POS, the medical concept abnormality is labeled as either an adjective or a noun, whereas menstrual cycle refers to either positive or negative sentiment. Such disagreements are very difficult to resolve given the contextual behavior of medical corpora.

Besides, we have studied each type of category, such as disease, symptom, and drug, to justify their presence in the WME 3.0 lexicon. The annotators again helped to validate each of the assigned categories using the agreement analysis shown in Table 4. The supplied agreement counts have been applied to Equation 1, and we found κ scores of 0.89, 0.87, 0.88, 0.90, and 0.91 for the disease, symptom, drug, human anatomy, and MMT categories, respectively.

Finally, we can conclude that the WME 3.0 lexicon assists in increasing the coverage of the medical concepts as well as their features and may be presented as a full-fledged lexicon in the healthcare domain. Also, the lexicon can take a crucial role in designing various applications such as medical annotation, concept network, and relationship identification systems in healthcare (Mondal et al., 2017b).

6 Conclusion and Future Work

The present task has been motivated by the goal of enriching a medical lexicon with additional medical concepts and a feature called category in WME 3.0. In order to prepare the current version, we have employed the previous two versions of WME, viz. WME 1.0 and WME 2.0, along with various well-defined lexicons and a machine learning classifier. WME 3.0 contains 10,186 medical concepts and eight different types of useful features, such as category and gloss.

In addition, we have validated WME 3.0 from two different aspects, namely overall evaluation and the usefulness of its individual features, with the help of two manual annotators. The annotators provided agreement judgments that were processed using Cohen's kappa agreement analysis. Finally, the κ scores showed the importance of WME 3.0 in healthcare. In the future, we will attempt to enhance WME 3.0 with more medical concepts as well as syntactic and semantic features for improving its coverage and quality.

References

Muhammad Z Asghar, Aurangzeb Khan, Fazal M Kundi, Maria Qasim, Furqan Khan, Rahman Ullah, and Irfan U Nawaz. 2014. Medical opinion lexicon: an incremental model for mining health reviews. International Journal of Academic Research, 6(1):295–302.

Muhammad Zubair Asghar, Shakeel Ahmad, Maria Qasim, Syeda Rabail Zahra, and Fazal Masud Kundi. 2016. SentiHealth: creating health-related sentiment lexicon using hybrid approach. SpringerPlus, 5(1):1139.

Olivier Bodenreider. 2004. The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Research, 32(suppl 1):D267–D270.

Andrew Borthwick, John Sterling, Eugene Agichtein, and Ralph Grishman. 1998. Exploiting diverse knowledge sources via maximum entropy in named entity recognition. In Proceedings of the Sixth Workshop on Very Large Corpora, volume 182.

Erik Cambria and Amir Hussain. 2015. Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis. Springer, Cham, Switzerland.

George Miller and Christiane Fellbaum. 1998. WordNet: An Electronic Lexical Database.

Anupam Mondal, Iti Chaturvedi, Dipankar Das, Rajiv Bajpai, and Sivaji Bandyopadhyay. 2015. Lexical Resource for Medical Events: A Polarity Based Approach. In 2015 IEEE International Conference on Data Mining Workshop (ICDMW), pages 1302–1309. IEEE.

Anupam Mondal, Dipankar Das, Erik Cambria, and Sivaji Bandyopadhyay. 2016. WME: Sense, Polarity and Affinity based Concept Resource for Medical Events. In Proceedings of the Eighth Global WordNet Conference, pages 242–246.

Anupam Mondal, Erik Cambria, Dipankar Das, and Sivaji Bandyopadhyay. 2017a. Auto-categorization of medical concepts and contexts. In IEEE Symposium Series on Computational Intelligence (SSCI 2017), Honolulu, Hawaii, USA.

Erik Cambria, Amir Hussain, Tariq Durrani, Catherine Havasi, Chris Eckl, and James Munro. 2010a. Sentic computing for patient centered applications. In IEEE 10th International Conference on Signal Processing Proceedings, pages 1279–1282. IEEE.

Erik Cambria, Amir Hussain, Catherine Havasi, Chris Eckl, and James Munro. 2010b. Towards crowd validation of the UK national health service. In WebSci, Raleigh.

Erik Cambria, Thomas Mazzocco, Amir Hussain, and Chris Eckl. 2011. Sentic medoids: Organizing affective common sense knowledge in a multi-dimensional vector space. In D Liu, H Zhang, M Polycarpou, C Alippi, and H He, editors, Advances in Neural Networks, volume 6677 of Lecture Notes in Computer Science, pages 601–610, Berlin. Springer-Verlag.

Erik Cambria, Soujanya Poria, Rajiv Bajpai, and Björn W Schuller. 2016. SenticNet 4: A Semantic Resource for Sentiment Analysis Based on Conceptual Primitives. In COLING, pages 2666–2677.

Anupam Mondal, Erik Cambria, Dipankar Das, and Sivaji Bandyopadhyay. 2017b. MediConceptNet: An Affinity Score Based Medical Concept Network. In Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2017, Marco Island, Florida, USA, May 22–24, 2017, pages 335–340.

Barry Smith and Christiane Fellbaum. 2004. Medical WordNet: a new methodology for the construction and validation of information resources for consumer health. In Proceedings of the 20th International Conference on Computational Linguistics, page 371. Association for Computational Linguistics.

Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll, and Manfred Stede. 2011. Lexicon-based methods for sentiment analysis. Computational Linguistics, 37(2):267–307.

Anthony J Viera, Joanne M Garrett, et al. 2005. Understanding inter-observer agreement: the kappa statistic. Fam Med, 37(5):360–363.

Erik Cambria. 2016. Affective computing and sentiment analysis. IEEE Intelligent Systems, 31(2):102–107.

Kevin Donnelly. 2006. SNOMED-CT: The advanced terminology and coding system for eHealth. Studies in health technology and informatics, 121:279.

Andrea Esuli and Fabrizio Sebastiani. 2006. SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of LREC, volume 6, pages 417–422. Citeseer.

Christiane Fellbaum. 1998. WordNet. Wiley Online Library.

Bing Liu. 2012. Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1):1–167.