Tutorial: ‘Computational Sarcasm’

Pushpak Bhattacharyya1 Aditya Joshi1,2,3 1IIT Bombay, India 2Monash University, Australia 3IITB-Monash Research Academy, India {pb, adityaj}@cse.iitb.ac.in

1 Description timent analysis, etc. The tutorial is motivated by our continually evolving survey paper of sarcasm Sarcasm is a form of verbal irony that is intended detection, that is available on arXiv at: Joshi, to express contempt or ridicule. Motivated by chal- Aditya, Pushpak Bhattacharyya, and Mark James lenges posed by sarcastic text to sentiment analy- Carman. ”Automatic Sarcasm Detection: A Sur- sis, computational approaches to sarcasm have wit- vey.” arXiv preprint arXiv:1602.03426 (2016). nessed a growing interest at NLP forums in the past decade. Computational sarcasm refers to au- 2 Tutorial Outline tomatic approaches pertaining to sarcasm. The tu- torial will provide a bird’s-eye view of the research This outline has been planned for 180 minutes. (95 in computational sarcasm for text, while focusing minutes + 85 minutes = 180 minutes) on significant milestones. 1. Introduction (15 minutes) The tutorial begins with linguistic theories of sarcasm, with a focus on incongruity: a useful no- • Sentiment analysis & Computational Sar- tion that underlies sarcasm and other forms of figu- casm (3 minutes) rative language. Since the most significant work in • Motivation: Application to chatbots and computational sarcasm is sarcasm detection: pre- sentiment analysis (3 minutes) dicting whether a given piece of text is sarcastic or • Challenges of computational sarcasm (5 not, sarcasm detection forms the focus hereafter. minutes) We begin our discussion on sarcasm detection with • Scope of the tutorial (4 minutes) datasets, touching on strategies, challenges and na- ture of datasets. Then, we describe algorithms for 2. Linguistic Foundations of Sarcasm (15 min- sarcasm detection: rule-based (where a specific ev- utes) idence of sarcasm is utilised as a rule), statisti- • Linguistic formulations of sarcasm (5 cal classifier-based (where features are designed for minutes) a statistical classifier), a topic model-based tech- nique, and deep learning-based algorithms for sar- • Relationship between sarcasm, irony, de- casm detection. In case of each of these algo- ception, humor and metaphor (5 minutes) rithms, we refer to our work on sarcasm detection • Incongruity as a key indicator of sarcasm and share our learnings. Since information beyond (5 minutes) the text to be classified, contextual information is useful for sarcasm detection, we then describe ap- 3. Datasets for Sarcasm detection (15 minutes) proaches that use such information through con- • Nature of datasets (5 minutes) versational context or author-specific context. • Strategies of annotation (5 minutes) We then follow it by novel areas in computa- • Challenges in obtaining labeled datasets tional sarcasm such as sarcasm generation, sarcasm (hashtag-based supervision, impact of v/s irony classification, etc. We then summarise non-native annotators, etc.) (5 minutes) the tutorial and describe future directions based on errors reported in past work. The tutorial will 4. Algorithms for Sarcasm detection - 1 (50 min- end with a demonstration of our work on sarcasm utes) detection. • This tutorial will be of interest to researchers in- vestigating computational sarcasm and related ar- • Rule-based algorithms (15 minutes) eas such as computational humour, figurative lan- – Past rule-based algorithms guage understanding, emotion and sentiment sen- – Our work on rule-based algorithms • Statistical classifier-based algorithms (35 University Joseph Fouriere (France). Prof. Bhat- minutes) tacharyya’s research areas are Natural Language – Features Processing, Machine Learning and AI. He has – Learning algorithms guided more than 250 students (PhD, masters – Our work on classifier-based algo- and Bachelors), has published more than 250 rithms research papers and led government and industry projects of international and national importance. 5. -Coffee Break- A significant contribution of his is Multilingual Lexical Knowledge Bases and Projection. Author 6. Algorithms for Sarcasm detection - 2 (30 min- of the text book ‘’ Prof. utes) Bhattacharyya is loved by his students for his • Topic modeling-based algorithms (10 inspiring teaching and mentorship. He is a Fellow minutes) of National Academy of Engineering and recipient • Deep learning-based algorithms for sar- of Patwardhan Award of IIT Bombay and VNMM casm detection (20 minutes) award of IIT Roorkey- both for technology de- velopment, and faculty grants of IBM, Microsoft, – Past deep learning-based work Yahoo and United Nations. – Our work on deep learning-based al- gorithms 2. Aditya Joshi 7. Contextual Information for Sarcasm Detec- IITB-Monash Research Academy, Mumbai, India. tion (15 minutes) (IIT Bombay, India, & Monash University, Aus- tralia) • Conversational context (10 minutes) [email protected], [email protected] • Author’s historical context (5 minutes) http://www.cse.iitb.ac.in/~adityaj/

8. Beyond Sarcasm Detection (20 minutes) Aditya Joshi is a PhD student at IITB-Monash Re- • Sarcasm versus irony detection (6 min- search Academy, a joint PhD programme between utes) Indian Institute of Technology Bombay, India and • Sarcasm generation (6 minutes) Monash University, Australia, since January 2013. His PhD advisors are Pushpak Bhattacharyya • Sarcasm Target Identification (8 minutes) (IITB) and Mark Carman (Monash). His primary 9. Summary (5 minutes) research focus is computational sarcasm where he has explored different ways in which incongruity 10. Future Directions (10 minutes) can be captured in order to detect and generate sarcasm. In addition, he has worked on innovative • Peculiar forms of sarcasm (pointers to applications of NLP such as sentiment analysis for sarcasm detection based on error analy- Indian languages, drunk-texting prediction, news sis in past work.) (5 minutes) headline translation, political issue extraction, etc. • Contextual information for sarcasm (5 minutes)

11. Demonstration (5 minutes) 3 Speaker Profiles 1. Dr. Pushpak Bhattacharyya Indian Institute of Technology Bombay, Mumbai, India & Indian Institute of Technology Patna, Patna, India. [email protected], [email protected] www.cse.iitb.ac.in/~pb/

Prof. Pushpak Bhattacharyya is the current President of ACL (2016-17). He is the Director of IIT Patna and Vijay and Sita Vashee Chair Professor in IIT Bombay, Computer Science and Engineering Department. He was educated in IIT Kharagpur (B.Tech), IIT Kanpur (M.Tech) and IIT Bombay (PhD). He has been visiting scholar and faculty in MIT, Stanford, UT Houston and