How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements

Chen Shani∗, Nadav Borenstein∗, Dafna Shahaf
The Hebrew University of Jerusalem
{chenxshani, nadav.borenstein, dshahaf}@cs.huji.ac.il

Abstract

Humor is an important social phenomenon, serving complex social and psychological functions. However, despite being studied for millennia, humor is not computationally well understood and is often considered an AI-complete problem.

In this work, we introduce a novel setting in humor mining: automatically detecting funny and unusual scientific papers. We are inspired by the Ig Nobel prize, a satirical prize awarded annually to celebrate funny scientific achievements (example past winner: "Are cows more likely to lie down the longer they stand?"). This challenging task has unique characteristics that make it particularly suitable for automatic learning.

We construct a dataset containing thousands of funny papers and use it to learn classifiers, combining findings from psychology and linguistics with recent advances in NLP. We use our models to identify potentially funny papers in a large dataset of over 630,000 articles. The results demonstrate the potential of our methods, and more broadly the utility of integrating state-of-the-art NLP methods with insights from more traditional disciplines.

1 Introduction

Humor is an important aspect of the way we interact with each other, serving complex social functions (Martineau, 1972). Humor can function either as a lubricant or as an abrasive: it can be used as a key for improving interpersonal relations and building trust (Wanzer et al., 1996; Wen et al., 2015), or help us work through difficult topics. It can also aid in breaking taboos and holding power to account. Enhancing the humor capabilities of machines has tremendous potential to better understand interactions between people, as well as to build more natural human-computer interfaces.

Nevertheless, computational humor remains a long-standing challenge in AI; it requires complex language understanding and manipulation capabilities, creativity, common sense, and empathy. Some even claim that computational humor is an AI-complete problem (Stock and Strapparava, 2002).

As humor is a broad phenomenon, most works on computational humor focus on specific humor types, such as knock-knock jokes or one-liners (Mihalcea and Strapparava, 2006; Taylor and Mazlack, 2004). In this work, we present a novel humor recognition task: identifying quirky, funny scientific contributions. We are inspired by the Ig Nobel prize[1], a satiric prize awarded annually to ten scientific achievements that "first make people laugh, and then think". Past Ig Nobel winners include "Chickens prefer beautiful humans" and "Beauty is in the eye of the beer holder: People who think they are drunk also think they are attractive".

Automatically identifying candidates for the Ig Nobel prize provides a unique perspective on humor. Unlike most humor recognition tasks, the humor involved is sophisticated and requires common sense, as well as specialized knowledge and an understanding of scientific culture. On the other hand, the task has several characteristics rendering it attractive: the funniness of a paper can often be recognized from its title alone, which is short, with simple syntax and no complex narrative structure (as opposed to longer jokes). Thus, this is a relatively clean setting in which to explore our methods.

We believe humor in science is also particularly interesting to explore, as humor is strongly tied to creativity. Quirky contributions could sometimes indicate fresh perspectives and pioneering attempts to expand the frontiers of science. For example, Andre Geim won an Ig Nobel in 2000 for levitating a frog using magnets, and a Nobel Prize in Physics in 2010. The Nobel committee explicitly attributed the win to his playfulness (The Royal Swedish Academy of Science, 2010).

∗ Equal contribution
[1] improbable.com/ig-about
Our contributions are:
• We formulate a novel humor recognition task in the scientific domain.
• We construct a dataset containing thousands of funny scientific papers.
• We develop multiple classifiers, combining findings from psychology and linguistics with recent NLP advances. We evaluate them both on our dataset and in a real-world setting, identifying potential Ig Nobel candidates in a large corpus of over 0.6M papers.
• We devise a rigorous, data-driven way to aggregate crowd workers' annotations for subjective questions.
• We release data and code[2].

Beyond the tongue-in-cheek nature of our application, we more broadly wish to promote combining data-driven research with more-traditional work in areas such as psychology. We believe insights from such fields could complement machine learning models, improving performance as well as enriching our understanding of the problem.

[2] github.com/nadavborenstein/Iggy

2 Related Work

Humor in the Humanities. A large body of theoretical work on humor stems from linguistics and psychology. Ruch (1992) divided humor into three categories: incongruity, sexual, and nonsense (and created a three-dimensional humor test to account for them). Since our task is to detect humor in scientific contributions, we believe that the third category can be neglected, under the assumption that no nonsense article would (or at least, should) be published (notable exception: the Sokal hoax (Sokal, 1996)).

The first category, incongruity, was first fully conceptualized by Kant in the eighteenth century (Shaw, 2010). The well-agreed extensions to incongruity theory are the linguistic incongruity-resolution model and the semantic script theory of humor (Suls, 1972; Raskin, 1985). Both state that if a situation ended in a manner that contradicted our prediction (in our case, the title contains an unexpected term) and there exists a different, less likely rule to explain it, the result is a humorous experience. Simply put, the source of humor lies in the violation of expectations. Example Ig Nobel winners include "Will humans swim faster or slower in syrup?" and "Coordination modes in the multisegmental dynamics of hula hooping".

The second category, sex-related humor, is also common among Ig Nobel winning papers. Examples include "Effect of different types of textiles on sexual activity. Experimental study" and "Magnetic resonance imaging of male and female genitals during coitus and female sexual arousal".

Humor Detection in AI. Most computational humor detection work in the context of AI relies on supervised or semi-supervised methods and focuses on specific, narrow types of jokes or humor. Humor detection is usually formulated as a binary text classification problem. Example domains include knock-knock jokes (Taylor and Mazlack, 2004), one-liners (Miller et al., 2017; Simpson et al., 2019; Liu et al., 2018; Mihalcea and Strapparava, 2005; Blinov et al., 2019; Mihalcea and Strapparava, 2006), humorous tweets (Maronikolakis et al., 2020; Donahue et al., 2017; Ortega-Bueno et al., 2018; Zhang and Liu, 2014), humorous product reviews (Ziser et al., 2020; Reyes and Rosso, 2012), TV sitcoms (Bertero and Fung, 2016), short stories (Wilmot and Keller, 2020), cartoon captions (Shahaf et al., 2015), and even "That's what she said" jokes (Hossain et al., 2017; Kiddon and Brun, 2011). Related tasks such as irony, sarcasm and satire have also been explored in similarly narrow domains (Davidov et al., 2010; Reyes et al., 2012; Ptáček et al., 2014).
3 Problem Formulation and Dataset

Our goal in this paper is to automatically identify candidates for the Ig Nobel prize; more precisely, to automatically detect humor in scientific papers.

First, we consider the question of input to our algorithm. Sagi and Yechiam (2008) found a strong correlation between a funny title and a humorous subject in scientific papers. Motivated by this correlation, we manually inspected a subset of Ig Nobel winners. For the vast majority of them, reading the title was enough to determine whether the paper is funny; very rarely did we need to read the abstract, let alone the full paper. Typical past winners' titles include "Why do old men have big ears?" and "If you drop it, should you eat it? Scientists weigh in on the 5-second rule". An example of a non-informative title is "Pouring flows", a paper calculating the optimal way to dunk a biscuit in a cup of tea.

Based on this observation, we decided to focus on the papers' titles. More formally: given a title t of an article, our goal is to learn a binary function ϕ(t) → {0, 1}, reflecting whether the paper is humorous, or 'Ig Nobel-worthy'. The main challenge, of course, lies in the construction of ϕ.

To take a data-driven approach to this problem, we crafted a first-of-its-kind dataset containing titles of funny scientific papers[2]. We started from the 211 Ig Nobel winners. Next, we manually collected humorous papers from online forums and blogs[3], resulting in 1,707 papers. We manually verified that all of these papers can be used as positive examples. In Section 6 we give further indication that these papers are indeed useful for our task.

For negative examples, we randomly sampled 1,707 titles from Semantic Scholar[4] (to obtain a balanced dataset). We then classified each paper into one of the following scientific fields: neuroscience, medicine, biology, or exact sciences[5], and balanced the dataset in a per-field manner. While some of these randomly sampled papers could, in principle, be funny, the vast majority of scientific papers are not (we validated this assumption through sampling).

[3] E.g., reddit.com/r/ScienceHumour, popsci.com/read/funny-science-blog, goodsciencewriting.wordpress.com
[4] api.semanticscholar.org/corpus/
[5] Using scimagojr.com to map venues to fields.
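To make the per-field balancing step concrete, below is a minimal sketch of how such a dataset could be assembled with pandas. It is not the authors' released pipeline; the file names and the `title`/`field` column names are illustrative assumptions.

```python
import pandas as pd

# Hypothetical inputs: funny titles (positives) and a large pool of
# Semantic Scholar titles, both already mapped to a scientific field.
positives = pd.read_csv("funny_titles.csv")       # columns: title, field
candidates = pd.read_csv("semantic_scholar.csv")  # columns: title, field

# Sample negatives so that each field contributes as many serious titles
# as it has funny ones, yielding a per-field balanced dataset.
per_field = positives["field"].value_counts()
negatives = pd.concat([
    candidates[candidates["field"] == field].sample(n=count, random_state=0)
    for field, count in per_field.items()
])

dataset = pd.concat([
    positives.assign(label=1),
    negatives.assign(label=0),
]).sample(frac=1, random_state=0)  # shuffle positives and negatives together
dataset.to_csv("iggy_dataset.csv", index=False)
```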
4 Humor-Theory Inspired Features

In deep learning, architecture engineering has largely taken the place of feature engineering. One of the goals of our work is to evaluate the value of features inspired by domain experts. In this section, we describe and formalize 127 features implementing insights from the humor literature. To validate the predictive power of the features that require training, we divide our data into train and test sets (80%/20%). We now describe the four major feature families.

4.1 Unexpected Language

Research suggests that surprise is an important source of humor (Raskin, 1985; Suls, 1972). Indeed, we notice that titles of Ig Nobel winners often include an unexpected term or unusual language, e.g., "On the rheology of cats", "Effect of coke on sperm motility" and "Pigeons' discrimination of paintings by Monet and Picasso". To quantify unexpectedness, we create several different language models (LMs):

N-gram Based LMs. We train simple N-gram LMs with n ∈ {1, 2, 3} on two corpora: 630,000 titles from Semantic Scholar, and 231,600 one-line jokes (Moudgil, 2016).

Syntax-Based LMs. Here we test the hypothesis that humorous text has a more surprising grammatical structure (Oaks, 1994). We replace each word in our Semantic Scholar corpus with its corresponding part-of-speech (POS) tag[6]. We then train N-gram LMs (n ∈ {1, 2, 3}) on this corpus.

Transformer-Based LMs. We use three different Transformer-based (Vaswani et al., 2017) models: 1) BERT (Devlin et al., 2018), pre-trained on Wikipedia and the BookCorpus; 2) SciBERT (Beltagy et al., 2019), a variant of BERT optimized on scientific text from Semantic Scholar; and 3) GPT-2 (Radford et al., 2019), a large Transformer-based LM trained on a dataset of 8M web pages. We fine-tuned GPT-2 on our Semantic Scholar corpus (details in Appendix C.1).

Using the LMs. For each word in a title, we compute the word's perplexity. For the N-gram LMs and GPT-2, we compute the probability of seeing the word given the previous words in the sentence (the n − 1 previous words in the case of the N-gram models, and all the previous words in the case of GPT-2). For the BERT-based models, we compute the masked loss of the word given the sentence. For each title, we compute the mean, maximum, and variance of the perplexity across all words in the title.

[6] Obtained using NLTK (nltk.org)

4.2 Simple Language

Inspired by previous findings (Ruch, 1992; Gultchin et al., 2019), we hypothesize that titles of funny papers tend to be simpler (e.g., the past Ig Nobel winners "Chickens prefer beautiful humans" and "Walking with coffee: Why does it spill?"). We utilize several simplicity measures:

Length. Short titles and titles containing many short words tend to be simpler. We compute the title length and word lengths (mean, maximum, and variance of word lengths in the title).

Readability. We use the automated readability index (Smith and Senter, 1967).

Age of Acquisition (AoA). A well-established measure of word difficulty in psychology (Brysbaert and Biemiller, 2017), denoting a word's difficulty by the age at which a child acquires it. We compute the mean, maximum, and variance of AoA.

AoA and Perplexity. Many basic words can be found in serious titles (e.g., 'water' in a hydraulics paper). Funny titles, however, contain simple words that are also unexpected. Thus, we combine AoA with perplexity: we compute word perplexity using the Semantic Scholar N-gram LMs and divide it by AoA. Higher values correspond to simpler and more unexpected words. We compute the mean, maximum, minimum, and variance.
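The following sketch illustrates the kind of per-word computation described in "Using the LMs" and "AoA and Perplexity": a Laplace-smoothed bigram model trained with NLTK, per-word surprisal aggregated into mean/maximum/variance features, and an optional surprisal-to-AoA ratio. The smoothing choice, the `aoa` lookup table, and the default AoA value are our assumptions, not the paper's exact implementation.

```python
import statistics
from nltk.lm import Laplace
from nltk.lm.preprocessing import padded_everygram_pipeline

def train_bigram_lm(tokenized_titles):
    """Train a Laplace-smoothed bigram LM over a list of tokenized titles."""
    train, vocab = padded_everygram_pipeline(2, tokenized_titles)
    lm = Laplace(2)
    lm.fit(train, vocab)
    return lm

def surprisal_features(lm, tokens, aoa=None):
    """Aggregate per-word surprisal (-log2 P(word | previous word)) over a
    title; optionally also return surprisal/AoA ratios (higher values mean
    words that are simple yet unexpected)."""
    surprisals, prev = [], "<s>"
    for word in tokens:
        surprisals.append(-lm.logscore(word, [prev]))
        prev = word
    feats = {
        "surprisal_mean": statistics.mean(surprisals),
        "surprisal_max": max(surprisals),
        "surprisal_var": statistics.pvariance(surprisals),
    }
    if aoa is not None:
        ratios = [s / aoa.get(w.lower(), 10.0)  # 10.0: assumed default AoA
                  for s, w in zip(surprisals, tokens)]
        feats["surprisal_over_aoa_mean"] = statistics.mean(ratios)
        feats["surprisal_over_aoa_max"] = max(ratios)
    return feats

# Toy usage: a two-title "corpus" and one query title.
titles = [["on", "the", "rheology", "of", "cats"],
          ["a", "survey", "of", "graph", "neural", "networks"]]
lm = train_bigram_lm(titles)
print(surprisal_features(lm, ["chickens", "prefer", "beautiful", "humans"],
                         aoa={"chickens": 4.0, "prefer": 7.5}))
```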
4.3 Crude Language

According to relief theory, crude and scatological connotations are often considered humorous (Shurcliff, 1968) (e.g., the Ig Nobel winners "Duration of urination does not change with body size" and "Acute management of the zipper-entrapped penis"). We trained a Naive Bayes SVM (Wang and Manning, 2012) classifier over a dataset of toxic and rude Wikipedia comments (Zafar, 2018), and compute the probability that a title is crude. Similar to the AoA feature, we believe that crude words should also be unexpected in order to be considered funny. As before, we divide the perplexity by the word's probability of being benign. Higher values correspond to crude and unexpected words. We compute the mean, maximum, minimum, and variance.

4.4 Funny Language

Some words (e.g., nincompoop, razzmatazz) are inherently funnier than others (due to various reasons surveyed by Gultchin et al. (2019)). It is reasonable to assume that the funniness of a title is correlated with the funniness of its words. We measure funniness using the model of Westbury and Hollis (2019), which quantifies noun funniness based on humor theories and human ratings. We measure the funniness of each noun in a title. We also multiply perplexity by funniness (capturing funny and unexpected words) and use the mean, maximum, minimum, and variance.

4.5 Feature Importance

As a first reality check, we plotted the distribution of our features for funny and not-funny papers (see Appendix A.1 for representative examples). For example, we hypothesized that titles of funny papers might be linguistically similar to one-liners, and indeed we saw that the one-liner LM assigns lower perplexity to funny papers. Similarly, we saw a difference between the readability scores.

To measure the predictive power of our literature-inspired features, we use the Wilcoxon signed-rank test[7] (see Table 1). Interestingly, all feature families include useful features. Combining perplexity with other features (e.g., surprising and simple words) was especially prominent. In the next sections, we describe how we use these features to train models for detecting Ig Nobel-worthy papers.

[7] A non-parametric paired difference test used to assess whether the mean ranks of two related samples differ.

Feature                            Wilcoxon value   P-value
Unexpected Language
  Avg. Semantic Scholar 2-gram LM  4850             3.6e-39
  Avg. POS 2-gram LM               18926            3e-7
  Avg. one-liners 2-gram LM        6919             9.2e-33
  Avg. GPT-2 LM                    7421             2.7e-31
  Avg. BERT LM                     17153            9e-10
Simple Language
  Readability                      8931             8.6e-26
  Title's length                   18493            2.4e-6
  Avg. AoA values                  16768            2.2e-10
  Avg. AoA + 2-gram LM             4882             4.6e-39
Crude Language
  Crudeness classifier             17423            2.3e-9
  Avg. crudeness + 2-gram LM       4755             1.8e-39
Funny Language
  Avg. funny nouns model           20101            1.7e-5
  Avg. funny nouns + 2-gram LM     8886             3.2e-27

Table 1: Wilcoxon values and p-values for representative features on our dataset (testing their ability to differentiate between funny and serious papers). Combining perplexity with other features seems particularly beneficial.
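The feature screening in Section 4.5 boils down to a standard Wilcoxon signed-rank test; a minimal SciPy example is shown below. It assumes the funny and serious titles have been paired (e.g., via the per-field balancing), which is one plausible reading of the setup rather than the authors' documented pairing.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)

# Hypothetical paired feature values: e.g., mean 2-gram perplexity for 500
# funny titles and their 500 matched serious titles.
funny = rng.normal(loc=9.0, scale=1.0, size=500)
serious = rng.normal(loc=8.0, scale=1.0, size=500)

# Non-parametric paired test: do the two samples differ systematically?
statistic, p_value = wilcoxon(funny, serious)
print(f"Wilcoxon statistic = {statistic:.0f}, p-value = {p_value:.2e}")
```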
5 Models

We can now create models to automatically detect scientific humor. As mentioned in Section 4, one of our goals in this paper is to compare the contemporary NLP approach of huge pretrained models with the literature-inspired approach. Thus, we trained a binary multi-layer perceptron (MLP) classifier on our dataset (described in Section 3; see reproducibility details in Appendix C.2), receiving as input the 127 features from Section 4. We named this classifier 'Iggy', after the Ig Nobel prize.

As baselines representing the contemporary NLP approach (requiring huge compute and training data), we used BERT (Devlin et al., 2018) and SciBERT (Beltagy et al., 2019), a BERT variant optimized on scientific corpora, rendering it potentially more relevant for our task. We fine-tuned SciBERT and BERT for Ig Nobel classification using our dataset (see Appendix C.3 for implementation details).

We also experimented with two models combining BERT/SciBERT with our features (see Figure 6 in Appendix C.4), denoted BERTf/SciBERTf. In the spirit of the original BERT paper, we added two linear layers on top of the models and used a standard cross-entropy loss. The input to this final MLP is the concatenation of two vectors: our features' embedding and the last hidden vector from BERT/SciBERT ([CLS]). See Appendix C.4 for implementation details.

For the sake of completeness, we note that we also conducted exploratory experiments with simple syntactic baselines (title length, maximal word length, title containing a question, title containing a colon), as well as with BERT trained on sarcasm detection[8]. None of these baselines was strong enough on its own. We note that the colon baseline tended to catch smart-aleck titles, but the topic was not necessarily funny. The sarcasm baseline achieved near guess-level accuracy (0.482), emphasizing the distinction between the two humor tasks.

[8] kaggle.com/raghavkhemka/sarcasm-detection-using--92-accuracy
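As a rough sketch of the fine-tuning baselines (using the hyper-parameters listed in Appendix C.3), the snippet below fine-tunes a BERT-style encoder for binary title classification with the Hugging Face Trainer. The `TitleDataset` wrapper and the tiny toy training set are ours; the checkpoint name is the public SciBERT release, but this is not the authors' exact training script.

```python
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "allenai/scibert_scivocab_uncased"  # or "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint,
                                                           num_labels=2)

class TitleDataset(torch.utils.data.Dataset):
    """Wraps (title, label) pairs as fixed-length token tensors."""
    def __init__(self, titles, labels):
        self.enc = tokenizer(titles, truncation=True, padding="max_length",
                             max_length=128)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

# Toy training set; in practice, the titles and labels from Section 3.
train_ds = TitleDataset(["On the rheology of cats", "Pouring flows"], [1, 0])

args = TrainingArguments(output_dir="ig-nobel-clf", learning_rate=5e-5,
                         num_train_epochs=3, per_device_train_batch_size=32,
                         weight_decay=0.01, warmup_ratio=0.1, seed=42)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```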

6 Evaluation on the Dataset

We first evaluate the five models (Iggy, SciBERT, BERT, SciBERTf and BERTf) on our labeled dataset, in terms of general accuracy and Ig Nobel retrieval ability. As naive baselines, we add two bag-of-words (BoW) based classifiers: random forest (RF) and logistic regression (LR).

Accuracy. We randomly split the dataset into train, development, and test sets (80%−10%−10%), and used the development set to tune hyper-parameters (e.g., learning rate, number of training epochs). Table 2 summarizes the results. We note that all five models achieve very high accuracy scores and that the simple BoW models fall behind. This gives some indication of the inherent difficulty of the task. Both the feature-based Iggy and the BERT-based models outperform the simple baselines. SciBERTf outperforms the other models across all measures.

Model      Accuracy   Precision   Recall
Iggy       0.897      0.901       0.893
SciBERT    0.910      0.911       0.911
SciBERTf   0.922      0.919       0.926
BERT       0.904      0.906       0.893
BERTf      0.900      0.899       0.902
RF         0.761      0.746       0.796
LR         0.781      0.754       0.837

Table 2: Accuracy of the different models on our dataset, using cross validation with k=5. SciBERTf outperforms.

Ig Nobel Winners Retrieval. Our positive examples consist of 211 Ig Nobel winners and an additional 1,496 humorous papers found on the web. Thus, the portion of real Ig Nobel winning papers in our data is relatively small. We now measure whether our web-originated papers serve as a good proxy for Ig Nobel winners. Thus, we split the dataset differently: the test set consists of the 211 Ig Nobel winners, plus a random sample of 211 negative titles (slightly increasing the test set size to 12%). The train set consists of the remaining 2,992 papers. This experiment follows our initial inspiration of finding Ig Nobel-worthy papers, as we test our models' ability to retrieve only the real winners.

Model      Accuracy   Precision   Recall
Iggy       0.884      0.913       0.848
SciBERT    0.882      0.909       0.848
SciBERTf   0.903      0.921       0.882
BERT       0.863      0.905       0.810
BERTf      0.903      0.921       0.882
RF         0.713      0.708       0.725
LR         0.765      0.755       0.787

Table 3: Accuracy of the different models on our Ig Nobel retrieval test set. The combination of SOTA pretrained models and our features is superior.

Table 3 demonstrates that our web-based funny papers are indeed a good proxy for Ig Nobel winners. Similar to the previous experiment, the combination of SOTA pretrained models with literature-based features is superior.

Based on both experiments, we conclude that our features are indeed informative for our Ig Nobel-worthy paper detection task.
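The numbers in Table 2 come from 5-fold cross validation; the sketch below shows one way to reproduce that protocol for a feature-based classifier with scikit-learn. The random feature matrix and labels are placeholders standing in for the 127 Iggy features and the funny/serious labels.

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.neural_network import MLPClassifier

# Placeholders for the 127 literature-based features and the binary labels
# (1 = funny, 0 = serious) of the 3,414 titles in the dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(3414, 127))
y = rng.integers(0, 2, size=3414)

clf = MLPClassifier(hidden_layer_sizes=(256,), max_iter=300)
scores = cross_validate(clf, X, y, cv=5,
                        scoring=("accuracy", "precision", "recall"))
for metric in ("test_accuracy", "test_precision", "test_recall"):
    print(metric, round(scores[metric].mean(), 3))
```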

7 Evaluation "in the Wild"

Our main motivation in this work is to recommend papers worthy of an Ig Nobel prize. In this section, we test our models in a more realistic setting: we run them on a large sample of scientific papers, ranking each paper according to the model's certainty in the label ('humorous'), and identifying promising candidates. We use the same dataset of 630k papers from Semantic Scholar used for training the LMs (Section 4). We compute funniness according to our models (excluding random forest and logistic regression, which performed poorly). Table 4 shows examples of top-rated titles. We use the Amazon Mechanical Turk (MTurk) crowdsourcing platform to assess the models' performance.

Title                                                                     Models
The kinematics of eating with a spoon: Bringing the food to the mouth,
  or the mouth to the food?                                               Iggy, BERTf, SciBERTf
Do bonobos say NO by shaking their head?                                  Iggy, BERTf, SciBERTf
Is Anakin Skywalker suffering from borderline personality disorder?      Iggy, BERTf, SciBERTf
Not eating like a pig: European wild boar wash their food                 Iggy, BERTf
Why don't chimpanzees in Gabon crack nuts?                                SciBERTf, BERTf
Why do people lie online? "Because everyone lies on the internet"         BERTf
Which type of alcohol is easier on the gut?                               BERTf
Rainbow connection and forbidden subgraphs                                BERT
A scandal of invisibility: making everyone count by counting everyone     SciBERT
Where do we look when we walk on stairs? Gaze behaviour on stairs,
  transitions, and handrails                                              SciBERT

Table 4: A sample of top-rated papers found by our models.

In an exploratory study, we asked people to rate the funniness of titles on a Likert scale of 1-5. We noted that people tended to confuse a funny research topic with a funny title. For example, titles like "Are you certain about SIRT?" or "NASH may be trash" received high funniness scores, even though the research topic is not even clear from the title. To mitigate this problem, we redesigned the study to include two 5-point Likert scale questions: 1) whether the title is funny, and 2) whether the research topic is funny. This addition seems to indeed help workers understand the task better. Example papers rated as serious title, funny topic include "Hat-wearing patterns in spectators attending baseball games: a 10-year retrospective comparison"; funny title, serious topic include "Slicing the psychoanalytic pie: or, shall we bake a new one? Commentary on Greenberg". Unless stated otherwise, the evaluation in the remainder of the paper was done on the "funny topic" Likert scale.

We paid crowd workers $0.04 per title. As this task is challenging, we created a qualification test with 4 titles (8 questions), allowing for one mistake. The code for the task and test can be found in the repository[2]. We also required workers to have completed at least 1,000 approved HITs with at least a 97% success rate.

All algorithms classified and ranked (according to certainty) all 630k papers. However, in any reasonable use-case, only the top of the ranked list will ever be examined. There is a large body of work, both in academia and industry, studying how people interact with ranked lists (in particular, search result pages) (Kelly and Azzopardi, 2015; Beus, 2020). Many information retrieval algorithms assume the likelihood of the user examining a result to decrease exponentially with rank. The conventional wisdom is that users rarely venture into the second page of search results.

Thus, we posit that in our scenario of Ig Nobel recommendations, users will be willing to read only the first several tens of results. We choose to evaluate the top-300 titles for each of our five models, in order to study (in addition to the performance at the top of the list) how performance decays. We also include a baseline of 300 randomly sampled titles from Semantic Scholar. Altogether we evaluated 1,375 titles (due to overlap). Each title was rated by five crowd workers. Overall, 13 different workers passed our test. Seven workers annotated less than 300 titles each, while four annotated above 1,300 each.
Decision rule. Each title was rated by five different crowd workers on a 1-5 scale. There are several reasonable ways to aggregate these five scores into a binary decision. A commonly-used aggregation method is the majority vote, which should return the clear-cut humorous titles. However, we stress that humor is very subjective (and in the case of scientific humor, quite subtle). Indeed, annotators had low agreement on the topic question (average pairwise Spearman ρ = 0.27). Thus, we explored more aggregation methods[9]. Our hypothesis class is of the general form "at least k annotators gave a score of at least m"[10]. To pick the best rule, we conducted two exploratory experiments. In the first, we recruited an expert scientist and thoroughly trained him on the problem. He then rated 90 titles, and we measured the correlation of different aggregations with his ratings. The results are summarized in Table 5: the highest-correlation aggregation is when at least one annotator crossed the 3 threshold (Spearman ρ = 0.7).

In the second experiment, we used the exact same experimental setup as the original task, but with labeled data. We used 100 Ig Nobel winners as positives and a random sample of 100 papers as negatives. The idea was to see how crowd workers rate papers that we know are funny (or not). Table 5 shows the accuracy of each aggregation method. Interestingly, the highest accuracy is achieved with the same rule as in the first experiment (at least one annotator crossing 3). Thus, we chose this aggregation rule. We believe the method outlined in this section could be more broadly applicable to the aggregation of crowdsourced annotations for subjective questions.

[9] For completeness, see Figure 3 in Appendix A.2.
[10] There is a long-running debate about whether it is valid to average Likert scores. We believe we cannot treat the ratings in this study as interval data.

Decision rule       Threshold   Expert corr.   Labeled data accuracy
Min. 1 annotator    3           0.7            0.84
Min. 1 annotator    4           0.49           0.83
Min. 2 annotators   3           0.47           0.82
Min. 2 annotators   4           0.19           0.73
Min. 3 annotators   3           0.15           0.78
Min. 3 annotators   4           0.02           0.62

Table 5: Spearman correlation of MTurk annotators with our expert, along with the accuracy of MTurk annotators on our labeled dataset, for the various mapping methods of the form "minimum (min.) k annotators gave a score of at least m (threshold)".

Model      Precision at k=50   Precision at k=300
Iggy       0.6                 0.37
SciBERT    0.57                0.46
SciBERTf   0.53                0.41
BERT       0.44                0.41
BERTf      0.58                0.43

Table 6: Precision at k of our models on the Semantic Scholar corpus for k={50, 300}. These relatively high scores suggest that our models are able to identify funny papers.
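The selected decision rule ("at least one of the five annotators gave a score of at least 3") and the more general family it was chosen from are straightforward to express in code; the sketch below uses our own function name rather than anything from the released repository.

```python
def aggregate(ratings, k=1, m=3):
    """Binary decision from five 1-5 Likert ratings: positive iff at least
    k annotators gave a score of at least m."""
    return sum(r >= m for r in ratings) >= k

# The rule selected in both exploratory experiments: k=1, m=3.
print(aggregate([1, 2, 2, 4, 1]))            # True  (one rating >= 3)
print(aggregate([2, 2, 2, 2, 2]))            # False
print(aggregate([1, 2, 2, 4, 1], k=3, m=3))  # False (stricter rule)
```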
Figure 1: Precision at k for our chosen decision rule. Iggy outperforms the other models for 0 < k < 50. For larger k, SciBERTf and BERT achieve better precision.

Results. Figure 1 shows precision at k for the top-rated 300 titles according to each model. The random baseline is ∼0.03; upon closer inspection, these seem to be false positives of the annotation. We have argued that in our setting it is reasonable for users to read the first several tens of results. In this range, Iggy slightly outperforms the other four models (BERT is particularly bad, as it picks up on short, non-informative titles). For larger k values, SciBERT and BERTf take the lead. We note that even at k = 300, all models still achieve considerable (absolute) precision.

We obtain similar results using normalized discounted cumulative gain (nDCG), a common measure for ranking quality (see Table 6 for nDCG scores for the top 50 and the top 300 papers). Overall, these relatively high scores suggest that our models are able to identify funny papers.

We stress that Iggy is a small and simple network (∼33k parameters), compared to the pretrained, 110-million-parameter BERT-based models. Yet despite its simplicity, Iggy's performance is roughly comparable to that of the BERT-based methods. We believe this demonstrates the power of implementing insights from domain experts. We hypothesize that if the fine-tuning dataset were larger, BERTf and SciBERTf would outperform the other models.
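For reference, precision at k and nDCG at k over a ranked list with binary crowd judgments can be computed as in the sketch below; the binary-gain and log2 discount choices are the common defaults and are our assumption, not necessarily the paper's exact nDCG variant.

```python
import numpy as np

def precision_at_k(relevance, k):
    """Fraction of the top-k ranked titles judged funny (binary relevance)."""
    return float(np.mean(relevance[:k]))

def ndcg_at_k(relevance, k):
    """nDCG with binary gains and a log2(rank + 1) discount."""
    rel = np.asarray(relevance[:k], dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float(np.sum(rel * discounts))
    ideal = np.sort(np.asarray(relevance, dtype=float))[::-1][:k]
    idcg = float(np.sum(ideal * discounts[:ideal.size]))
    return dcg / idcg if idcg > 0 else 0.0

# relevance[i] = 1 if the i-th ranked title was judged funny by the crowd
# (after the aggregation rule), else 0 -- toy example:
relevance = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
print(precision_at_k(relevance, 5), ndcg_at_k(relevance, 5))
```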

8 Analysis

8.1 Importance of Literature-based Features

Taking a closer look at the actual papers in the experiment of Section 7, the overlap between the three feature-based models is 26−56% (for 1 < k < 50) and 39−62% (for 1 < k < 300). BERT had very low overlaps with all other models (0% in the top 50, 10% in all 300). SciBERT had almost no overlap in the top 50 (maximum 2%), and 10−40% in all 300 (see full details in Appendix A.3). We believe this implies that the features were indeed important and informative for both BERTf and SciBERTf.

8.2 Interpreting Iggy

We have seen that Iggy performs surprisingly well, given its relative simplicity. In this section, we wish to better understand the reasons. We chose to analyze Iggy with Shapley additive explanations (SHAP) (Lundberg and Lee, 2017). SHAP is a feature attribution method that explains the output of any black-box model, shown to be superior to more traditional feature importance methods. Importantly, SHAP provides insights both globally and locally (i.e., for specific data points).

Global interpretability. We compute feature importance globally. Among the top contributing features we see multiple features corresponding to incongruity (both alone and combined with funniness) and to word/sentence simplicity. Interestingly, features based on the one-liner jokes seem to play an important role (see Figure 4 in Appendix A.4).

Local interpretability. To understand how Iggy errs, we examined the SHAP decision plots for false positives and false negatives (see Figure 5 in Appendix A.4). These show the contribution of each feature to the final prediction for a given title, and thus can help "debug" the model.

Looking at false negatives, it appears that various perplexity features misled Iggy, while the funniness and joke LM features steered it in the right direction. We see a contrary trend in false positives: perplexity helped, and the joke LM confused the classifier.

We also observe that the model learned that a long title is an indication of a serious paper. We expected our rudeness classifier to play a bigger role in some of the titles (e.g., "Adaptive interpopulation differences in blue tit life-history traits on Corsica"), but the signal was inconclusive, perhaps indicating that our rudeness classifier is lacking.
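A model-agnostic SHAP analysis like the one described here can be reproduced with the shap package's KernelExplainer, as sketched below on a stand-in MLP; the synthetic features and the `iggy_model` stand-in are placeholders, not the trained Iggy classifier.

```python
import numpy as np
import shap
from sklearn.neural_network import MLPClassifier

# Stand-ins for Iggy and its 127-dimensional feature matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 127))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
iggy_model = MLPClassifier(hidden_layer_sizes=(256,), max_iter=500).fit(X, y)

# Model-agnostic SHAP values: explain P(funny) for a handful of titles.
background = shap.sample(X, 50)   # background distribution for the explainer
explainer = shap.KernelExplainer(
    lambda data: iggy_model.predict_proba(data)[:, 1], background)
shap_values = explainer.shap_values(X[:5])

# Global view: summary plot of per-feature contributions.
shap.summary_plot(shap_values, X[:5], show=False)
```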
8.3 Observations

We now take a more qualitative approach to understanding the models. First, we set out to explore whether the models confuse funny titles and funny topics. Using the crowdsourced annotations from Section 7, we measure the portion of this mistake in the top-rated 300 titles of all five models. That is, we check in how many cases our models classify a title as "Ig Nobel-worthy" while the workers have classified it as "funny title and non-funny topic". Iggy had the highest degree of such confusion (0.28). Similarly, BERTf and SciBERTf exhibit more confusion than the versions without features (0.24 and 0.19, compared to 0.13 and 0.08). The random baseline is 0.02. Examples of this kind of error include "A victim of the Occam's razor.", "While waiting to buy a Ferrari, do not leave your current car in the garage!", and "Reinforcement learning: The good, the bad and the ugly?". All were classified as Ig Nobel-worthy, although their topic is serious (or even unclear from the title).

Looking closer at the data, we observe that a high portion of these are editorials with catchy titles. As our dataset does not differentiate between editorials and real research contributions, filtering editorials is not straightforward. Interestingly, the portion of editorials is also greater in the area of lowest annotator agreement, hinting that this confusion also occurs in humans.

In addition to editorials, we notice another category of papers causing the same type of confusion: papers dealing with disturbing or unfortunate topics (violence, death, sexual abuse), whose titles include literary devices used to lighten the mood. Censored (for the readers' own well-being) examples include "Licorice for hepatitis C: yum-yum or just ho-hum?" and "The song of the siren: Dealing with masochistic thoughts and behaviors".

A note on scientific disciplines. Another observation concerns the portion of Ig Nobel-worthiness across the different scientific disciplines. We notice that most papers classified by our models as funny belong to the social sciences ("Dogs can discriminate human smiling faces from blank expressions") or medicine ("What, if anything, can monkeys tell us about human amnesia when they can't say anything at all?"), compared to the exact sciences ("The kinematics of eating with a spoon: bringing the food to the mouth, or the mouth to the food?"). We believe this might be the case since, quite often, social sciences and medicine papers study topics that are more familiar to the layperson. We also note that although our models performed about the same across the different disciplines, they were slightly better in psychology.

9 Conclusions & Future Work

In this work, we presented a novel task in humor recognition: detecting funny and unusual scientific papers, a task that represents a subtle and sophisticated humor type. It has important characteristics (short, simple syntax, stand-alone) making it a (relatively) clean setting in which to explore computational humor.
We created a dataset of funny papers and constructed models, distilling the humor literature into features as well as harnessing SOTA advances in NLP. We conducted experiments both on our dataset and in a real-world setting, identifying funny papers in a corpus of over 0.6M papers. All models were able to identify funny papers, achieving high nDCG scores. Interestingly, despite the simplicity of the literature-based Iggy, its performance was overall comparable to the complex, BERT-based models.

Our dataset can be further used for various humor-related tasks. For example, it is possible to use it to create an aligned corpus, pairing every funny paper title with a nearly identical but serious title, using methods similar to West and Horvitz (2019). This would allow us to understand why a paper is funny at a finer granularity, by identifying the exact words that make the difference. This technique would also allow exploring different types of "funny".

Another possible use of our dataset is to collect additional meta-data about the papers (e.g., citations, author information) to explore questions about whether funny science achieves disproportionate attention and engagement, and who tends to produce it (and at which career stage), with implications for the science of science and science communication.

Another interesting direction is to expand beyond paper titles and consider the paper abstract, or even the full text. This could be useful in examples such as the Ig Nobel winner "Cure for a Headache", which takes inspiration from woodpeckers to help cure headaches in humans.

Finally, we believe multi-task learning is a direction worth pursuing towards creating a more holistic and robust humor classifier. In multi-task learning, the learner is challenged to solve multiple problems at the same time, often resulting in better generalization and better performance on each individual task (Ruder, 2017). As multi-task learning enables unraveling cross-task similarities, we believe it might be particularly fruitful to apply it to tasks highlighting different aspects of humor. We believe our dataset, combined with other task-specific humor datasets, could assist in pursuing such a direction.

Despite the tongue-in-cheek nature of our task, we believe that computational humor has tremendous potential to create personable interactions, and can greatly contribute to a range of NLP applications, from chatbots to educational tutors. We also wish to promote complementing data-driven research with insights from more-traditional fields. We believe combining such insights could, in addition to improving performance, enrich our understanding of core aspects of being human.

Acknowledgments

We thank the reviewers for their insightful comments. We thank Omri Abend, Michael Doron, Meirav Segal, Ronen Tamari and Moran Mizrahi for their help, and Shuki Cohen for preliminary discussions. This work was supported by the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant no. 852686, SIAM), the US National Science Foundation, US-Israel Binational Science Foundation (NSF-BSF) grant no. 2017741, and Amazon Research Awards.

References

Iz Beltagy, Arman Cohan, and Kyle Lo. 2019. SciBERT: Pretrained contextualized embeddings for scientific text. arXiv preprint arXiv:1903.10676.

Dario Bertero and Pascale Fung. 2016. A long short-term memory framework for predicting humor in dialogues. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 130–135.

Johannes Beus. 2020. Why (almost) everything you knew about Google CTR is no longer valid.

Vladislav Blinov, Valeria Bolotova-Baranova, and Pavel Braslavski. 2019. Large dataset and language model fun-tuning for humor recognition. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4027–4032.

Marc Brysbaert and Andrew Biemiller. 2017. Test-based age-of-acquisition norms for 44 thousand English word meanings. Behavior Research Methods, 49(4):1520–1523.

Dmitry Davidov, Oren Tsur, and Ari Rappoport. 2010. Semi-supervised recognition of sarcastic sentences in Twitter and Amazon. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning, pages 107–116. Association for Computational Linguistics.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

David Donahue, Alexey Romanov, and Anna Rumshisky. 2017. HumorHawk at SemEval-2017 Task 6: Mixing meaning and sound for humor recognition. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 98–102.

Limor Gultchin, Genevieve Patterson, Nancy Baym, Nathaniel Swinger, and Adam Tauman Kalai. 2019. Humor in word embeddings: Cockamamie gobbledegook for nincompoops. arXiv preprint arXiv:1902.02783.

Nabil Hossain, John Krumm, Lucy Vanderwende, Eric Horvitz, and Henry Kautz. 2017. Filling the blanks (hint: plural noun) for Mad Libs humor. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 638–647.

Diane Kelly and Leif Azzopardi. 2015. How many results per page? A study of SERP size, search behavior and user experience. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 183–192.

Chloe Kiddon and Yuriy Brun. 2011. That's what she said: Double entendre identification. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers - Volume 2, pages 89–94. Association for Computational Linguistics.

Lizhen Liu, Donghai Zhang, and Wei Song. 2018. Modeling sentiment association in discourse for humor recognition. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 586–591.

Scott M Lundberg and Su-In Lee. 2017. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pages 4765–4774.

Antonios Maronikolakis, Danae Sánchez Villegas, Daniel Preoţiuc-Pietro, and Nikolaos Aletras. 2020. Analyzing political parody in social media. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4373–4384.

William H Martineau. 1972. A model of the social functions of humor. The Psychology of Humor, pages 101–125.

Rada Mihalcea and Carlo Strapparava. 2005. Making computers laugh: Investigations in automatic humor recognition. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 531–538.

Rada Mihalcea and Carlo Strapparava. 2006. Learning to laugh (automatically): Computational models for humor recognition. Computational Intelligence, 22(2):126–142.

Tristan Miller, Christian F Hempelmann, and Iryna Gurevych. 2017. SemEval-2017 Task 7: Detection and interpretation of English puns. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 58–68.

Abhinav Moudgil. 2016. Short-jokes-dataset.

Dallin D Oaks. 1994. Creating structural ambiguities in humor: Getting English grammar to cooperate. Humor, 7(4):377–402.

Reynier Ortega-Bueno, Carlos E Muñiz-Cuza, José E Medina Pagola, and Paolo Rosso. 2018. UO UPV: Deep linguistic humor detection in Spanish social media. In Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018) co-located with the 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018).

Tomáš Ptáček, Ivan Habernal, and Jun Hong. 2014. Sarcasm detection on Czech and English Twitter. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pages 213–223.
Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9.

Victor Raskin. 1985. Semantic theory. In Semantic Mechanisms of Humor, pages 59–98. Springer.

Antonio Reyes and Paolo Rosso. 2012. Making objective decisions from subjective data: Detecting irony in customer reviews. Decision Support Systems, 53(4):754–760.

Antonio Reyes, Paolo Rosso, and Davide Buscaldi. 2012. From humor recognition to irony detection: The figurative language of social media. Data & Knowledge Engineering, 74:1–12.

Willibald Ruch. 1992. Assessment of appreciation of humor: Studies with the 3 WD humor test. Advances in Personality Assessment, 9:27–75.

Sebastian Ruder. 2017. An overview of multi-task learning in deep neural networks. ArXiv, abs/1706.05098.

Itay Sagi and Eldad Yechiam. 2008. Amusing titles in scientific journals and article citation. Journal of Information Science, 34(5):680–687.

Nobel Prize Committee, The Royal Swedish Academy of Science. 2010. Press release. https://www.nobelprize.org/prizes/physics/2010/press-release/. [Online; 5 October 2010].

Dafna Shahaf, Eric Horvitz, and Robert Mankoff. 2015. Inside jokes: Identifying humorous cartoon captions. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1065–1074.

Joshua Shaw. 2010. Philosophy of humor. Philosophy Compass, 5(2):112–126.

Arthur Shurcliff. 1968. Judged humor, arousal, and the relief theory. Journal of Personality and Social Psychology, 8(4p1):360.

Edwin Simpson, Erik-Lân Do Dinh, Tristan Miller, and Iryna Gurevych. 2019. Predicting humorousness and metaphor novelty with Gaussian process preference learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5716–5728.

Edgar A Smith and RJ Senter. 1967. Automated readability index. AMRL-TR. Aerospace Medical Research Laboratories (US), page 1.

Alan D Sokal. 1996. Transgressing the boundaries: Toward a transformative hermeneutics of quantum gravity. Social Text, (46/47):217–252.
Oliviero Stock and Carlo Strapparava. 2002. Hahacronym: Humorous agents for humorous acronyms. Stock, Oliviero, Carlo Strapparava, and Anton Nijholt. Eds, pages 125–135.

Jerry M Suls. 1972. A two-stage model for the appreciation of jokes and cartoons: An information-processing analysis. The Psychology of Humor: Theoretical Perspectives and Empirical Issues, 1:81–100.

Julia M Taylor and Lawrence J Mazlack. 2004. Computationally recognizing wordplay in jokes. In Proceedings of the Annual Meeting of the Cognitive Science Society, volume 26.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008.

Sida I Wang and Christopher D Manning. 2012. Baselines and bigrams: Simple, good sentiment and topic classification. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 90–94.

Melissa Bekelja Wanzer, Melanie Booth-Butterfield, and Steve Booth-Butterfield. 1996. Are funny people popular? An examination of humor orientation, loneliness, and social attraction. Communication Quarterly, 44(1):42–52.

Miaomiao Wen, Nancy Baym, Omer Tamuz, Jaime Teevan, Susan T Dumais, and Adam Kalai. 2015. OMG UR funny! Computer-aided humor with an application to chat. In ICCC, pages 86–93.

Robert West and Eric Horvitz. 2019. Reverse-engineering satire, or "paper on computational humor accepted despite making serious advances". arXiv preprint arXiv:1901.03253.

Chris Westbury and Geoff Hollis. 2019. Wriggly, squiffy, lummox, and boobs: What makes some words funny? Journal of Experimental Psychology: General, 148(1):97.

David Wilmot and Frank Keller. 2020. Modelling suspense in short stories as uncertainty reduction over neural representation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1763–1788, Online. Association for Computational Linguistics.

Zafar. 2018. Cleaned Toxic Comments.

Renxian Zhang and Naishi Liu. 2014. Recognizing humor on Twitter. In Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, pages 889–898. ACM.

Yftah Ziser, Elad Kravi, and David Carmel. 2020. Humor detection in product question answering systems. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 519–528.

A Supplementary Figures

A.1 Dataset Analysis

In Section 4 we presented 127 humor-literature-based features. Here we present the distribution of two example features in funny vs. serious papers in our dataset (described in Section 3). These examples represent the general trend, as many features show predictive power (see Figure 2).

A.2 "In the Wild" Study Results

For the "in the wild" evaluation executed using Semantic Scholar data, we used crowdsourced annotations (see Section 7). Each title was rated by five different crowd workers on a 1-5 scale, while our models provide a binary decision. There are several reasonable ways to aggregate these five scores into a binary decision. We chose a rule in a data-driven manner (see "Decision rule" in Section 7). For completeness, here we show the commonly-used aggregation method of majority vote: Figure 3 shows the precision at k of our five models using the majority vote aggregation rule with a cutoff at 3. Iggy outperforms until k = 30, where SciBERTf takes the lead.

A.3 Models' Overlap

In Section 8.1 we discuss the importance of our literature-based features by showing that the models that received them as input indeed found them useful. The overlap was measured on the top 50 and top 300 papers retrieved using our five models on the Semantic Scholar data (see Section 7 for the full experimental setup). The overlap between the 3 feature-based models was found to be high (see Table 7). Both BERT and SciBERT had very low overlaps with all other models. We believe this implies that the features were indeed important for our SOTA-based models, BERTf and SciBERTf.

A.4 SHAP Analysis

In Section 8.2 we analysed Iggy using SHAP (Lundberg and Lee, 2017). We compute feature importance globally (Figure 4). To understand how Iggy errs, we examined the SHAP decision plots for false positives and false negatives (Figure 5). Decision plots show the contribution of each feature to the final prediction for a given title; thus, they can help "debug" the model's mistakes.

B Reproducibility

B.1 Code and Data Availability

Dataset, code, and data files can be found in our GitHub repository[2].

C Implementation Details

C.1 Fine-Tuning GPT-2 LM

To fine-tune GPT-2 we used Huggingface's Transformers package[11]. We fine-tuned the model using a learning rate of 5e−5, one epoch, a batch size of 4, weight decay = 0, max gradient norm = 1 and random seed = 42. Optimization was done using Adam with epsilon = 1e−8. Model configurations were set to default.

[11] huggingface.co/transformers/

C.2 Iggy Classifier

We used a simple MLP with a single hidden layer of 256 neurons. We trained the MLP until convergence, using the Adam optimizer, a learning rate of 0.001 and an L2 penalty of 2.

C.3 Fine-tuning SciBERT & BERT

To fine-tune SciBERT & BERT we used Huggingface's Transformers package. We fine-tuned both models with a learning rate of 5e−5 for 3 epochs, with a batch size of 32, a maximal sequence length of 128 and random seed = 42. Optimization was done using Adam with warm-up = 0.1 and weight decay of 0.01. Model configurations were set to default.

C.4 SciBERTf & BERTf Models

As specified in Section 5, these models were constructed as follows (see Figure 6). Each model has two inputs: the raw text of the title, and a vector of our 127 features. The feature vector is fed to an MLP with a single hidden layer of 512 neurons and an output size of 512 neurons as well. The raw text is fed to a frozen SciBERT/BERT model, from which we collect the last hidden vector ([CLS]). Next, we concatenate this vector to the output of the features-MLP network and pass the result to a second MLP with a single hidden layer of 1,024 neurons. The output of this MLP is then fed to a Softmax layer, which represents the final prediction of the model.

We train the model using a cross-entropy loss and the same parameters that were used to train the vanilla SciBERT/BERT models. Those parameters are described in Appendix C.3.
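A PyTorch sketch of the C.4 architecture (Figure 6) is given below: a frozen BERT/SciBERT encoder supplies the [CLS] vector, a feature MLP embeds the 127 hand-crafted features, and a second MLP over the concatenation produces the two-class output, trained with cross-entropy. Layer sizes follow this appendix; the class name, forward signature, and checkpoint choice are our own assumptions rather than the released code.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class CombinedClassifier(nn.Module):
    """BERTf / SciBERTf sketch: frozen encoder [CLS] vector plus the 127
    hand-crafted features, combined by a 2-layer MLP head."""
    def __init__(self, checkpoint="allenai/scibert_scivocab_uncased",
                 n_features=127):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)
        for p in self.encoder.parameters():   # the encoder stays frozen
            p.requires_grad = False
        self.feature_mlp = nn.Sequential(     # 127 -> 512 -> 512
            nn.Linear(n_features, 512), nn.ReLU(), nn.Linear(512, 512))
        hidden = self.encoder.config.hidden_size        # 768 for BERT-base
        self.head = nn.Sequential(            # concat -> 1024 -> 2 classes
            nn.Linear(hidden + 512, 1024), nn.ReLU(), nn.Linear(1024, 2))

    def forward(self, input_ids, attention_mask, features):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]     # [CLS] token representation
        combined = torch.cat([cls, self.feature_mlp(features)], dim=-1)
        return self.head(combined)   # logits; softmax folded into CE loss

model = CombinedClassifier()
```

Training this sketch with nn.CrossEntropyLoss applies the softmax implicitly, which corresponds to the Softmax output layer described above.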
Figure 2: Distribution and Gaussian fit of two representative features: one-liners 2-gram LM mean perplexity (left) and automated readability index (right), indicating the predictive power of these features.

Figure 3: Precision at k for majority vote. Iggy outperforms until k = 30 and BERTf takes the lead afterwards.

Figure 4: Feature importance SHAP analysis of the Iggy model. The top variables according to this plot contribute more than the bottom ones (have high predictive power). The analysis reveals that the highest contribution corresponds to short, funny, and simple words (where simplicity was measured using features such as AoA and readability). We also notice that features based on the one-liners LMs contributed much to the final prediction, meaning that there is indeed some similarity between funny titles and short jokes.

           Iggy          BERTf         SciBERTf      BERT
BERTf      0.56 / 0.62
SciBERTf   0.26 / 0.39   0.36 / 0.48
BERT       0 / 0         0 / 0.1       0 / 0.1
SciBERT    0 / 0.3       0.02 / 0.4    0.02 / 0.2    0 / 0.1

Table 7: Models' overlap for the top-rated 50 and 300 titles (the left number in a cell is the overlap in the top 50, the right number in the top 300). The overlap between the 3 feature-based models was found to be high compared with BERT and SciBERT. We believe this implies that the features were indeed important for our SOTA-based models, BERTf and SciBERTf.

(a) SHAP decision plot for the 12 false negatives of Iggy from our test set. Perplexity features misled Iggy, while funniness and joke LM features provided informative input.

(b) SHAP decision plot for the 11 false positives of Iggy from our test set. Perplexity helped shift the output towards the correct label; joke LM features confused the classifier.

Figure 5: SHAP decision plots for Iggy's false negatives and false positives from our test set. Decision plots show the contribution of each feature to the final prediction for a given data point. Starting at the bottom of the plot, the prediction line shows how the SHAP values (i.e., the feature effects) accumulate to arrive at the model's final score at the top of the plot. To get a better intuition, one can think of it in terms of a linear model where the sum of effects, plus an intercept, equals the prediction.

Figure 6: The flow of SciBERTf/BERTf. A 2-layer MLP receives as input the concatenation of two vectors: our features' embedding and the last hidden vector ([CLS]) from BERT/SciBERT.