Making Computers Laugh: Investigations in Automatic Humor Recognition

Rada Mihalcea
Department of Computer Science
University of North Texas
Denton, TX 76203, USA
[email protected]

Carlo Strapparava
Istituto per la Ricerca Scientifica e Tecnologica
ITC-irst
I-38050, Povo, Trento, Italy
[email protected]

Abstract

Humor is one of the most interesting and puzzling aspects of human behavior. Despite the attention it has received in fields such as philosophy, linguistics, and psychology, there have been only a few attempts to create computational models for humor recognition or generation. In this paper, we bring empirical evidence that computational approaches can be successfully applied to the task of humor recognition. Through experiments performed on very large data sets, we show that automatic classification techniques can be effectively used to distinguish between humorous and non-humorous texts, with significant improvements observed over a priori known baselines.

1 Introduction

... pleasure has probably been the main goal all along. But I hesitate to admit it, because computer scientists want to maintain their image as hard-working individuals who deserve high salaries. Sooner or later society will realize that certain kinds of hard work are in fact admirable even though they are more fun than just about anything else. (Knuth, 1993)

Humor is an essential element in personal communication. While it is often merely considered a way to induce amusement, humor also has a positive effect on the mental state of those using it, and has the ability to improve their activity. Computational humor therefore deserves particular attention, as it has the potential of turning computers into a creative and motivational tool for human activity (Stock et al., 2002; Nijholt et al., 2003).

Previous work in computational humor has focused mainly on the task of humor generation (Stock and Strapparava, 2003; Binsted and Ritchie, 1997), and very few attempts have been made to develop systems for automatic humor recognition (Taylor and Mazlack, 2004). This is not surprising since, from a computational perspective, humor recognition appears to be significantly more subtle and difficult than humor generation.

In this paper, we explore the applicability of computational approaches to the recognition of verbally expressed humor. In particular, we investigate whether automatic classification techniques are a viable approach to distinguishing between humorous and non-humorous text, and we bring empirical evidence in support of this hypothesis through experiments performed on very large data sets.

Since a deep comprehension of humor in all of its aspects is probably too ambitious and beyond existing computational capabilities, we chose to restrict our investigation to the type of humor found in one-liners. A one-liner is a short sentence with comic effects and an interesting linguistic structure: simple syntax, deliberate use of rhetorical devices (e.g., alliteration, rhyme), and frequent use of creative language constructions meant to attract the reader's attention. While longer jokes can have a relatively complex narrative structure, a one-liner must produce the humorous effect "in one shot", with very few words. These characteristics make this type of humor particularly suitable for use in an automatic learning setting, as the humor-producing features are guaranteed to be present in the first (and only) sentence.

We attempt to formulate the humor-recognition problem as a traditional classification task, and feed positive (humorous) and negative (non-humorous) examples to an automatic classifier. The humorous data set consists of one-liners collected from the Web using an automatic bootstrapping process.
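To make the classification formulation concrete, the toy sketch below frames humor recognition as binary text classification with a small multinomial Naive Bayes model. This is only an illustration of the task setup, not the classifiers or features used in our experiments; the handful of training sentences (taken from Table 1) stand in for the real one-liner and Reuters-title data sets.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Minimal tokenizer: lowercase and strip basic punctuation.
    return text.lower().replace(";", " ").replace(",", " ").replace(".", " ").split()

class NaiveBayes:
    """Toy multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of examples
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            tokens = tokenize(text)
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.label_counts:
            # log prior + add-one smoothed log likelihoods of the tokens
            score = math.log(self.label_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Stand-in data: one-liners vs. Reuters titles (examples from Table 1).
train_data = [
    ("Take my advice; I don't use it anyway.", "humorous"),
    ("Beauty is in the eye of the beer holder.", "humorous"),
    ("I get enough exercise just pushing my luck.", "humorous"),
    ("Silver fixes at two-month high, but gold lags.", "non-humorous"),
    ("Oil prices slip as refiners shop for bargains.", "non-humorous"),
    ("Trocadero expects tripling of revenues.", "non-humorous"),
]

clf = NaiveBayes()
clf.train(train_data)
print(clf.predict("Take my advice; I don't use it anyway."))  # prints: humorous
```

With six training sentences this model only memorizes vocabulary; the point is the shape of the task — labeled positive and negative one-sentence examples fed to a standard text classifier.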
The non-humorous data is selected such that it is structurally and stylistically similar to the one-liners. Specifically, we use three different negative data sets: (1) Reuters news titles; (2) proverbs; and (3) sentences from the British National Corpus (BNC). The classification results are encouraging, with accuracy figures ranging from 79.15% (One-liners/BNC) to 96.95% (One-liners/Reuters). Regardless of which non-humorous data set plays the role of negative examples, the performance of the automatically learned humor recognizer is always significantly better than a priori known baselines.

The remainder of the paper is organized as follows. We first describe the humorous and non-humorous data sets, and provide details on the Web-based bootstrapping process used to build a very large collection of one-liners. We then show experimental results obtained on these data sets using several heuristics and two different text classifiers. Finally, we conclude with a discussion and directions for future work.

2 Humorous and Non-humorous Data Sets

To test our hypothesis that automatic classification techniques represent a viable approach to humor recognition, we needed in the first place a data set consisting of both humorous (positive) and non-humorous (negative) examples. Such data sets can be used to automatically learn computational models for humor recognition and, at the same time, to evaluate the performance of such models.

2.1 Humorous Data

For reasons outlined earlier, we restrict our attention to one-liners: short humorous sentences that have the characteristic of producing a comic effect in very few words (usually 15 or fewer). The one-liner humor style is illustrated in Table 1, which shows three examples of such one-sentence jokes.

Table 1: Sample examples of one-liners, Reuters titles, BNC sentences, and proverbs.

One-liners
  Take my advice; I don't use it anyway.
  I get enough exercise just pushing my luck.
  Beauty is in the eye of the beer holder.
Reuters titles
  Trocadero expects tripling of revenues.
  Silver fixes at two-month high, but gold lags.
  Oil prices slip as refiners shop for bargains.
BNC sentences
  They were like spirits, and I loved them.
  I wonder if there is some contradiction here.
  The train arrives three minutes early.
Proverbs
  Creativity is more important than knowledge.
  Beauty is in the eye of the beholder.
  I believe no tales from an enemy's tongue.

It is well known that large amounts of training data have the potential of improving the accuracy of the learning process and, at the same time, provide insights into how increasingly larger data sets can affect the classification precision. The manual construction of a very large one-liner data set may however be problematic, since most Web sites or mailing lists that make such jokes available do not usually list more than 50–100 one-liners. To tackle this problem, we implemented a Web-based bootstrapping algorithm able to automatically collect a large number of one-liners, starting from a short seed list consisting of a few manually identified one-liners. The bootstrapping process is illustrated in Figure 1.

[Figure 1: Web-based bootstrapping of one-liners. Seed one-liners feed a Web search; retrieved webpages are kept if they match the thematic constraint (1), and enumerations within them are kept if they match the stylistic constraint (2).]

Starting with the seed set, the algorithm automatically identifies a list of webpages that include at least one of the seed one-liners, via a simple search performed with a Web search engine. Next, the webpages found in this way are HTML-parsed, and additional one-liners are automatically identified and added to the seed set. The process is repeated several times, until enough one-liners are collected.

An important aspect of any bootstrapping algorithm is the set of constraints used to steer the process and to prevent, as much as possible, the addition of noisy entries. Our algorithm uses: (1) a thematic constraint applied to the theme of each webpage; and (2) a structural constraint, exploiting HTML annotations that indicate text of a similar genre.

The first constraint is implemented using a set of keywords, at least one of which has to appear in the URL of a retrieved webpage, thus potentially limiting the content of the webpage to a theme related to that keyword. The set of keywords used in the current implementation consists of six words that explicitly indicate humor-related content: oneliner, one-liner, humor, humour, joke, funny. For example, http://www.berro.com/Jokes or http://www.mutedfaith.com/funny/life.htm are the URLs of two webpages that satisfy this constraint.

The second constraint is designed to exploit the HTML structure of webpages, in an attempt to identify enumerations of texts that include the seed one-liner. This is based on the hypothesis that enumerations typically include texts of similar genre, and thus a list including the seed one-liner is likely to include additional one-line jokes.

2.2 Non-humorous Data

The negative examples were chosen to be similar in structure and composition to the one-liners. We do not want the automatic classifiers to learn to distinguish between humorous and non-humorous examples based simply on text length or obvious vocabulary differences. Instead, we seek to force the classifiers to identify humor-specific features, by supplying them with negative examples similar in most of their aspects to the positive examples, but different in their comic effect.

We tested three different sets of negative examples, with three examples from each data set illustrated in Table 1. All non-humorous examples are required to follow the same length restriction as the one-liners, i.e., one sentence with an average length of 10–15 words.

1. Reuters titles, extracted from news articles published in the Reuters newswire over a period of one year (8/20/1996 – 8/19/1997) (Lewis et al., 2004). The titles consist of short sentences with simple syntax, and are often phrased to catch the reader's attention (an effect similar to the one rendered by one-liners).

2. Proverbs, extracted from an online proverb collection. Proverbs are sayings that transmit, usually in one short sentence, important facts or