Trends & Controversies

Steven Staab, University of Karlsruhe, [email protected]

Computational Humor
Kim Binsted, University of Hawaii

No, this is no April Fool's prank. Computer scientists at labs around the world are conducting serious research into … humor. Although it might seem whimsical, many excellent reasons exist to take a closer look at this fascinating aspect of human cognition and interaction. Humor affects attention and memory, facilitates social interaction, and ameliorates communication problems. If computers are ever going to communicate naturally and effectively with humans, they must be able to use humor. Moreover, humor provides insight into how humans process language: real, complex, creative language, not just a tractable subset of standard sentences. By modeling humor generation and understanding on computers, we can gain a better picture of how the human brain handles not just humor but language and cognition in general.

In popular perception, computers will never be able to use or appreciate humor. Fictional computers and robots are almost always portrayed as humorless, even if they're skilled at natural language, graceful bipedal motion, and other human behaviors that current machines find challenging. Then again, chess too was once thought to be the sole domain of humans, and now computer programs play at the grandmaster level.

These four articles focus on different aspects and applications of humor. Benjamin Bergen and Seana Coulson propose a model of humor comprehension based on frame-shifting within a simulation-based natural-language-understanding system. Anton Nijholt describes how embodied agents can use humor to make human-agent interaction more effective and believable. Oliviero Stock and Carlo Strapparava describe humor's effects on attention and memory and present an application of computational humor to advertising. Finally, Graeme Ritchie, Ruli Manurung, Helen Pain, Annalu Waller, and Dave O'Mara describe an interactive riddle builder called Standup, which helps children with communication-related disabilities use humor to interact with other children. One day, perhaps, thanks to the groundbreaking research these articles describe, computers will be able to devise their own April Fool's pranks.

IEEE Intelligent Systems, March/April 2006, www.computer.org/intelligent. 1541-1672/06/$20.00 © 2006 IEEE. Published by the IEEE Computer Society.

Frame-Shifting Humor in Simulation-Based Language Understanding
Benjamin Bergen, University of Hawaii, Manoa
Seana Coulson, University of California, San Diego

In an effort to focus on tractable problems, computational natural-language-understanding systems have typically addressed language phenomena that are amenable to combinatorial approaches using static and stereotypical semantic representations. Although such approaches are adequate for much of language, they're not easily extended to capture humans' more creative language interpretation capacities. An alternative tack is to begin by modeling less typical, more complex phenomena, with the goal of encompassing standard language as a trivial case.1

Semantic interpretation

Suppose you hear someone saying, "Everyone had so much fun diving from the tree into the swimming pool, we decided to put in a little … " At the point in the sentence where you hear the words "put in," you've already committed to an interpretation of the clause, probably that the owners are installing a diving board. Indeed, psycholinguistic research suggests that human sentence processing is both incremental and predictive, because people integrate perceptual input with linguistic and conceptual information at multiple levels of representation.2 Neuroimaging research with magnetoencephalography suggests that after an initial modality-specific processing stage of approximately 200 milliseconds, speech processing is subserved by a bilateral network of inferior prefrontal and temporal lobe regions that are simultaneously active for hundreds of milliseconds.3 Such data suggest that the processes of lexical access, semantic association, and contextual integration are simultaneous, as represented in cascade models.

Besides its empirical motivation, incremental semantic processing has computational benefits for word recognition and contextual integration. Word recognition in natural speech, for example, is extremely challenging because of extensive variability in the acoustic input. Top-down semantic information greatly facilitates segmentation of the sound stream. Moreover, spoken language is produced quickly (typically at about two to three words per second), which necessitates parallel processing of linguistic cues and their semantic referents. Besides easing word recognition, activating the correct frame of reference greatly facilitates contextual integration. For instance, in the swimming pool example, knowledge of diving and the typical backyard pool lets you more easily recognize and integrate expected items, such as a diving board, into the scenario.

However, one thing that differentiates human sentence processing from many computational systems is humans' capacity to deal with unexpected input, even when it necessitates revising information that they've already assembled. People, for example, are readily able to interpret the sentence "Everyone had so much fun diving from the tree into the swimming pool, we decided to put in a little water." The word "water" prompts you to revise the default assumption that the pool had water in it. Revising this simple assumption has substantial implications for the consequences of diving from the tree into the pool, and for the mindset of those who enjoy such activities. This reanalysis process, known as frame-shifting, highlights the need for dynamic inferencing in language interpretation and has been argued to be a test case for models of meaning construction.1

[Figure 1. Overview of the Embodied Construction Grammar architecture with two core processes, analysis and simulation.2 Diagram labels: Form, Meaning, Phonological schemas, Constructions, Conceptual schemas, Utterance, Communicative context, Analysis, Semantic specification, Simulation, Inferences.]

Natural-language-understanding systems might achieve the flexible, interpretive capacity necessary for frame-shifting by adopting an empirically inspired architecture that is based on dynamic internal imagery. In such models, language interpretation involves creating an internal simulation of events that includes the sensory, motor, and affective dimensions.
Research suggests that the neural systems responsible for performing actions or perceiving percepts are also recruited for linguistically inspired simulations. In other words, like dreaming, recalling, and imagining, language processing uses human perceptual and motor systems as internal models that allow for the construction of subjective experiences in the absence of motor action or perceptual input.4 Because our experiences with pools almost without exception include water, water will automatically be activated in mental simulations that involve pools. Our experience with diving, by contrast, presumably involves some cases of landing on a solid surface, thus enabling us to viscerally imagine diving into a pool with no water.

Simulation-based natural-language-understanding systems

One natural-language-understanding system that employs a simulation-based architecture was developed under the rubric of Embodied Construction Grammar.5 The ECG language-understanding system has two main processes: analysis and simulation (see figure 1). The analysis system parses each input utterance into constituent words and syntactic structures, using a representation of the current communicative context as well as stored pairings of phonological and conceptual knowledge known as constructions. The simulation system runs dynamic simulations of those utterances' content, producing inferences and updating beliefs about the world.

The ECG simulation system uses a representational formalism known as the X-schema. X-schemas are based on stochastic Petri nets and have numerous desirable properties. They are dynamic and probabilistic and allow for both parallel processing and hierarchical structure. Figure 2 shows a simplified representation of the Dive X-schema, that is, the machinery used to virtually perform or mentally simulate diving. Any individual diving event, as with any complex action, will be a particular instantiation of this general X-schema. In the case of diving, instances will differ in the agent who performs the dive, the dive's target, the force exerted, the trajectory, and so on. The language system must supply these parameters of the Dive X-schema, but it can use default values in their absence. The simulation system depends on the analysis system to provide it with information about which X-schemas to run, what parameterizations to assign to them, and how to bind them together.

[Figure 2. A simplified Dive X-schema. Stages shown: Prepare (move into position), Launch (propel using legs), Flight (position body for flight and landing).]

The analysis system uses linguistic input to produce a specification of the mechanisms the simulation system uses. This parameterized interface is known as the semantic specification (see figure 1). The inputs to the analysis system are contextually activated schemas, along with the words and other constructions the utterance activates; constructions map aspects of form to aspects of meaning.
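As a rough illustration of the parameter passing described above, the following Python sketch shows a semantic specification supplying some Dive parameters while defaults fill in the rest. Everything named here (SemanticSpecification, DIVE_DEFAULTS, parameterize, simulate, and the default values themselves) is hypothetical and not part of ECG. The fixed stage list is a simplification of figure 2; actual X-schemas are stochastic Petri nets.

```python
from dataclasses import dataclass, field

# Hypothetical defaults. The article names the parameters (agent, target,
# force, trajectory) but gives no concrete values for them.
DIVE_DEFAULTS = {
    "agent": "someone",
    "target": "pool filled with water",
    "force": "moderate",
    "trajectory": "arc",
}

# Stage names follow the labels in figure 2. A real X-schema is a
# stochastic Petri net, not a fixed linear sequence like this one.
DIVE_STAGES = ("prepare", "launch", "flight")


@dataclass
class SemanticSpecification:
    """What analysis hands to simulation: a schema name plus bindings."""
    schema: str
    bindings: dict = field(default_factory=dict)


def parameterize(spec, defaults):
    """Start from the defaults, then apply what the language supplied."""
    params = dict(defaults)
    params.update(spec.bindings)
    return params


def simulate(spec):
    """Walk the stages in order and return a readable trace."""
    params = parameterize(spec, DIVE_DEFAULTS)
    return [
        f"{stage}: {params['agent']} -> {params['target']}"
        for stage in DIVE_STAGES
    ]


# "Everyone ... diving ... into the swimming pool": the agent is supplied
# by the utterance, while the state of the pool falls back to the default.
spec = SemanticSpecification("Dive", {"agent": "everyone"})
params = parameterize(spec, DIVE_DEFAULTS)
trace = simulate(spec)
```

In this toy version, frame-shifting would amount to rerunning simulate with a binding that overrides the default target, for example a pool without water.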

As we noted earlier, the output of analysis is the semantic specification. For example, the word "diving" maps from a sound or character sequence to the X-schema representation for diving (see figure 2), just as the word "tree" will map its sound or spelling to a parameterization of how to simulate a tree. Larger grammatical constructions indicate how these schematic, parameterized representations of simulatable meaning are bound together. Consequently, an English speaker knows that the subject of the verb "dive" will be the agent performing the dive, and that the subsequent prepositional phrases describe the locus of its starting and ending points.

The processes in the ECG system are interdependent: the simulation system depends on input from the analysis system, and defective simulation can force a novel analysis. Moreover, the processes proceed incrementally. Critically, the simulation system engages as early as possible and need not wait for the analysis process to terminate. Indeed, as we argued earlier, simulation must begin as soon as a coherent chunk of a semantic specification has been assembled.

Computational humor processing

Humans naturally and frequently produce humorous language. So, to interact as a human-like conversational agent or to serve as a scientific model of human language understanding, a natural-language-understanding system must be able to deal appropriately with linguistic humor. Early simulation is a key feature that facilitates humor processing in computational natural-language-understanding systems. Here's an example of how the ECG system deals with an utterance such as "Everyone had so much fun diving from the tree into the swimming pool, we decided to put in a little water."

A processing example

First, assuming for simplification that the input is in text form rather than spoken, analysis begins from left to right (starting with the first word) and hypothesizes lexical and larger linguistic constructions that could account for the input data, using a chart-parsing algorithm.6 As the system reads in each new word, the analysis process maintains numerous competing candidate sets of lexical and grammatical constructions. Each coherent candidate set includes not only these constructions, but also bindings among them. For instance, the example sentence activates lexical constructions representing words ("everyone," "had," "so," and so on) as well as larger grammatical constructions such as noun phrases ("the swimming pool") and prepositional phrases ("from the tree").

The semantic side of these assembled networks of constructions contains precisely the parameterized schematic representations the simulation device needs. Once a single analysis surpasses a defined certainty threshold, the analysis process reads its semantic specification into the simulation engine. So, after the first clause, the analysis process passes the single best analysis into simulation. The simulator runs an internal simulation of this experience, activating the appropriate X-schema with the parameterizations determined through analysis, such as who was performing the action, with what trajectory, and so on. The system fills in those aspects of simulation not specified from the language with default or contextually inferable values, so it represents the pool as filled with water. The simulation generates appropriate affective and encyclopedic inferences so it can update its beliefs about the world: people in the scene were using the swimming pool in its normal function, that is, diving in, getting wet, and enjoying it.

As processing continues, however, the system is confronted with a semantic incoherency. Namely, when it encounters the word "water" at the end of the sentence and simulates the second clause's content, the result is inconsistent with the previous simulation. The highest-level construction in the sentence, the "X is so Y that Z" construction, indicates that the events the two clauses describe (the enjoyment of the diving and the putting in of water) are related. However, unless heavy contextual biasing exists (for instance, if the system previously knows the pool to be full of jello), the simulation system can't coherently maintain the simulation it constructed for the first clause as a predecessor of the simulation for the second clause; without water in the pool, people can't have been diving normally into it. Because the simulation system can't establish a causal or temporal relationship between the two partial simulations it performed for the two clauses, it fails and calls on the analysis system to provide a new semantic specification. The success of a semantic specification that calls for a different simulation in the first clause (namely, a slightly different X-schema for the first event) results in a coherent relationship between the two clauses' simulations.

Our system's strengths

Although this example is a mere sketch of the processing that occurs in a simulation-based natural-language-understanding system, it illustrates several important features of this approach. First, using detailed semantic interpretation in the form of simulation is vital to correctly identifying semantic incoherencies, because nothing about the humorous utterance's linguistic form itself (the words it uses or their formal combination) is conceptually discrepant in any way. Second, the early (utterance-internal) commitment to semantic interpretation predicts differences in the processing of humorous utterances that require frame-shifting from those that are merely ambiguous. For instance, in processing the sentence "That book sounds so great I'm going to dive right into it," the word "dive" has a different sense than when it describes physical diving. However, in this case, prior linguistic context (and simulation) biases the processor toward the correct interpretation. Third, a simulation-based model uses real-world knowledge to capture subtle semantic differences in the meanings of particular words that arise in the simulation and not in the semantic specification. For example, only the simulation can distinguish between the diving you would do into a pool and on a football field, or between diving into a pool with versus without water. Finally, the process of misunderstanding and correction (committing to a simulation only to be confronted with an incoherency that needs correcting) drives the humorous effect the utterance has on humans. A simulation-based architecture exhibits similar behavior.

Conclusions

As natural-language-understanding systems approach human-like conversational competence, a main challenge is dealing with language that requires deep conceptual knowledge about the details of human experience, as humorous language often does. Incorporating simulation devices into such applications can begin to solve the problem of how to use knowledge of the world to improve human-machine communication. Computational tools developed to deal with the complexities of linguistic humor may also be applicable to other challenging problems in natural language interpretation.

References

1. S. Coulson, Semantic Leaps: Frame-Shifting and Conceptual Blending in Meaning Construction, Cambridge Univ. Press, 2001.
2. Y. Kamide, G.T.M. Altmann, and S.L. Haywood, "The Timecourse of Prediction in Incremental Sentence Processing: Evidence from Anticipatory Eye Movements," J. Memory and Language, vol. 49, no. 1, 2003, pp. 133–156.
3. K. Marinkovic, "Spatiotemporal Dynamics of Word Processing in the Human Cortex," Neuroscientist, vol. 10, no. 2, 2004, pp. 142–152.
4. L.W. Barsalou, "Perceptual Symbol Systems," Behavioral and Brain Sciences, vol. 22, no. 4, 1999, pp. 577–609.
5. B. Bergen and N. Chang, "Embodied Construction Grammar in Simulation-Based Language Understanding," Construction Grammars: Cognitive Grounding and Theoretical Extensions, J. Ostman and M. Fried, eds., John Benjamins, 2005, pp. 147–190.
6. J. Bryant, "Constructional Analysis," master's thesis, Computer Science Dept., Univ. of California, Berkeley, 2004.

Embodied Conversational Agents: "A Little Humor Too"
Anton Nijholt, University of Twente

Social and intelligent agents have become a leading paradigm for describing and solving problems in human-like ways. In situations where it's useful to design direct communication between agents and their human partners, the display of social and rational intelligence in an embodied human-like agent (that is, an agent visualized onscreen as an animated character) allows natural interaction between the human and the agent that represents the system the human is communicating with. Research in intelligent agents includes reasoning about beliefs, desires, and intentions. Apart from contextual constraints that guide the agent's reasoning and behavior, other behavioral constraints exist that follow from models that describe emotions. These models assume that emotions emerge based on appraisals of events taking place in the environment and how these events affect goals that the agents are pursuing. In current research, it's also not unusual to incorporate personality models in agents to adapt this appraisal process as well as reasoning, behavior, and display of emotions to personality characteristics. So, we can model lots of useful and human-like properties in artificial agents, but, in Roddy Cowie's words, "… If they are going to show emotion, we surely hope that they would show a little humor too."1

What are the benefits of using humor? Humor helps to regulate a conversation and can help to establish common ground between conversational partners. It makes conversation enjoyable and supports interpersonal attraction.2 According to Deborah Tannen, humor makes your presence felt.3 Many researchers have also mentioned its benefits in teaching and learning and have made this role explicit in experiments. Humor contributes to motivation, attention, comprehension and retention of information, and the development of affective feelings toward content. It helps create a more pleasurable learning experience, fosters creative thinking, reduces anxiety, and so on.

Embodied agents using humor

Embodied conversational agents (ECAs) have been introduced to play, among others, the role of conversational partner for the computer user. Rather than addressing the machine, the user addresses virtual agents that have particular capabilities and are responsible for certain tasks. The user might interact with ECAs to engage in an information service dialogue or a transaction, to solve a problem cooperatively, to perform a task, or to engage in a virtual meeting. In these kinds of situations, humans use humor to ease communication problems. In a similar way, humor can help solve communication problems that arise within human-ECA interaction.

Researchers working on the Computers Are Social Actors (CASA) paradigm4 have convincingly demonstrated that people interact with computers as if they were social actors. Depending on the way we can program a computer to interact, people might find it polite, dominant, extroverted, or introverted, or attribute to it any other attitude or personality trait we can think of. Moreover, people react to these attitudes and traits as if a human being were displaying them. From the CASA experiments, we can extrapolate that humor, because of its role in human-human interaction, can play an important role in human-computer interaction. Experiments examining humor's effects in task-oriented computer-mediated communication and in human-computer interaction have confirmed this.

More and more, we see ECAs employed in 2D or 3D virtual-reality educational and entertainment environments, e-commerce applications, and training and simulation environments.5 Research projects suggest that in the near future, we might expect that in addition to being domain and environment experts, ECAs will act as personal assistants, coaches, and buddies. They will accompany their human partners, migrating from displays on handheld devices to displays embedded in ambient-intelligence environments. Natural interaction with these ECAs will require them to display rational and social intelligence and, indeed, also a little humor when appropriate and enjoyable.

Computational humor

Well-known philosophers and psychologists have contributed their viewpoints to the theory of humor. Sigmund Freud saw humor as a release of tension and psychic
energy, while Thomas Hobbes saw it as a means to emphasize superiority in human competition. In the writings of Immanuel Kant, Arthur Schopenhauer, and Henri Bergson, we can see the first attempts to characterize humor as dealing with incongruity, that is, recognizing and resolving incongruity. Researchers including Arthur Koestler, Marvin Minsky, and Alan Paulos have tried to clarify these notions, and Victor Raskin and Graeme Ritchie have tried to formally describe them.

As you might expect, researchers have taken only modest steps toward a formal theory of humor understanding. General humor understanding and the closely related area of natural-language understanding require an understanding of rational and social intelligence, so we won't be able to solve these problems until we've solved all AI problems. It might nevertheless be beneficial to look at the development of humor theory and possible applications that don't require a general theory of humor; this might be the only way to bring the field forward. That is, we expect progress in application areas (particularly in games and other forms of entertainment) that require natural interaction between agents and their human partners, rather than from investigations by a few researchers into full-fledged theories of computational humor.

Incongruity-resolution theory provides some guidelines for computational humor applications. I won't look at the many variants that have been introduced or at details of one particular approach. Generally, I follow Graeme Ritchie's theory;6 however, since I prefer to look at humorous remarks that are part of the natural interaction between an ECA and its human conversational partner, my starting point isn't joke telling or pun making. Rather, I assume a small piece of discourse consisting of two parts. You read or hear and interpret the first part, but as you read or hear the second part, it turns out that a misunderstanding has occurred that requires a new, probably less obvious interpretation of the text. So, we have an obvious interpretation, a conflict, and a second, compatible interpretation that resolves the conflict. Although misunderstandings can be humorous, this isn't necessarily the case. Deliberate misunderstanding sometimes occurs to create a humorous remark, and it's also possible to construct a piece of discourse so that it deliberately leads to a humorous misunderstanding. In both cases, we need additional criteria to decide whether the misunderstanding is humorous. Criteria that humor researchers have mentioned deal with a marked contrast between the obvious interpretation and the forced reinterpretation, and with the reinterpretation's common-sense inappropriateness. As an example, consider the following dialogue in a clothing store:

Lady: "May I try on that dress in the window?"
Clerk: (doubtfully) "Don't you think it would be better to use the dressing room?"

The first utterance has an obvious interpretation. The clerk's remark is confusing at first, but looking again at the lady's utterance makes it clear that a second interpretation (requiring a different prepositional attachment) is possible. This interpretation is certainly different, and, most of all, it describes a situation that you might consider inappropriate.

What can we formalize here, and what formalisms are already available? AI researchers have introduced scripts and frames to represent meanings of text fragments. In early humor theory as it relates to AI, these knowledge representation formalisms were used to intuitively discuss an obvious and a less obvious (or hidden) meaning of a text. A misunderstanding allows at least two frame or script descriptions of the same piece of text; the two scripts involved overlap. To make it clear that the nonobvious interpretation is humorous, at least some contrast or opposition between the two interpretations should exist. Script overlap and script opposition are reasonably well-understood issues, but until now, although often described more generally, the attempts to formalize this opposition mainly look at word-level oppositions (for example, antonyms such as hot versus cold). Inappropriateness hasn't been formalized at all.

Conversational humor: Constructing humorous acts

People smile and laugh when someone uses humor. It's not necessarily because someone pursues the goal of being funny or of telling a joke, but because the conversational partners recognize the possibility of making a funny remark (deliberately, spontaneously, or something in between), taking into account social display rules. It's possible to look at some relatively simple situations that let us make humorous remarks. These situations fit in the explanations we gave earlier, and they make it possible to zoom in on the main problems of humor understanding: rules to resolve incongruity and criteria that help determine whether a solution is humorous.

Here I talk about surprise disambiguations. We can have ambiguities at pragmatic, semantic, and syntactic levels of discourse (text, paragraphs, and sentences). At the sentence level, we can have ambiguities in phrases (for example, prepositional phrase attachment), words, anaphora, and, in the case of spoken text or dialogue, intonation. As we interpret text that we read or hear, misunderstandings will become clear and be resolved, maybe with help from our conversational partner. Earlier, I gave an example of ambiguity that occurred because a prepositional phrase could be attached to a syntactic construct (a verb, a noun phrase) in more than one way. My main line of research, however, deals with ambiguities in anaphora resolution. For example, consider the following dialogue:

Adviser: "Our lawyers put your money in little bags, then we have trained dogs bury them around town."
Dilbert: "Do they bury the bags or the lawyers?"

Here, "them" is an anaphor referring to a previous noun phrase. You can find antecedents among the noun phrases in the first sentence.

From a research viewpoint, the advantage of looking at such a simple, straightforward humorous remark is that we can confine ourselves to just one sentence. So, rather than having to look at scripts, frames, and other discourse representations, we can concentrate on the syntactic and semantic analysis of just one sentence.
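To make the ambiguity in the Dilbert exchange concrete, here is a minimal Python sketch of how candidate antecedents for "them" might be enumerated within the adviser's single sentence. The noun phrases and their features are annotated by hand for illustration; no parser or published anaphora-resolution algorithm is used, and the names NounPhrase and candidate_antecedents are hypothetical. The only filters assumed are number agreement and the exclusion of the subject of the anaphor's own clause.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NounPhrase:
    text: str
    plural: bool
    # True when the phrase is the subject of the clause containing the
    # anaphor ("trained dogs bury them"), so it cannot be what "them" means.
    subject_of_anaphor_clause: bool = False


# Hand-annotated noun phrases that precede "them" in the adviser's sentence.
NOUN_PHRASES = [
    NounPhrase("our lawyers", plural=True),
    NounPhrase("your money", plural=False),
    NounPhrase("little bags", plural=True),
    NounPhrase("trained dogs", plural=True, subject_of_anaphor_clause=True),
]


def candidate_antecedents(noun_phrases, anaphor_is_plural=True):
    """Keep phrases that agree in number and are not the local subject."""
    return [
        phrase.text
        for phrase in noun_phrases
        if phrase.plural == anaphor_is_plural
        and not phrase.subject_of_anaphor_clause
    ]


candidates = candidate_antecedents(NOUN_PHRASES)
print(candidates)  # ['our lawyers', 'little bags']
```

Two candidates survive the filters, which is exactly the choice Dilbert's question plays on.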

For this analysis, I use well-known algorithms that transform sentences into feature structure representations and that recast issues such as script overlap and script opposition as properties of feature sets. Moreover, many algorithms for anaphora resolution are publicly available. Erroneous anaphora resolution with the aim of creating a humorous remark can make use of properties of possible anaphora antecedents.

Contrast and inappropriateness are global terms from (not yet formalized) humor theory. In my approach, determining contrast translates into a heuristic that considers a potentially humorous antecedent and decides to use it because it has many properties in common with the correct antecedent. However, at least one salient property distinguishes the two potential antecedents (a shop window versus a dressing room, a bicycle versus a car, a bag versus a lawyer). My approach checks for inappropriateness by looking at constraints associated with the verb's thematic roles in the sentence. For example, these constraints distinguish between animate and inanimate; hence, burying lawyers who are alive is inappropriate. Obviously, you can say more about this (lawyers are easier targets for jokes than pharmacists, for instance), but such observations are beyond this paper's scope.

Experiments and implementation

My research group is creating a chat bot that implements my approach to humorous anaphora resolution. One reason we chose a chat bot is that its main task is to get a conversation going. Hence, it can afford to miss opportunities to make humorous remarks, and when an intended humorous remark turns out to be misplaced, this isn't necessarily a problem. Implemented algorithms for anaphora resolution are available. We chose a Java implementation (JavaRAP) of the well-known Resolution of Anaphora Procedure by Lappin and Leass.7 We obtained a more efficient implementation by replacing the embedded natural language parser with a parser from Stanford University. Currently, we and other researchers are designing experiments to find ways to deal with anaphora-resolution algorithms' low success rate and to consider the introduction of a reliability measure before proceeding with possible antecedents of an anaphor. Other issues we're investigating are the different frequencies and types of anaphora in written and spoken text. In particular, from the anaphora viewpoint, we've been looking at properties of conversations with well-known chat bots such as ALICE, Toni, Eugine, and Jabberwacky. Resources that we've investigated are publicly available knowledge bases such as WordNet, WordNet Domains, FrameNet, VerbNet, and ConceptNet. For example, in VerbNet, every sense of a verb is mapped to a verb class representing the conceptual meaning of this sense. Every class contains both information on the thematic roles associated with the verb class and frame descriptions describing how the verb can be used. This lets you check whether a possible humorous antecedent of an anaphor sufficiently opposes the correct antecedent because of constraints that it violates. Unfortunately, because VerbNet only contains about 4,500 verbs, many sentences can't be analyzed.

Conclusions and future research

As I mentioned earlier, one response to the fundamental problems in humor research is to wait until the main problems in AI have been solved and then apply the results to humor understanding. It's more fruitful, however, to investigate humor itself and see whether solutions that are far from complete and perfect can nevertheless find useful applications. In games and entertainment computing, natural interaction with ECAs requires humor modeling. Although many forms of humor don't fit into my framework of humorous misunderstandings, I think it's a useful approach.

Current humor research has many shortcomings, which are also present in my approach. In particular, the conditions I've mentioned (such as contrast and inappropriateness) might be necessary, but they're far from sufficient. Further pinpointing of humor criteria is necessary. My approach lets me apply well-known theories from computational linguistics rather than obliging me to continue with archaic approaches and concepts. Moreover, my approach makes the issue of humor versus nonhumor much more visible than in older approaches.

References

1. R. Cowie, "Describing the Emotional States Expressed in Speech," Proc. Int'l Speech Communication Assoc. Workshop on Speech and Emotion, Queens Univ., 2000, pp. 11–18; www.qub.ac.uk/en/isca/proceedings/pdfs/cowie.pdf.
2. A. Cann, L.G. Calhoun, and J.S. Banks, "On the Role of Humor Appreciation in Interpersonal Attraction: It's No Joking Matter," Humor: Int'l J. Humor Research, vol. 10, no. 1, 1997, pp. 77–89.
3. D. Tannen, Conversational Style: Analyzing Talk among Friends, Ablex Publishing, 1984.
4. B. Reeves and C. Nass, The Media Equation, Cambridge Univ. Press, 1996.
5. H. Prendinger and M. Ishizuka, eds., Life-Like Characters: Tools, Affective Functions, and Applications, Springer, 2004.
6. G. Ritchie, "Developing the Incongruity-Resolution Theory," Proc. AISB Symp. Artificial Intelligence and Creative Language: Stories and Humor, Univ. of Sussex, 1999, pp. 69–75.
7. S. Lappin and H.J. Leass, "An Algorithm for Pronominal Anaphora Resolution," Computational Linguistics, vol. 20, no. 4, 1994, pp. 535–561.

Automatic Production of Humorous Expressions for Catching the Attention and Remembering
Oliviero Stock and Carlo Strapparava, ITC-irst

Humor is essential to communication. It relates directly to themes such as entertainment, fun, emotions, aesthetic pleasure, motivation, attention, and engagement, which many people in the field of intelligent user interfaces believe are fundamental for future computer-based systems. Fur-

MARCH/APRIL 2006 www.computer.org/intelligent 27 thermore, humans probably can’t survive and in providing motivational support. particular, they showed that classification without humor. Computational humor has techniques are a viable approach for distin- the potential to turn computers into extraor- Background guishing between humorous and nonhumor- dinarily creative and motivational tools. So, Although researchers have been study- ous text, through experiments performed on human-computer interaction must evolve ing humor for a long time, including recent very large data sets. beyond usability and productivity. Even approaches in linguistics1 and psychology, though humor is complex to reproduce, it’s to date they have only made limited contri- Applied scenarios: Humorous realistic to model some types of humor butions toward the construction of compu- advertisements and headlines production and to aim at implementing this tational humor prototypes. Almost all the Humor is an important way to communi- capability in computational systems. approaches try to deal with incongruity the- cate new ideas and change perspectives. On ory at various levels of refinement. Incon- the cognitive side, humor has two important Humor, emotions, beliefs, and gruity theory focuses on the element of sur- properties: creativity prise, stating that a conflict between what Humor is a powerful generator of emo- you expect and what actually occurs creates • It helps get and keep people’s attention. tions. As such, it affects people’s psycho- humor. This accounts for much of humor’s The type and rhythm of humor can vary, logical states, directs their attention, influ- most obvious features: ambiguity or double and the time involved in building the ences memorization and decision making, meaning. effect can be different in different cases. and creates desires and emotions. 
Actually, Kim Binsted and Graeme Ritchie made Sometimes the context—such as joke emotions are an extraordinary instrument one of the first attempts at automatic humor telling—leads you to expect a humorous for motivation and persuasion because those production.2 They devised a formal model of climax, which might only occur after a who are capable of transmitting and evoking long while. Other times the effect occurs them have the power to influence other peo- in almost no time, with one perceptive ple’s opinions and behavior. Humor, there- act—for instance, in static visual humor, fore, allows for conscious and constructive Although researchers have been funny posters, or in cases when some use of the affective states it generates. Affec- well-established convention is reversed tive induction through verbal language is studying humor for a long time, with an utterance. particularly interesting, and humor is one of • It helps memory. For instance, we com- the most effective ways of achieving it. Pur- to date they have only made monly connect knowledge we’ve acquired poseful use of humorous techniques enables to a humorous remark or event in our us to induce positive emotions and mood limited contributions toward the memories. In learning a foreign lan- and to exploit their cognitive and behavioral guage, an involuntary funny situation effects. might occur because of so-called “false Humor acts not only on emotions, but construction of computational friends”—words that sound similar in also on human beliefs. A joke plays on the two languages and might have the same hearer’s beliefs and expectations. By infring- humor prototypes. origin but have very different meanings. ing on them, it causes surprise and then The humorous experience is conducive hilarity. Jesting with beliefs and opinions, to remembering the word’s correct use. humor induces irony and helps people not the semantic and syntactic regularities under- to take themselves too seriously. 
Some- lying some types of puns. They then imple- No wonder humor has become a favorite times simple wit can sweep away a nega- mented the model in a system called JAPE strategy for creative people in the advertis- tive outlook that limits people’s desires and (Joke Analysis and Production Engine) that ing business. In fact, among fields that use abilities. Wit can help people overcome could automatically generate amusing puns. creative language, advertising is probably self-concern and pessimism that can pre- The goal of our HAHAcronym project3 the area in which creativity (and hence vent them from pursuing more ambitious was to develop a system that could automat- humor) has the most precise objectives. goals and objectives. ically generate humorous versions of exist- From an applied AI viewpoint, we believe Humor encourages creativity as well. ing acronyms or produce new, amusing that an environment for proposing solutions The change in perspective that humorous acronyms constrained to be valid vocabu- to advertising professionals can be a realistic situations create induces new ways of inter- lary words, starting with user-provided practical application of computational preting an event. By making fun of clichés concepts. The system achieved humorous humor and a useful first attempt at dealing and stressing their inconsistency, people effects mainly on the basis of incongruity. with creative language. become more open to new ideas and view- For example, it turned IJCAI—International Some examples of the huge array of points. Creativity redraws the space of pos- Joint Conference on — opportunities that language offers and that sibilities and delivers unexpected solutions into Irrational Joint Conference on Antenup- existent natural-language-processing tech- to problems. Actually, creative stimuli con- tial Intemperance. 
niques can cope with include rhymes, stitute one of the most effective impulses Humor recognition has received less atten- wordplay, popular sayings or proverbs, quo- for human activity. Machines equipped with tion. Rada Mihalcea and Carlo Strapparava tations, alliteration, triplets, chiasmus, neol- humorous capabilities will be able to play an investigated the application of text-catego- ogism, non sequitur, adaptation of existing active role in inducing emotions and beliefs rization techniques to humor recognition.4 In expressions, adaptation of proverbs, person-

28 www.computer.org/intelligent IEEE INTELLIGENT SYSTEMS ification, and synaesthesia (two or more apparently neutral ones can evoke pleasant Using the mechanism to produce senses combined). or painful experiences. Although some humor The humorous variation of newspaper words have emotional meanings with In most AI fields, researchers have under- headlines is an important research direc- respect to individual stories, for many oth- stood the difficulties of reasoning on deep tion, given the particular kinds of linguistic ers, the affective power is part of the collec- world knowledge for a while. A clear prob- phenomena they display. Indeed, it’s possi- tive imagination (for example, “mom,” lem exists in scaling up from experiments ble to exploit their elliptic nature, specific “ghost,” and “war”). to meaningful large-scale applications. syntax, and deliberate use of rhetorical So, measuring a generic term’s affective This is even more obvious in areas such as devices to create funny variations, leverag- power is important. To this end, we studied humor, where good quality requires a sub- ing lexical and syntactic ambiguity (for the use of words in textual productions, and tle understanding of situations and is nor- example, “Marijuana issues sent to joint in particular their co-occurrences with the mally the privilege of talented individuals. committee”5). words having explicit affective meanings. In this sense, we can refer to computational We must distinguish between words directly humor as an AI-complete problem. Our Resources and processing referring to emotional states (such as “fear” goal is to produce general mechanisms for Our work aims at applicability in an unre- and “cheerful”) and those that indirectly humorous revisitation of verbal stricted domain (the application is nonethe- refer to emotion depending on context (for expressions, to work in unrestricted less limited in scope; no general mechanism example, words that indicate possible emo- domains. 
encompasses all kinds of humor). Our tional causes such as “monster” or emo- Our approach can be synthesized in the approach relies on standard, non-humor- tional responses such as “cry”). We call the following elements. oriented resources (with some extensions former direct affective words and the latter and modifications) and the development of • We tend to start from existing material— specific algorithms that implement and elab- for instance, well-known expressions or orate suitable linguistic humor theories. newspaper headlines—and produce some All words can potentially convey optimal innovation characterized by Resources irony or another humorous connotation. For example, in developing HAHA- affective meaning; even • The variation is based on reasoning on cronym, we used specialized thesauri, repos- various resources, such as those we itories, and in particular WordNet Domains, apparently neutral ones can evoke described earlier. No deep representation an extension developed at ITC-irst of the of meaning and pragmatics exists at the well-known English WordNet. WordNet pleasant or painful experiences. complex-expression level. However, there Domains annotates synsets (that is, synonym is more detailed representation at the indi- sets) with subject field codes (or domain vidual-component level, permitting some labels)—“Medicine,” “Architecture,” and so Many words’ affective power is type of reasoning (an example is the affec- on. For HAHAcronym, we modeled an inde- tive lexical resources indicated earlier pendent structure of domain opposition, such part of the collective imagination. where lexical reasoning is strongly as “Religion” vs. “Technology,” “Sex” vs. involved). For more complex expressions, “Religion,” and so on as a basic resource for we adopt more opaque but dynamic and the incongruity generator. indirect affective words. 
robust mechanisms based on learning and In our current work, we use a pronuncia- We developed an affective similarity similarity, such as the one we mentioned tion dictionary, lexical knowledge bases mechanism6 that consists of earlier. The potential is particularly strong including WordNet and WordNet Domains, in dealing with new material, such as WordNet-Affect (an affective lexical • the organization of direct affective words reacting to novel expressions in a dialogue resource based on an extension of Word- and synsets inside WordNet-Affect and or, as we are working on now, making fun Net), a grammar for acronyms, a repository • a selection function (called affective of fresh headlines. of commonsensical sentences, a database weight) based on a semantic similarity • The system architecture accommodates of proverbs and clichés, and idioms. Algo- mechanism. This similarity mechanism numerous resources and revision mecha- rithms for making use of these resources is automatically acquired in an unsuper- nisms. The coordination of the various include traditional components such as vised way from a large corpus of texts modules is based on a general optimal parsers, morphological analyzers, and spe- (hundreds of millions of words) to indi- innovation principle and numerous spe- cific reasoners. viduate the indirect affective lexicon. cific humor heuristic mechanisms inspired by the General Theory of Verbal Humor.1 Creating the affective similarity For example, if we input the verb “shoot” mechanism with negative valence, the system individu- We’re working on postprocessing based An important component to include in ates the emotional category of horror. 
Then on mechanisms for automatic humor evalu- such algorithms, especially for the kind of it extracts the target-noun “gun” and the ation; this is an evolution of humor recog- expressions we’re dealing with, is the affec- causative evaluative adjective “frightening” nition work we mentioned in the “Back- tive similarity mechanism. All words can and finally generates the noun phrase ground” section. Right now, however, our potentially convey affective meaning; even “frightening gun.” system produces results in the form of a set

of proposed expressions. The choice among them is left to the human user; few things are worse than poor humor, so it’s better not to risk it.

References

1. S. Attardo, Linguistic Theories of Humor, Mouton de Gruyter, 1994.
2. K. Binsted and G. Ritchie, “Computational Rules for Punning Riddles,” Humor: Int’l J. Humor Research, vol. 10, no. 1, 1997, pp. 25–76.
3. O. Stock and C. Strapparava, “Getting Serious about the Development of Computational Humor,” Proc. 18th Int’l Joint Conf. Artificial Intelligence (IJCAI 03), Morgan Kaufmann, 2003, pp. 59–64.
4. R. Mihalcea and C. Strapparava, “Laughter Abounds in the Mouths of Computers: Investigations in Automatic Humor Recognition,” Proc. Intelligent Technologies for Interactive Entertainment (INTETAIN 05), LNCS 3814, Springer, 2005, pp. 84–93.
5. C. Bucaria, “Lexical and Syntactic Ambiguity as a Source of Humor,” Humor: Int’l J. Humor Research, vol. 17, no. 3, 2004, pp. 279–309.
6. A. Valitutti, C. Strapparava, and O. Stock, “Lexical Resources and Semantic Similarity for Affective Evaluative Expressions Generation,” Proc. 1st Int’l Conf. Affective Computing and Intelligent Interaction (ACII 05), LNCS 3784, Springer, 2005, pp. 474–481.

The STANDUP Interactive Riddle-Builder

Graeme Ritchie, University of Aberdeen
Ruli Manurung and Helen Pain, University of Edinburgh
Annalu Waller and Dave O’Mara, University of Dundee

As children grow up, their language and communication skills develop as a result of their experience in the world and their interactions with other language users. In many cultures, an important type of interaction is language play with the child’s peer group: word games and joke-telling. A child with a disability such as cerebral palsy might converse using a voice output communication aid (a speech synthesizer attached to a physical input device). This cumbersome way of talking tends to isolate a child from the repartee, banter, and joke-telling typical of the playground. This lack of practice can inhibit the development of language skills, leading to a lack of conversational fluency or even an undermining of social skills. Our STANDUP (System To Augment Non-speakers’ Dialogue Using Puns) project aims to take a small step toward alleviating this problem by providing, in software, a language playground for children with disabilities. The program provides an interactive user interface, specially designed for children with limited motor skills, through which a child can create simple jokes (riddles based on puns) by selecting words or topics. Here are two typical jokes that the system produces:

• What kind of berry is a stream? A current currant.
• How is an unmannered visitor different from a beneficial respite? One is a rude guest, the other is a good rest.

The system isn’t just an online jokebook. It builds new jokes on the spot using 10 simple patterns for the essential shapes of punning riddles and a lexical database of about 130,000 words and phrases. We hope children will enjoy using the software to experiment with sounds and meanings to the benefit of their linguistic skills.

Background

Although various researchers have attempted since 1992 to get computers to produce novel jokes,1 STANDUP’s main predecessor is the JAPE system.2 That program could churn out hundreds of punning riddles, some of which children judged to be of reasonable quality. However, it was only a rough research prototype; it took a long time to produce results and had no real user interface, and the ordinary user couldn’t control it. We’ve used JAPE’s central ideas to create a fully engineered, large-scale, interactive riddle generator with a user interface specially aimed at children with disabilities.

Designing with users

As computational humor is still in its infancy, we had no precedent for real-world use of a system such as STANDUP and certainly no experience of providing a joke generator for children with disabilities. We therefore devoted a substantial portion of the project to user-centered design. We consulted potential users and associated experts (teachers and speech and language therapists) about how the system, particularly the user interface, should operate.3,4

In the early stages, we used nonsoftware mock-ups. We showed users laminated sheets representing screen configurations and asked them to step through tasks by pointing to the buttons in the pictures. The experimenter responded by replacing each sheet with the appropriate next screen shot. We adopted this low-tech approach to emphasize to the participants that the system hadn’t yet been built and that suggestions or criticisms at this stage could genuinely influence the software’s eventual design. Past experience had shown that software mock-ups, particularly if very slick, give the impression that a working program is already available. This discourages participants from asking for changes and can even distract them into asking how they can get hold of this apparently working program.

Based on our studies’ results, we built a software mock-up of the user interface (with a dummy joke generator) and tested it for usability with suitable users. Teachers and therapists again gave their advice.

In parallel with this, we designed and implemented the joke-generating back end. This incorporated a wide variety of facilities for manipulating words and joke structures, but our studies with users and experts suggested that this particular user group needed only a subset of these.

How the system works

You can view the STANDUP program as having two relatively separate major parts: the front end, which embodies the user interface and controls any user options, and the back end, which manages the lexical database and generates the jokes.

The user interface displays three main areas: the general navigation bar, the joke-selection menu, and the progress chart (see figure 3). The navigation bar is a standard set of buttons for going back, forward, exiting, and so on. The joke-selection menu consists of large labeled buttons through which the user controls the joke generator using a standard mouse, a touch-screen, or a single-switch scanning interface (routinely adopted for those with limited motor skills). The progress chart shows where the user is in creating or finding a joke, using the metaphor of a bus journey along a simple road network, with stops such as “Word Shop” and “Joke Factory.”

Figure 3. The STANDUP user interface.

The back end (see figure 4) contains several components: a set of schemas, a set of description rules, a set of text templates, and a dictionary. The schemas define the linguistic patterns underlying punning riddles. For example, a schema might represent information such as:

    Find items X, Y in the dictionary such that X is a noun or adjective, Y is a noun, and X and Y sound the same.

This describes the central relationships in an example like the current/currant joke shown earlier.

Figure 4. The structure of the STANDUP back end. (Diagram labels: User constraints; Dictionary; Schemas, “Choose related items for answer”; Description rules, “Choose appropriate items for question”; Text templates, “Insert items into textual forms.”)

The schema contains information that mainly constrains the ingredients of the riddle’s answer, because that’s where the pun occurs in all the types of riddle we’ve used. Once the system finds suitable items to match a schema, it passes these dictionary entries on to the description rules, which flesh out the descriptive phrases needed for the question. For example, starting from “rude” and “guest,” it might try to find a way to build a phrase describing or meaning the same as “rude guest,” such as “unmannered visitor,” or it might select two items that can be “crossed” to produce “rude guest,” such as “boor” and “visitor.”

For the third stage, text templates contain canned strings such as “What do you get when you cross a ... and a ...” or “What do you call a ...” alongside labeled slots into which the template-handling software slots the words and phrases provided by the schemas and the description rules, thus producing the final text.

The user can control this process via the graphical interface by imposing constraints on answer-building (the schema), question-building (the description rules), or both. For example, the user might specify that the joke must contain a particular word, be on a particular topic, or be a particular type of riddle.

Joke creation depends heavily on the dictionary, which contains information from numerous sources in a relational database. We took syntactic categories (such as noun and verb) and semantic relations (synonymy, being a subclass of, and so on) from the public-domain lexicon WordNet,5 which contains approximately 200,000 entries. For information about the sound of words, we used the Unisyn pronunciation dictionary to turn a word or phrase’s ordinary spelling into a standard phonetic notation. For our final user trials, we attached pictures to as many of the words as possible, using proprietary symbol sets that two companies involved in creating software aids for disabled users lent to us. The resulting lexical database of around 130,000 entries contains several tables, each representing one important relationship (such as between a word and its pronunciation or a word and its synonyms). In this way, we were able to implement dictionary searches as SQL database queries.

Although the joke-generation ideas are relatively simple and have already been tested in principle in the JAPE project, designing and implementing a large-scale, efficient, robust, easily usable system involved considerable work. We tried to make our designs as general as possible and to automate as much of the dictionary creation as possible, so that it should be relatively easy to create revised versions of the dictionary (for example, from a new version of WordNet). We also hope to make the lexical database available to other projects.

How will children use it?

We’re about to start our final evaluations of the full system with users. This will involve visiting schools in the surrounding area, both special-needs establishments and mainstream schools. There we shall see how children, both with and without language-impairing disabilities, use the system. Beforehand, we’ll assess certain aspects of each child’s literacy to give us a context for interpreting what we observe. The time available to us (a few months) isn’t sufficient for a long-term study of the software’s effects. However, we’ll carry out some tests at the end of a child’s exploration of the system to see if the sessions have been beneficial in any way.

This project is very much an exploration of possibilities, and we don’t know what we’ll find out. However, we hope that the work will help move computational humor from tentative research to practical applications. In particular, a software language playground like this could well be of wider use in educational settings.

Acknowledgments

Grants GR/S15402/01 and GR/R83217/01 from the UK’s Engineering and Physical Sciences Research Council supported this work. We’re grateful to Widgit Software Ltd. and Mayer-Johnson LLC for their help.

References

1. G. Ritchie, The Linguistic Analysis of Jokes, Routledge, 2004.
2. K. Binsted, H. Pain, and G. Ritchie, “Children’s Evaluation of Computer-Generated Punning Riddles,” Pragmatics and Cognition, vol. 5, no. 2, 1997, pp. 309–358.
3. R. Manurung et al., “Facilitating User Feedback in the Design of a Novel Joke Generation System for People with Severe Communication Impairment,” Proc. 11th Int’l Conf. Human-Computer Interaction (HCI 05), CD-ROM, Lawrence Erlbaum, 2005.
4. D. O’Mara et al., “The Role of Assisted Communicators as Domain Experts in Early Software Design,” Proc. 11th Biennial Conf. Int’l Soc. for Augmentative and Alternative Communication, CD-ROM, ISAAC, 2004.
5. C. Fellbaum, WordNet: An Electronic Lexical Database, MIT Press, 1998.

The Authors

Kim Binsted is an assistant professor in Information and Computer Sciences at the University of Hawaii. Contact her at binsted@hawaii.edu.

Benjamin Bergen is an assistant professor in the Linguistics Department at the University of Hawaii, Manoa, where he heads the Language and Cognition Laboratory. Contact him at [email protected].

Seana Coulson is an associate professor in the Cognitive Science Department at the University of California, San Diego, where she heads the Brain and Cognition Laboratory. Contact her at [email protected].edu.

Anton Nijholt is a professor and chair of human-media interaction in the University of Twente’s Department of Computer Science. Contact him at [email protected].nl.

Oliviero Stock is a senior fellow at and former director of ITC-irst (Center for Scientific and Technological Research). Contact him at [email protected].

Carlo Strapparava is a senior researcher at ITC-irst in the Communication and Technologies Division. Contact him at [email protected].

Graeme Ritchie is a senior research fellow in the University of Aberdeen’s Department of Computing Science. Contact him at gritchie@csd.abdn.ac.uk.

Ruli Manurung is a research fellow in the University of Edinburgh’s School of Informatics. Contact him at [email protected].uk.

Helen Pain is a senior lecturer in the University of Edinburgh’s School of Informatics. Contact her at h.pain@ed.ac.uk.

Annalu Waller is a lecturer in the University of Dundee’s Division of Applied Computing. Contact her at awaller@computing.dundee.ac.uk.

David O’Mara is a postdoctoral research assistant in the University of Dundee’s Division of Applied Computing. Contact him at domara@computing.dundee.ac.uk.
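The three back-end stages that the STANDUP article describes under “How the system works” (a schema that finds same-sounding items, description rules that build the question, and a text template that produces the final riddle) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration only: the table names `lexeme` and `relation`, the function names, the toy phonetic strings (not real Unisyn notation), and the single template are hypothetical and are not taken from the STANDUP software. The sketch only shows how a schema of the form “X is a noun or adjective, Y is a noun, and X and Y sound the same” could be phrased as an SQL query over a relational dictionary.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory dictionary (hypothetical layout and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE lexeme (word TEXT, pos TEXT, phonetic TEXT);
        CREATE TABLE relation (word TEXT, kind TEXT, target TEXT);
        -- The phonetic strings are made-up placeholders.
        INSERT INTO lexeme VALUES
            ('current', 'noun', 'k-uh-r-n-t'),
            ('currant', 'noun', 'k-uh-r-n-t'),
            ('guest',   'noun', 'g-eh-s-t');
        INSERT INTO relation VALUES
            ('current', 'synonym',  'stream'),
            ('currant', 'hypernym', 'berry');
    """)
    return conn


def find_homophone_pairs(conn):
    """Schema stage: X is a noun or adjective, Y is a noun, same sound."""
    rows = conn.execute("""
        SELECT x.word, y.word
        FROM lexeme AS x
        JOIN lexeme AS y
          ON x.phonetic = y.phonetic AND x.word <> y.word
        WHERE x.pos IN ('noun', 'adjective') AND y.pos = 'noun'
        ORDER BY x.word, y.word
    """)
    return rows.fetchall()


def describe(conn, word, kind):
    """Description stage: look up one related item, or None if absent."""
    row = conn.execute(
        "SELECT target FROM relation WHERE word = ? AND kind = ?",
        (word, kind),
    ).fetchone()
    return row[0] if row else None


# Template stage: a canned string with labeled slots.
TEMPLATE = "What kind of {category} is a {paraphrase}? A {x} {y}."


def generate_riddles(conn):
    """Run schema, description rules, and template in sequence."""
    riddles = []
    for x, y in find_homophone_pairs(conn):
        paraphrase = describe(conn, x, "synonym")
        category = describe(conn, y, "hypernym")
        # Skip pairs for which the toy dictionary has no description.
        if paraphrase and category:
            riddles.append(TEMPLATE.format(
                category=category, paraphrase=paraphrase, x=x, y=y))
    return riddles


if __name__ == "__main__":
    for riddle in generate_riddles(build_demo_db()):
        print(riddle)
```

With this toy data the sketch reproduces the current/currant riddle quoted in the article. User constraints such as “the joke must contain a particular word” would, under the same assumptions, amount to extra `WHERE` conditions on the schema query.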
