Computational Humor [email protected] Kim Binsted, University of Hawaii
Total Page:16
File Type:pdf, Size:1020Kb
Trends & Controversies Steven Staab University of Karlsruhe Computational Humor [email protected] Kim Binsted, University of Hawaii No, this is no April Fool’s prank. Computer scientists at labs around the world are conducting serious research into … humor. Although it might seem whimsical, many excellent reasons exist to take a closer look at this fascinating aspect of human cognition and interaction. cally addressed language phenomena that are amenable to Humor affects attention and memory, facilitates social interaction, combinatorial approaches using static and stereotypical and ameliorates communication problems. If computers are ever semantic representations. Although such approaches are going to communicate naturally and effectively with humans, they adequate for much of language, they’re not easily extended must be able to use humor. Moreover, humor provides insight into to capture humans’ more creative language interpretation how humans process language—real, complex, creative language, capacities. An alternative tack is to begin by modeling less not just a tractable subset of standard sentences. By modeling humor typical, more complex phenomena, with the goal of generation and understanding on computers, we can gain a better encompassing standard language as a trivial case.1 picture of how the human brain handles not just humor but language and cognition in general. Semantic interpretation In popular perception, computers will never be able to use or Suppose you hear someone saying, “Everyone had so appreciate humor. Fictional computers and robots are almost always much fun diving from the tree into the swimming pool, we portrayed as humorless, even if they’re skilled at natural language, decided to put in a little … ” At the point in the sentence graceful bipedal motion, and other human behaviors that current where you hear the words “put in,” you’ve already com- machines find challenging. Then again, chess too was once thought mitted to an interpretation of the clause—probably that the to be the sole domain of humans, and now computer programs play owners are installing a diving board. Indeed, psycholin- at the grandmaster level. guistic research suggests that human sentence processing These four articles focus on different aspects and applications of is both incremental and predictive, because people inte- humor. Benjamin Bergen and Seana Coulson propose a model of grate perceptual input with linguistic and conceptual infor- humor comprehension based on frame-shifting within a simulation- mation at multiple levels of representation.2 Neuroimaging based natural-language-understanding system. Anton Nijholt research with magnetoencephalography suggests that after describes how embodied agents can use humor to make human- an initial modality-specific processing stage of approxi- agent interaction more effective and believable. Oliviero Stock and mately 200 milliseconds, speech processing is subserved Carlo Strapparava describe humor’s effects on attention and mem- by a bilateral network of inferior prefrontal and temporal ory and describe an application of computational humor to adver- lobe regions that are simultaneously active for hundreds of tising. Finally, Graeme Ritchie, Ruli Manurung, Helen Pain, Annalu milliseconds.3 Such data suggest that the processes of lexi- Waller, and Dave O’Mara describe an interactive riddle builder cal access, semantic association, and contextual integration called Standup, which helps children with communication-related are simultaneous, as represented in cascade models. disabilities use humor to interact with other children. One day, per- Besides its empirical motivation, incremental semantic haps, thanks to the groundbreaking research these articles describe, processing has computational benefits for word recogni- computers will be able to devise their own April Fool’s pranks. tion and contextual integration. Word recognition in nat- ural speech, for example, is extremely challenging because of extensive variability in the acoustic input. Top-down semantic information greatly facilitates segmentation of Frame-Shifting Humor in Simulation- the sound stream. Moreover, spoken language is produced Based Language Understanding quickly—typically at about two to three words per sec- Benjamin Bergen, University of Hawaii, Manoa ond—which necessitates parallel processing of linguistic Seana Coulson, University of California, San Diego cues and their semantic referents. Besides easing word recognition, activating the correct frame of reference greatly In an effort to focus on tractable problems, computa- facilitates contextual integration. For instance, in the swim- tional natural-language-understanding systems have typi- ming pool example, knowledge of diving and the typical 22 22 1541-1672/06/$20.001094-7167/03/$17.00 © © 2003 2006 IEEE IEEE IEEE INTELLIGENT SYSTEMS Published by the IEEE Computer Society backyard pool lets you more easily recog- nize and integrate expected items, such as a Form Meaning diving board, into the scenario. Phonological schemasConstructions Conceptual schemas However, one thing that differentiates human sentence processing from many com- Utterance Communicative context putational systems is humans’ capacity to deal with unexpected input, even when it Analysis necessitates revising information that they’ve already assembled. People, for example, are readily able to interpret the sentence “Every- Semantic specification Simulation Inferences one had so much fun diving from the tree into the swimming pool, we decided to put in a little water.” The word “water” prompts you to revise the default assumption that the pool had water in it. Revising this simple assump- Figure 1. Overview of the Embodied Construction Grammar architecture with two core processes, analysis and simulation.2 tion has substantial implications for the con- sequences of diving from the tree into the pool, and for the mindset of those who enjoy such activities. This reanalysis process, Prepare Launch Flight Enabled Ready Ongoing Done known as frame-shifting, highlights the need Move Propel Position for dynamic inferencing in language inter- into using body for pretation and has been argued to be a test position legs flight and landing case for models of meaning construction.1 Natural-language-understanding sys- tems might achieve the flexible, interpre- tive capacity necessary for frame-shifting Prepare Launched by adopting an empirically inspired archi- for towards tecture that is based on dynamic internal launch goal imagery. In such models, language inter- Propel pretation involves creating an internal sim- towards ulation of events that includes the sensory, goal motor, and affective dimensions. Research suggests that the neural systems responsi- Figure 2. A simplified Dive X-schema. ble for performing actions or perceiving percepts are also recruited for linguistically inspired simulations. In other words, like ECG language-understanding system has complex action, will be a particular instantia- dreaming, recalling, and imagining, lan- two main processes: analysis and simulation tion of this general X-schema. In the case of guage processing uses human perceptual (see figure 1). The analysis system parses diving, instances will differ in the agent who and motor systems as internal models that each input utterance into constituent words performs the dive, the dive’s target, the force allow for the construction of subjective and syntactic structures, using a representa- exerted, the trajectory, and so on. The lan- experiences in the absence of motor action tion of the current communicative context as guage system must supply these parame- or perceptual input.4 Because our experi- well as stored pairings of phonological and ters of the Dive X-schema, but it can use ences with pools almost without exception conceptual knowledge known as construc- default values in their absence. The simula- include water, water will automatically be tions. The simulation system runs dynamic tion system depends on the analysis system activated in mental simulations that involve simulations of those utterances’ content, to provide it with information about which pools. Our experience with diving, by con- producing inferences and updating beliefs X-schemas to run, what parameterizations trast, presumably involves some cases of about the world. to assign to them, and how to bind them landing on a solid surface, thus enabling us The ECG simulation system uses a rep- together. to viscerally imagine diving into a pool resentational formalism known as the X- The analysis system uses linguistic input with no water. schema. X-schemas are based on stochastic to produce a specification of the mechanisms petri nets and have numerous desirable prop- the simulation system uses. This parameter- Simulation-based natural- erties. They are dynamic and probabilistic ized interface is known as the semantic spec- language-understanding and allow for both parallel processing and ification (see figure 1). The inputs to the systems hierarchical structure. Figure 2 shows a analysis system are contextually activated One natural-language-understanding sys- simplified representation of the Dive X- schemas, along with the words and other tem that employs a simulation-based archi- schema—that is, the machinery used to vir- constructions the utterance activates; con- tecture was developed under the rubric of tually perform or mentally simulate diving. structions map aspects of form to aspects of Embodied Construction Grammar.5