Trends & Controversies
IEEE Intelligent Systems, March/April 2006

Computational Humor
Kim Binsted, University of Hawaii

No, this is no April Fool's prank. Computer scientists at labs around the world are conducting serious research into … humor. Although it might seem whimsical, many excellent reasons exist to take a closer look at this fascinating aspect of human cognition and interaction.

Humor affects attention and memory, facilitates social interaction, and ameliorates communication problems. If computers are ever going to communicate naturally and effectively with humans, they must be able to use humor. Moreover, humor provides insight into how humans process language: real, complex, creative language, not just a tractable subset of standard sentences. By modeling humor generation and understanding on computers, we can gain a better picture of how the human brain handles not just humor but language and cognition in general.

In popular perception, computers will never be able to use or appreciate humor. Fictional computers and robots are almost always portrayed as humorless, even if they're skilled at natural language, graceful bipedal motion, and other human behaviors that current machines find challenging. Then again, chess too was once thought to be the sole domain of humans, and now computer programs play at the grandmaster level.

These four articles focus on different aspects and applications of humor. Benjamin Bergen and Seana Coulson propose a model of humor comprehension based on frame-shifting within a simulation-based natural-language-understanding system. Anton Nijholt describes how embodied agents can use humor to make human-agent interaction more effective and believable. Oliviero Stock and Carlo Strapparava describe humor's effects on attention and memory and present an application of computational humor to advertising. Finally, Graeme Ritchie, Ruli Manurung, Helen Pain, Annalu Waller, and Dave O'Mara describe an interactive riddle builder called STANDUP, which helps children with communication-related disabilities use humor to interact with other children. One day, perhaps, thanks to the groundbreaking research these articles describe, computers will be able to devise their own April Fool's pranks.

—Kim Binsted

Frame-Shifting Humor in Simulation-Based Language Understanding
Benjamin Bergen, University of Hawaii, Manoa
Seana Coulson, University of California, San Diego

In an effort to focus on tractable problems, computational natural-language-understanding systems have typically addressed language phenomena that are amenable to combinatorial approaches using static and stereotypical semantic representations. Although such approaches are adequate for much of language, they're not easily extended to capture humans' more creative language interpretation capacities. An alternative tack is to begin by modeling less typical, more complex phenomena, with the goal of encompassing standard language as a trivial case [1].

Semantic interpretation

Suppose you hear someone saying, "Everyone had so much fun diving from the tree into the swimming pool, we decided to put in a little …" At the point in the sentence where you hear the words "put in," you've already committed to an interpretation of the clause—probably that the owners are installing a diving board. Indeed, psycholinguistic research suggests that human sentence processing is both incremental and predictive, because people integrate perceptual input with linguistic and conceptual information at multiple levels of representation [2]. Neuroimaging research with magnetoencephalography suggests that after an initial modality-specific processing stage of approximately 200 milliseconds, speech processing is subserved by a bilateral network of inferior prefrontal and temporal lobe regions that are simultaneously active for hundreds of milliseconds [3]. Such data suggest that the processes of lexical access, semantic association, and contextual integration are simultaneous, as represented in cascade models.

Besides its empirical motivation, incremental semantic processing has computational benefits for word recognition and contextual integration. Word recognition in natural speech, for example, is extremely challenging because of extensive variability in the acoustic input. Top-down semantic information greatly facilitates segmentation of the sound stream. Moreover, spoken language is produced quickly—typically at about two to three words per second—which necessitates parallel processing of linguistic cues and their semantic referents. Besides easing word recognition, activating the correct frame of reference greatly facilitates contextual integration. For instance, in the swimming pool example, knowledge of diving and the typical backyard pool lets you more easily recognize and integrate expected items, such as a diving board, into the scenario.

However, one thing that differentiates human sentence processing from many computational systems is humans' capacity to deal with unexpected input, even when it necessitates revising information that they've already assembled. People, for example, are readily able to interpret the sentence "Everyone had so much fun diving from the tree into the swimming pool, we decided to put in a little water." The word "water" prompts you to revise the default assumption that the pool had water in it. Revising this simple assumption has substantial implications for the consequences of diving from the tree into the pool, and for the mindset of those who enjoy such activities. This reanalysis process, known as frame-shifting, highlights the need for dynamic inferencing in language interpretation and has been argued to be a test case for models of meaning construction [1].

Natural-language-understanding systems might achieve the flexible, interpretive capacity necessary for frame-shifting by adopting an empirically inspired architecture that is based on dynamic internal imagery. In such models, language interpretation involves creating an internal simulation of events that includes the sensory, motor, and affective dimensions. Research suggests that the neural systems responsible for performing actions or perceiving percepts are also recruited for linguistically inspired simulations. In other words, like dreaming, recalling, and imagining, language processing uses human perceptual and motor systems as internal models that allow for the construction of subjective experiences in the absence of motor action or perceptual input [4]. Because our experiences with pools almost without exception include water, water will automatically be activated in mental simulations that involve pools. Our experience with diving, by contrast, presumably involves some cases of landing on a solid surface, thus enabling us to viscerally imagine diving into a pool with no water.

Simulation-based natural-language-understanding systems

One natural-language-understanding system that employs a simulation-based architecture was developed under the rubric of Embodied Construction Grammar [5]. The ECG language-understanding system has two main processes: analysis and simulation (see figure 1). The analysis system parses each input utterance into constituent words and syntactic structures, using a representation of the current communicative context as well as stored pairings of phonological and conceptual knowledge known as constructions. The simulation system runs dynamic simulations of those utterances' content, producing inferences and updating beliefs about the world.

The ECG simulation system uses a representational formalism known as the X-schema. X-schemas are based on stochastic Petri nets and have numerous desirable properties. They are dynamic and probabilistic and allow for both parallel processing and hierarchical structure. Figure 2 shows a simplified representation of the Dive X-schema—that is, the machinery used to virtually perform or mentally simulate diving. Any individual diving event, as with any complex action, will be a particular instantiation of this general X-schema. In the case of diving, instances will differ in the agent who performs the dive, the dive's target, the force exerted, the trajectory, and so on. The language system must supply these parameters of the Dive X-schema, but it can use default values in their absence. The simulation system depends on the analysis system to provide it with information about which X-schemas to run, what parameterizations to assign to them, and how to bind them together.
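To make the formalism concrete, here is a minimal Python sketch of an X-schema as a stochastic Petri-net-like structure. The class, the transition weights, and the parameter names are our illustration, not the ECG project's actual code; a faithful implementation would also support concurrency and hierarchical subschemas.

    import random

    class XSchema:
        """A toy X-schema: places connected by probabilistic transitions,
        plus parameters supplied by analysis or filled in from defaults."""
        def __init__(self, name, transitions, defaults):
            self.name = name
            self.transitions = transitions   # list of (from_place, to_place, weight)
            self.defaults = defaults

        def run(self, **params):
            bindings = {**self.defaults, **params}   # analysis-supplied values win
            state, trace = 'Enabled', ['Enabled']
            while True:
                options = [(t, w) for (s, t, w) in self.transitions if s == state]
                if not options:
                    return trace, bindings
                state = random.choices([t for t, _ in options],
                                       weights=[w for _, w in options])[0]
                trace.append(state)

    # Places follow figure 2; the weights are deterministic for brevity.
    DIVE = XSchema('Dive',
                   [('Enabled', 'Ready', 1.0),            # prepare: move into position
                    ('Ready', 'Launched', 1.0),           # launch: propel toward goal
                    ('Launched', 'OngoingFlight', 1.0),   # position body for flight
                    ('OngoingFlight', 'Done', 1.0)],
                   defaults={'force': 'moderate', 'medium': 'water'})

    trace, bindings = DIVE.run(agent='everyone', source='tree', goal='pool')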

The analysis system uses linguistic input to produce a specification of the mechanisms the simulation system uses. This parameterized interface is known as the semantic specification (see figure 1). The inputs to the analysis system are contextually activated schemas, along with the words and other constructions the utterance activates; constructions map aspects of form to aspects of meaning.


[Figure 1. Overview of the Embodied Construction Grammar architecture, with its two core processes, analysis and simulation [5]. Analysis maps an utterance and its communicative context, through constructions that pair phonological schemas (form) with conceptual schemas (meaning), onto a semantic specification; simulation then produces inferences.]

[Figure 2. A simplified Dive X-schema: Enabled → Prepare (move into position) → Ready → Launch (prepare for launch; propel using legs; propel toward goal) → Launched toward goal → Ongoing Flight (position body for flight and landing) → Done.]


As we noted earlier, the output of analysis is the semantic specification. For example, the word "diving" maps from a sound or character sequence to the X-schema representation for diving (see figure 2), just as the word "tree" will map its sound or spelling to a parameterization of how to simulate a tree. Larger grammatical constructions indicate how these schematic, parameterized representations of simulatable meaning are bound together. Consequently, an English speaker knows that the subject of the verb "dive" will be the agent performing the dive, and that the subsequent prepositional phrases describe the locus of its starting and ending points.
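As an illustration of such form-meaning pairings, the sketch below encodes a lexical and a clausal construction as simple Python records. The field names and binding notation are our invention for exposition; ECG has its own, richer notation [5].

    from dataclasses import dataclass, field

    @dataclass
    class Construction:
        form: str                       # phonological or orthographic pattern
        schema: str                     # which schema the meaning pole evokes
        bindings: dict = field(default_factory=dict)

    # Lexical construction: the word "dive" evokes the Dive X-schema.
    DIVE_WORD = Construction(form='dive', schema='Dive')

    # Clausal construction: the subject binds to the Dive agent, and the
    # prepositional phrases bind to the trajectory's source and goal.
    DIVE_CLAUSE = Construction(
        form='NP dive from NP into NP',
        schema='Dive',
        bindings={'subject': 'agent', 'from-NP': 'source', 'into-NP': 'goal'})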

The processes in the ECG system are interdependent: the simulation system depends on input from the analysis system, and defective simulation can force a novel analysis. Moreover, the processes proceed incrementally. Critically, the simulation system engages as early as possible and need not wait for the analysis process to terminate. Indeed, as we argued earlier, simulation must begin as soon as a coherent chunk of a semantic specification has been assembled.

Computational-humor processing

Humans naturally and frequently produce humorous language. So, to interact as a human-like conversational agent or to serve as a scientific model of human-language understanding, a natural-language-understanding system must be able to deal appropriately with linguistic humor. Early simulation is a key feature that facilitates humor processing in computational natural-language-understanding systems. Here's an example of how the ECG system deals with an utterance such as "Everyone had so much fun diving from the tree into the swimming pool, we decided to put in a little water."

A processing example

First, assuming for simplification that the input is in text form rather than spoken, analysis begins from left to right (starting with the first word) and hypothesizes lexical and larger linguistic constructions that could account for the input data, using a chart-parsing algorithm [6]. As the system reads in each new word, the analysis process maintains numerous competing candidate sets of lexical and grammatical constructions. Each coherent candidate set includes not only these constructions, but also bindings among them. For instance, the example sentence activates lexical constructions representing words—"everyone," "had," "so," and so on—as well as larger grammatical constructions such as noun phrases ("the swimming pool") and prepositional phrases ("from the tree").

The semantic side of these assembled networks of constructions contains precisely the parameterized schematic representations the simulation device needs. Once a single analysis surpasses a defined certainty threshold, the analysis process reads its semantic specification into the simulation engine. So, after the first clause, the analysis process passes the single best analysis into simulation. The simulator runs an internal simulation of this experience, activating the appropriate X-schema with the parameterizations determined through analysis, such as who was performing the action, with what trajectory, and so on. The system fills in those aspects of simulation not specified by the language with default or contextually inferable values, so it represents the pool as filled with water. The simulation generates appropriate affective and encyclopedic inferences so that it can update its beliefs about the world—people in the scene were using the swimming pool in its normal function, that is, diving in, getting wet, and enjoying it.

As processing continues, however, the system is confronted with a semantic incoherency. Namely, when it encounters the word "water" at the end of the sentence and simulates the second clause's content, the result is inconsistent with the previous simulation. The highest-level construction in the sentence, the "X is so Y that Z" construction, indicates that the events the two clauses describe (the enjoyment of the diving and the putting in of water) are related. However, unless heavy contextual biasing exists (for instance, if the system previously knows the pool to be full of jello), the simulation system can't coherently maintain the simulation it constructed for the first clause as a predecessor of the simulation for the second clause; without water in the pool, people can't have been diving normally into it. Because the simulation system can't establish a causal or temporal relationship between the two partial simulations it performed for the two clauses, it fails and calls on the analysis system to provide a new semantic specification. The success of a semantic specification that calls for a different simulation in the first clause—namely, a slightly different X-schema for the first event—results in a coherent relationship between the two clauses' simulations.

Our system's strengths

Although this example is a mere sketch of the processing that occurs in a simulation-based natural-language-understanding system, it illustrates several important features of this approach. First, using detailed semantic interpretation in the form of simulation is vital to correctly identifying semantic incoherencies, because nothing about the humorous utterance's linguistic form itself—the words it uses or their formal combination—is conceptually discrepant in any way. Second, the early (utterance-internal) commitment to semantic interpretation predicts differences in the processing of humorous utterances that require frame-shifting from those that are merely ambiguous. For instance, in processing the sentence "That book sounds so great I'm going to dive right into it," the word "dive" has a different sense than when it describes physical diving. However, in this case, prior linguistic context (and simulation) biases the processor toward the correct interpretation. Third, a simulation-based model uses real-world knowledge to capture subtle semantic differences in the meanings of particular words that arise in the simulation and not in the semantic specification. For example, only the simulation can distinguish between the diving you would do into a pool and on a football field, or between diving into a pool with versus without water. Finally, the process of misunderstanding and correction—committing to a simulation only to be confronted with an incoherency that needs correcting—drives the humorous effect the utterance has on humans. A simulation-based architecture exhibits similar behavior.


Conclusions

As natural-language-understanding systems approach human-like conversational competence, a main challenge is dealing with language that requires deep conceptual knowledge about the details of human experience, as humorous language often does. Incorporating simulation devices into such applications can begin to solve the problem of how to use knowledge of the world to improve human-machine communication. Computational tools developed to deal with the complexities of linguistic humor might also apply to other challenging problems in natural language interpretation.

References

1. S. Coulson, Semantic Leaps: Frame-Shifting and Conceptual Blending in Meaning Construction, Cambridge Univ. Press, 2001.

2. Y. Kamide, G.T.M. Altmann, and S.L. Haywood, "The Time-Course of Prediction in Incremental Sentence Processing: Evidence from Anticipatory Eye Movements," J. Memory and Language, vol. 49, no. 1, 2003, pp. 133–156.

3. K. Marinkovic, "Spatiotemporal Dynamics of Word Processing in the Human Cortex," Neuroscientist, vol. 10, no. 2, 2004, pp. 142–152.

4. L.W. Barsalou, "Perceptual Symbol Systems," Behavioral and Brain Sciences, vol. 22, no. 4, 1999, pp. 577–609.

5. B. Bergen and N. Chang, "Embodied Construction Grammar in Simulation-Based Language Understanding," Construction Grammars: Cognitive Grounding and Theoretical Extensions, J. Ostman and M. Fried, eds., John Benjamins, 2005, pp. 147–190.

6. J. Bryant, "Constructional Analysis," master's thesis, Computer Science Dept., Univ. of California, Berkeley, 2004.

Embodied Conversational Agents: "A Little Humor Too"
Anton Nijholt, University of Twente

Social and intelligent agents have become a leading paradigm for describing and solving problems in human-like ways. In situations where it's useful to design direct communication between agents and their human partners, the display of social and rational intelligence in an embodied human-like agent (that is, an agent visualized onscreen as an animated character) allows natural interaction between the human and the agent that represents the system the human is communicating with. Research in intelligent agents includes reasoning about beliefs, desires, and intentions. Apart from contextual constraints that guide the agent's reasoning and behavior, other behavioral constraints exist that follow from models that describe emotions. These models assume that emotions emerge based on appraisals of events taking place in the environment and how these events affect goals that the agents are pursuing. In current research, it's also not unusual to incorporate personality models in agents to adapt this appraisal process as well as reasoning, behavior, and display of emotions to personality characteristics. So, we can model a lot of useful and human-like properties in artificial agents, but, in Roddy Cowie's words, "If they are going to show emotion, we surely hope that they would show a little humor too" [1].

What are the benefits of using humor? Humor helps to regulate a conversation and can help to establish common ground between conversational partners. It makes conversation enjoyable and supports interpersonal attraction [2]. According to Deborah Tannen, humor makes your presence felt [3].

Many researchers have also mentioned its benefits in teaching and learning and have made this role explicit in experiments. Humor contributes to motivation, attention, comprehension and retention of information, and the development of affective feelings toward content. It helps create a more pleasurable learning experience, fosters creative thinking, reduces anxiety, and so on.

Embodied agents using humor

Embodied conversational agents (ECAs) have been introduced to play the role, among others, of conversational partner for the computer user. Rather than addressing the machine, the user addresses virtual agents that have particular capabilities and are responsible for certain tasks. The user might interact with ECAs to engage in an information service dialogue or a transaction, to solve a problem cooperatively, to perform a task, or to engage in a virtual meeting. In these kinds of situations, humans use humor to ease communication problems. In a similar way, humor can help solve communication problems that arise within human-ECA interaction.

Researchers working on the Computers Are Social Actors paradigm [4] have convincingly demonstrated that people interact with computers as if they were social actors. Depending on the way we can program a computer to interact, people might find it polite, dominant, extroverted, introverted, or any other attitude or personality trait we can think of. Moreover, people react to these attitudes and traits as if a human being were displaying them. From the CASA experiments, we can extrapolate that humor, because of its role in human-human interaction, can play an important role in human-computer interaction. Experiments examining humor's effects in task-oriented computer-mediated communication and in human-computer interaction have confirmed this.

More and more, we see ECAs employed in 2D or 3D virtual-reality educational and entertainment environments, e-commerce applications, and training and simulation environments [5]. Research projects suggest that in the near future, in addition to being domain and environment experts, ECAs will act as personal assistants, coaches, and buddies. They will accompany their human partners, migrating from displays on handheld devices to displays embedded in ambient-intelligence environments. Natural interaction with these ECAs will require them to display rational and social intelligence and, indeed, also a little humor when appropriate and enjoyable.

Computational humor

Well-known philosophers and psychologists have contributed their viewpoints to the theory of humor. Sigmund Freud saw humor as a release of tension and psychic energy, while Thomas Hobbes saw it as a means to emphasize superiority in human competition.


In the writings of Immanuel Kant, Arthur Schopenhauer, and Henri Bergson, we can see the first attempts to characterize humor as dealing with incongruity—that is, recognizing and resolving incongruity. Researchers including Arthur Koestler, Marvin Minsky, and John Allen Paulos have tried to clarify these notions, and Victor Raskin and Graeme Ritchie have tried to formally describe them.

As you might expect, researchers have taken only modest steps toward a formal theory of humor understanding. General humor understanding and the closely related area of natural-language understanding require an understanding of rational and social intelligence, so we won't be able to solve these problems until we've solved all AI problems. It might nevertheless be beneficial to look at the development of humor theory and possible applications that don't require a general theory of humor; this might be the only way to bring the field forward. That is, I expect progress to come from application areas—particularly in games and other forms of entertainment—that require natural interaction between agents and their human partners, rather than from investigations by a few researchers into full-fledged theories of computational humor.

Incongruity-resolution theory provides some guidelines for computational-humor applications. I won't look at the many variants that have been introduced or at the details of one particular approach. Generally, I follow Graeme Ritchie's theory [6]; however, since I prefer to look at humorous remarks that are part of the natural interaction between an ECA and its human conversational partner, my starting point isn't joke telling or pun making. Rather, I assume a small piece of discourse consisting of two parts. You read or hear and interpret the first part, but as you read or hear the second part, it turns out that a misunderstanding has occurred that requires a new, probably less obvious interpretation of the text. So, we have an obvious interpretation, a conflict, and a second, compatible interpretation that resolves the conflict. Although misunderstandings can be humorous, this isn't necessarily the case. Deliberate misunderstanding sometimes occurs to create a humorous remark, and it's also possible to construct a piece of discourse so that it deliberately leads to a humorous misunderstanding. In both cases, we need additional criteria to decide whether the misunderstanding is humorous. Criteria that humor researchers have mentioned deal with a marked contrast between the obvious interpretation and the forced reinterpretation, and with the reinterpretation's commonsense inappropriateness. As an example, consider the following dialogue in a clothing store:

Lady: "May I try on that dress in the window?"

Clerk (doubtfully): "Don't you think it would be better to use the dressing room?"

The first utterance has an obvious interpretation. The clerk's remark is confusing at first, but looking again at the lady's utterance makes it clear that a second interpretation (requiring a different prepositional attachment) is possible. This interpretation is certainly different, and, most of all, it describes a situation that you might consider inappropriate.

What can we formalize here, and what formalisms are already available? AI researchers have introduced scripts and frames to represent meanings of text fragments. In early humor theory as it relates to AI, these knowledge representation formalisms were used to intuitively discuss an obvious and a less obvious (or hidden) meaning of a text. A misunderstanding allows at least two frame or script descriptions of the same piece of text; the two scripts involved overlap. To make it clear that the nonobvious interpretation is humorous, at least some contrast or opposition between the two interpretations should exist. Script overlap and script opposition are reasonably well-understood issues, but until now, although often described more generally, the attempts to formalize this opposition mainly look at word-level oppositions (for example, antonyms such as hot versus cold). Inappropriateness hasn't been formalized at all.

Conversational humor: Constructing humorous acts

People smile and laugh when someone uses humor. It's not necessarily because someone pursues the goal of being funny or of telling a joke, but because the conversational partners recognize the possibility of making a funny remark—deliberately, spontaneously, or something in between, taking into account social display rules. It's possible to look at some relatively simple situations that let us make humorous remarks. These situations fit the explanations I gave earlier, and they make it possible to zoom in on the main problems of humor understanding: rules to resolve incongruity and criteria that help determine whether a solution is humorous.

Here I talk about surprise disambiguations. We can have ambiguities at the pragmatic, semantic, and syntactic levels of discourse (text, paragraphs, and sentences). At the sentence level, we can have ambiguities in phrases (for example, prepositional-phrase attachment), words, anaphora, and, in the case of spoken text or dialogue, intonation. As we interpret text that we read or hear, misunderstandings will become clear and be resolved, maybe with help from our conversational partner. Earlier, I gave an example of ambiguity that occurred because a prepositional phrase could be attached to a syntactic construct (a verb, a noun phrase) in more than one way. My main line of research, however, deals with ambiguities in anaphora resolution. For example, consider the following dialogue:

Adviser: "Our lawyers put your money in little bags, then we have trained dogs bury them around town."

Dilbert: "Do they bury the bags or the lawyers?"

Here, "them" is an anaphor referring to a previous noun phrase. You can find antecedents among the noun phrases in the first sentence.

From a research viewpoint, the advantage of looking at such a simple, straightforward humorous remark is that we can confine ourselves to just one sentence. So, rather than having to look at scripts, frames, and other discourse representations, we can concentrate on the syntactic and semantic analysis of just one sentence. For this analysis, I use well-known algorithms that transform sentences into feature structure representations, and issues such as script overlap and script opposition into properties of feature sets. Moreover, many algorithms for anaphora resolution are publicly available. Erroneous anaphora resolution with the aim of creating a humorous remark can make use of the properties of possible anaphora antecedents.

Contrast and inappropriateness are global terms from (not yet formalized) humor theory. In my approach, determining contrast translates into a heuristic that considers a potentially humorous antecedent and decides to use it because it has many properties in common with the correct antecedent. However, at least one salient property distinguishes the two potential antecedents (a shop window versus a dressing room, a bicycle versus a car, a bag versus a lawyer). My approach checks for inappropriateness by looking at constraints associated with the verb's thematic roles in the sentence. For example, these constraints distinguish between animate and inanimate; hence, burying lawyers who are alive is inappropriate. Obviously, you can say more about this—lawyers are easier targets for jokes than pharmacists, for instance—but such observations are beyond this essay's scope.
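A minimal sketch of these two checks, using WordNet through NLTK as a crude stand-in for thematic-role constraints (the helper names, the similarity threshold, and the animacy test are our simplifications, not the essay's actual implementation):

    from nltk.corpus import wordnet as wn

    def is_animate(noun):
        """Crude animacy test: does some sense of the noun descend from
        person or animal in the WordNet hypernym hierarchy?"""
        targets = {wn.synset('person.n.01'), wn.synset('animal.n.01')}
        return any(targets & set(s.closure(lambda x: x.hypernyms()))
                   for s in wn.synsets(noun, pos=wn.NOUN))

    def humorous_antecedent(correct, candidate):
        """Contrast: the candidate must resemble the correct antecedent
        enough to be confusable; inappropriateness: it violates an (assumed)
        inanimate constraint on the verb's patient role."""
        a = wn.synsets(correct, pos=wn.NOUN)[0]
        b = wn.synsets(candidate, pos=wn.NOUN)[0]
        similar_enough = (a.path_similarity(b) or 0.0) > 0.05   # toy threshold
        return similar_enough and is_animate(candidate)

    print(humorous_antecedent('bag', 'lawyer'))   # burying lawyers: inappropriate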

Experiments and implementation

My research group is creating a chat bot that implements my approach to humorous anaphora resolution. One reason we chose a chat bot is that its main task is to get a conversation going. Hence, it might miss opportunities to make humorous remarks, and when an intended humorous remark turns out to be misplaced, this isn't necessarily a problem. Implemented algorithms for anaphora resolution are available. We chose a Java implementation (JavaRAP) of Shalom Lappin and Herbert Leass' well-known Resolution of Anaphora Procedure [7].

We obtained a more efficient implementation by replacing the embedded natural-language parser with a parser from Stanford University. Currently, we and other researchers are designing experiments to find ways to deal with anaphora resolution algorithms' low success rate and to consider the introduction of a reliability measure before proceeding with possible antecedents of an anaphor. Other issues we're investigating are the different frequencies and types of anaphora in written and spoken text. In particular, we've been looking at properties, from the anaphora viewpoint, of conversations with well-known chat bots such as ALICE, Toni, Eugine, and Jabberwacky. Resources that we've investigated are publicly available knowledge bases such as WordNet, WordNet Domains, FrameNet, VerbNet, and ConceptNet. For example, in VerbNet, every sense of a verb is mapped to a verb class representing the conceptual meaning of this sense. Every class contains both information on the thematic roles associated with the verb class and frame descriptions describing how the verb can be used. This lets you check whether a possible humorous antecedent of an anaphor sufficiently opposes the correct antecedent because of constraints that it violates. Unfortunately, because VerbNet only contains about 4,500 verbs, many sentences can't be analyzed.

Conclusions and future research

As I mentioned earlier, if we focus on the fundamental problems in humor research, we must wait until the main problems in AI have been solved and then apply the results to humor understanding. It's more fruitful to investigate humor itself and see whether solutions that are far from complete and perfect can nevertheless find useful applications. In games and entertainment computing, natural interaction with ECAs requires humor modeling. Although many forms of humor don't fit into my framework of humorous misunderstandings, I think it's a useful approach.

Current humor research has many shortcomings, which are also present in my approach. In particular, the conditions I've mentioned (such as contrast and inappropriateness) might be necessary, but they're far from sufficient. Further pinpointing of humor criteria is necessary. My approach lets me apply well-known theories from computational linguistics rather than obliging me to continue with archaic approaches and concepts. Moreover, my approach makes the issue of humor versus nonhumor much more visible than in older approaches.

References

1. R. Cowie, "Describing the Emotional States Expressed in Speech," Proc. Int'l Speech Communication Assoc. Workshop Speech and Emotion, Queens Univ., 2000, pp. 11–18; www.qub.ac.uk/en/isca/proceedings/pdfs/cowie.pdf.

2. A. Cann, L.G. Calhoun, and J.S. Banks, "On the Role of Humor Appreciation in Interpersonal Attraction: It's No Joking Matter," Humor: Int'l J. Humor Research, vol. 10, no. 1, 1997, pp. 77–89.

3. D. Tannen, Conversational Style: Analyzing Talk among Friends, Ablex Publishing, 1984.

4. B. Reeves and C. Nass, The Media Equation, Cambridge Univ. Press, 1996.

5. H. Prendinger and M. Ishizuka, eds., Life-Like Characters: Tools, Affective Functions, and Applications, Springer, 2004.

6. G. Ritchie, "Developing the Incongruity Resolution Theory," Proc. AISB Symp. Artificial Intelligence and Creative Language: Stories and Humor, Univ. of Sussex, 1999, pp. 69–75.

7. S. Lappin and H.J. Leass, "An Algorithm for Pronominal Anaphora Resolution," Computational Linguistics, vol. 20, no. 4, 1994, pp. 535–561.

Automatic Production of Humorous Expressions for Catching the Attention and Remembering
Oliviero Stock and Carlo Strapparava, ITC-irst

Humor is essential to communication. It relates directly to themes such as entertainment, fun, emotions, aesthetic pleasure, motivation, attention, and engagement, which many people in the field of intelligent user interfaces believe are fundamental for future computer-based systems. Furthermore, humans probably can't survive without humor. Computational humor has the potential to turn computers into extraordinarily creative and motivational tools. So, human-computer interaction must evolve beyond usability and productivity. Even though humor is complex to reproduce, it's realistic to model some types of humor production and to aim at implementing this capability in computational systems.

Humor, emotions, beliefs, and creativity

Humor is a powerful generator of emotions. As such, it affects people's psychological states, directs their attention, influences memorization and decision making, and creates desires and emotions. Actually, emotions are an extraordinary instrument for motivation and persuasion because those who are capable of transmitting and evoking them have the power to influence other people's opinions and behavior. Humor, therefore, allows for conscious and constructive use of the affective states it generates. Affective induction through verbal language is particularly interesting, and humor is one of the most effective ways of achieving it. Purposeful use of humorous techniques enables us to induce positive emotions and mood and to exploit their cognitive and behavioral effects.

Humor acts not only on emotions, but also on human beliefs. A joke plays on the hearer's beliefs and expectations. By infringing on them, it causes surprise and then hilarity. Jesting with beliefs and opinions, humor induces irony and helps people not to take themselves too seriously. Sometimes simple wit can sweep away a negative outlook that limits people's desires and abilities. Wit can help people overcome self-concern and pessimism that can prevent them from pursuing more ambitious goals and objectives.

Humor encourages creativity as well. The change in perspective that humorous situations create induces new ways of interpreting an event. By making fun of clichés and stressing their inconsistency, people become more open to new ideas and viewpoints. Creativity redraws the space of possibilities and delivers unexpected solutions to problems. Actually, creative stimuli constitute one of the most effective impulses for human activity. Machines equipped with humorous capabilities will be able to play an active role in inducing emotions and beliefs and in providing motivational support.

Background

Although researchers have been studying humor for a long time, including recent approaches in linguistics [1] and psychology, to date they have made only limited contributions toward the construction of computational-humor prototypes. Almost all the approaches try to deal with incongruity theory at various levels of refinement. Incongruity theory focuses on the element of surprise, stating that a conflict between what you expect and what actually occurs creates humor. This accounts for much of humor's most obvious features: ambiguity or double meaning.

Kim Binsted and Graeme Ritchie made one of the first attempts at automatic humor production [2]. They devised a formal model of the semantic and syntactic regularities underlying some types of puns. They then implemented the model in a system called JAPE (Joke Analysis and Production Engine) that could automatically generate amusing puns.

The goal of our HAHAcronym project [3] was to develop a system that could automatically generate humorous versions of existing acronyms or produce new, amusing acronyms constrained to be valid vocabulary words, starting with user-provided concepts. The system achieved humorous effects mainly on the basis of incongruity. For example, it turned IJCAI—International Joint Conference on Artificial Intelligence—into Irrational Joint Conference on Antenuptial Intemperance.

Humor recognition has received less attention. Rada Mihalcea and Carlo Strapparava investigated the application of text-categorization techniques to humor recognition [4]. In particular, they showed that classification techniques are a viable approach for distinguishing between humorous and nonhumorous text, through experiments performed on very large data sets.

Applied scenarios: Humorous advertisements and headlines

Humor is an important way to communicate new ideas and change perspectives. On the cognitive side, humor has two important properties:

• It helps get and keep people's attention. The type and rhythm of humor can vary, and the time involved in building the effect can differ from case to case. Sometimes the context—such as joke telling—leads you to expect a humorous climax, which might only occur after a long while. Other times the effect occurs in almost no time, with one perceptive act—for instance, in static visual humor, funny posters, or cases when some well-established convention is reversed with an utterance.

• It helps memory. For instance, we commonly connect knowledge we've acquired to a humorous remark or event in our memories. In learning a foreign language, an involuntary funny situation might occur because of so-called "false friends"—words that sound similar in two languages and might have the same origin but have very different meanings. The humorous experience is conducive to remembering the word's correct use.

No wonder humor has become a favorite strategy for creative people in the advertising business. In fact, among fields that use creative language, advertising is probably the area in which creativity (and hence humor) has the most precise objectives.

From an applied AI viewpoint, we believe that an environment for proposing solutions to advertising professionals can be a realistic practical application of computational humor and a useful first attempt at dealing with creative language.

Some examples of the huge array of opportunities that language offers and that existing natural-language-processing techniques can cope with include rhymes, wordplay, popular sayings or proverbs, quotations, alliteration, triplets, chiasmus, neologism, non sequitur, adaptation of existing expressions, adaptation of proverbs, personification, and synaesthesia (two or more senses combined).

The humorous variation of newspaper headlines is an important research direction, given the particular kinds of linguistic phenomena they display. Indeed, it's possible to exploit their elliptic nature, specific syntax, and deliberate use of rhetorical devices to create funny variations, leveraging lexical and syntactic ambiguity (for example, "Marijuana issues sent to joint committee" [5]).

Resources and processing

Our work aims at applicability in an unrestricted domain (the application is nonetheless limited in scope; no general mechanism encompasses all kinds of humor). Our approach relies on standard, non-humor-oriented resources (with some extensions and modifications) and the development of specific algorithms that implement and elaborate suitable linguistic-humor theories.

Resources

For example, in developing HAHAcronym, we used specialized thesauri, repositories, and in particular WordNet Domains, an extension, developed at ITC-irst, of the well-known English WordNet. WordNet Domains annotates synsets (that is, synonym sets) with subject field codes (or domain labels)—"Medicine," "Architecture," and so on. For HAHAcronym, we modeled an independent structure of domain opposition, such as "Religion" versus "Technology" or "Sex" versus "Religion," as a basic resource for the incongruity generator.

In our current work, we use a pronunciation dictionary; lexical knowledge bases including WordNet, WordNet Domains, and WordNet-Affect (an affective lexical resource based on an extension of WordNet); a grammar for acronyms; a repository of commonsensical sentences; and a database of proverbs, clichés, and idioms. Algorithms for making use of these resources include traditional components such as parsers, morphological analyzers, and specific reasoners.

Creating the affective similarity mechanism

An important component to include in such algorithms, especially for the kind of expressions we're dealing with, is the affective similarity mechanism. All words can potentially convey affective meaning; even apparently neutral ones can evoke pleasant or painful experiences. Although some words have emotional meanings with respect to individual stories, for many others the affective power is part of the collective imagination (for example, "mom," "ghost," and "war").

So, measuring a generic term's affective power is important. To this end, we studied the use of words in textual productions, and in particular their co-occurrences with words having explicit affective meanings. We must distinguish between words directly referring to emotional states (such as "fear" and "cheerful") and those that indirectly refer to emotion depending on context (for example, words that indicate possible emotional causes, such as "monster," or emotional responses, such as "cry"). We call the former direct affective words and the latter indirect affective words.

We developed an affective similarity mechanism [6] that consists of

• the organization of direct affective words and synsets inside WordNet-Affect and

• a selection function (called affective weight) based on a semantic similarity mechanism, which is automatically acquired in an unsupervised way from a large corpus of texts (hundreds of millions of words) to individuate the indirect affective lexicon.

For example, if we input the verb "shoot" with negative valence, the system individuates the emotional category of horror. It then extracts the target noun "gun" and the causative evaluative adjective "frightening" and finally generates the noun phrase "frightening gun."
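The sketch below shows the shape of such a selection function: score an indirect affective word by its similarity to each direct emotion word and keep the best match. The three-dimensional vectors are toy stand-ins for the corpus-derived similarity space of the actual mechanism [6]:

    import numpy as np

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def affective_weight(word_vec, emotion_vecs):
        """Return the best-matching emotional category and its similarity."""
        return max(((name, cosine(word_vec, vec))
                    for name, vec in emotion_vecs.items()),
                   key=lambda pair: pair[1])

    emotions = {'horror': np.array([0.9, 0.1, 0.0]),   # toy vectors
                'joy': np.array([0.0, 0.2, 0.9])}
    shoot = np.array([0.8, 0.3, 0.1])
    print(affective_weight(shoot, emotions))           # ('horror', ...)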

Using the mechanism to produce humor

In most AI fields, researchers have long understood the difficulties of reasoning on deep world knowledge. A clear problem exists in scaling up from experiments to meaningful large-scale applications. This is even more obvious in areas such as humor, where good quality requires a subtle understanding of situations and is normally the privilege of talented individuals. In this sense, we can refer to computational humor as an AI-complete problem. Our goal is to produce general mechanisms for the humorous revisitation of verbal expressions that work in unrestricted domains.

Our approach can be synthesized in the following elements:

• We tend to start from existing material—for instance, well-known expressions or newspaper headlines—and produce some optimal innovation characterized by irony or another humorous connotation.

• The variation is based on reasoning over various resources, such as those we described earlier. No deep representation of meaning and pragmatics exists at the complex-expression level. However, there is more detailed representation at the individual-component level, permitting some types of reasoning (an example is the affective lexical resources indicated earlier, where lexical reasoning is strongly involved). For more complex expressions, we adopt more opaque but dynamic and robust mechanisms based on learning and similarity, such as the one we mentioned earlier. The potential is particularly strong in dealing with new material, such as reacting to novel expressions in a dialogue or, as we are working on now, making fun of fresh headlines.

• The system architecture accommodates numerous resources and revision mechanisms. The coordination of the various modules is based on a general optimal-innovation principle and numerous specific humor heuristic mechanisms inspired by the General Theory of Verbal Humor [1].

We're working on postprocessing based on mechanisms for automatic humor evaluation; this is an evolution of the humor recognition work we mentioned in the "Background" section. Right now, however, our system produces results in the form of a set of proposed expressions. The choice among them is left to the human user; few things are worse than poor humor, so it's better not to risk it.


References

1. S. Attardo, Linguistic Theories of Humor, Mouton de Gruyter, 1994.

2. K. Binsted and G. Ritchie, "Computational Rules for Punning Riddles," Humor: Int'l J. Humor Research, vol. 10, no. 1, 1997, pp. 25–76.

3. O. Stock and C. Strapparava, "Getting Serious about the Development of Computational Humor," Proc. 18th Int'l Joint Conf. Artificial Intelligence (IJCAI 03), Morgan Kaufmann, 2003, pp. 59–64.

4. R. Mihalcea and C. Strapparava, "Laughter Abounds in the Mouths of Computers: Investigations in Automatic Humor Recognition," Proc. Intelligent Technologies for Interactive Entertainment (INTETAIN 05), LNCS 3814, Springer, 2005, pp. 84–93.

5. C. Bucaria, "Lexical and Syntactic Ambiguity as a Source of Humor," Humor: Int'l J. Humor Research, vol. 17, no. 3, 2004, pp. 279–309.

6. A. Valitutti, C. Strapparava, and O. Stock, "Lexical Resources and Semantic Similarity for Affective Evaluative Expressions Generation," Proc. 1st Int'l Conf. Affective Computing and Intelligent Interaction (ACII 05), LNCS 3784, Springer, 2005, pp. 474–481.

The STANDUP Interactive Riddle-Builder
Graeme Ritchie, University of Aberdeen
Ruli Manurung and Helen Pain, University of Edinburgh
Annalu Waller and Dave O'Mara, University of Dundee

As children grow up, their language and communication skills develop as a result of their experience in the world and their interactions with other language users. In many cultures, an important type of interaction is language play with the child's peer group—word games and joke telling. A child with a disability such as cerebral palsy might converse using a voice output communication aid—a speech synthesizer attached to a physical input device. This cumbersome way of talking tends to isolate a child from the repartee, banter, and joke telling typical of the playground. This lack of practice can inhibit the development of language skills, leading to a lack of conversational fluency or even an undermining of social skills.

Our STANDUP (System to Augment Non-speakers' Dialogue Using Puns) project aims to take a small step toward alleviating this problem by providing, in software, a language playground for children with disabilities. The program provides an interactive user interface, specially designed for children with limited motor skills, through which a child can create simple jokes (riddles based on puns) by selecting words or topics. Here are two typical jokes that the system produces:

• What kind of berry is a stream? A current currant.

• How is an unmannered visitor different from a beneficial respite? One is a rude guest, the other is a good rest.

The system isn't just an online joke book. It builds new jokes on the spot, using 10 simple patterns for the essential shapes of punning riddles and a lexical database of about 130,000 words and phrases. We hope children will enjoy using the software to experiment with sounds and meanings, to the benefit of their linguistic skills.

Background

Although various researchers have attempted since 1992 to get computers to produce novel jokes [1], STANDUP's main predecessor is the JAPE system [2]. That program could churn out hundreds of punning riddles, some of which children judged to be of reasonable quality. However, it was only a rough research prototype; it took a long time to produce results, had no real user interface, and couldn't be controlled by the ordinary user. We've used JAPE's central ideas to create a fully engineered, large-scale, interactive riddle generator with a user interface specially aimed at children with disabilities.

Designing with users

As computational humor is still in its infancy, we had no precedent for real-world use of a system such as STANDUP and certainly no experience of providing a joke generator for children with disabilities. We therefore devoted a substantial portion of the project to user-centered design. We consulted potential users and associated experts (teachers and speech and language therapists) about how the system, particularly the user interface, should operate [3, 4].

In the early stages, we used nonsoftware mock-ups. We showed users laminated sheets representing screen configurations and asked them to step through tasks by pointing to the buttons in the pictures. The experimenter responded by replacing each sheet with the appropriate next screen shot. We adopted this low-tech approach to emphasize to the participants that the system hadn't yet been built and that suggestions or criticisms at this stage could genuinely influence the software's eventual design. Experience had shown that software mock-ups, particularly very slick ones, give the impression that a working program is already available. This discourages participants from asking for changes and can even distract them into asking how they can get hold of this apparently working program.

On the basis of our studies' results, we built a software mock-up of the user interface (with a dummy joke generator) and tested it for usability with suitable users. Teachers and therapists again gave their advice.

In parallel with this, we designed and implemented the joke-generating back end. This incorporated a wide variety of facilities for manipulating words and joke structures, but our studies with users and experts suggested that this particular user group needed only a subset of these.

How the system works

You can view the STANDUP program as having two relatively separate major parts: the front end, which embodies the user interface and controls any user options, and the back end, which manages the lexical database and generates the jokes.

The user interface displays three main areas: the general navigation bar, the joke-selection menu, and the progress chart (see figure 3). The navigation bar is a standard set of buttons for going back, forward, exiting, and so on. The joke-selection menu consists of large labeled buttons through which the user controls the joke generator using a standard mouse, a touch screen, or a single-switch scanning interface (routinely adopted for those with limited motor skills). The progress chart shows where the user is in creating or finding a joke, using the metaphor of a bus journey along a simple road network, with stops such as "Word Shop" and "Joke Factory."
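Since single-switch scanning may be unfamiliar, here is a schematic sketch of the idea in Python: the interface highlights each button in turn on a timer, and pressing the user's one switch selects whichever button is currently highlighted. The button labels, timing, and switch callback below are invented for illustration and are not STANDUP's actual interface code.

import time

BUTTONS = ["Back", "Word Shop", "Joke Factory", "Exit"]

def scan(buttons, switch_pressed, dwell=1.5):
    """Cycle a highlight through the buttons until the switch fires."""
    i = 0
    while True:
        print("highlighted:", buttons[i])
        time.sleep(dwell)              # give the user time to react
        if switch_pressed():
            return buttons[i]          # one switch press = select
        i = (i + 1) % len(buttons)     # otherwise move the highlight on

# Simulated session: the switch fires while the third button is lit.
presses = iter([False, False, True])
print("selected:", scan(BUTTONS, lambda: next(presses), dwell=0.0))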

The back end (see figure 4) contains several components: a set of schemas, a set of description rules, a set of text templates, and a dictionary. The schemas define the linguistic patterns underlying punning riddles. For example, a schema might represent information such as

Find items X, Y in the dictionary such that X is a noun or adjective, Y is a noun, and X and Y sound the same.

This describes the central relationships in an example like the current/currant joke shown earlier.
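To make the schema idea concrete, here is a minimal sketch in Python of how such a pattern could be matched against a dictionary. This is an illustration, not STANDUP's actual code; the toy entries, field names, and phonetic strings are all invented for the example.

# Matching the schema "X is a noun or adjective, Y is a noun, and X and
# Y sound the same" against a toy dictionary.

TOY_DICTIONARY = [
    {"word": "current", "pos": "adjective", "phonetic": "K-AH-R-AH-N-T"},
    {"word": "currant", "pos": "noun",      "phonetic": "K-AH-R-AH-N-T"},
    {"word": "stream",  "pos": "noun",      "phonetic": "S-T-R-IY-M"},
]

def match_homophone_schema(dictionary):
    """Yield (X, Y) pairs satisfying the schema's constraints."""
    for x in dictionary:
        for y in dictionary:
            if x is y:
                continue
            if (x["pos"] in ("noun", "adjective")
                    and y["pos"] == "noun"
                    and x["phonetic"] == y["phonetic"]):
                yield x["word"], y["word"]

print(list(match_homophone_schema(TOY_DICTIONARY)))
# [('current', 'currant')] -- the raw material for the berry/stream riddle

Running it yields the one pair that satisfies the constraints, which is exactly the starting point for the riddle shown earlier.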

The schema contains information that mainly constrains the ingredients of the riddle's answer, because that's where the pun occurs in all the types of riddle we've used. Once the system finds suitable items to match a schema, it passes these dictionary entries on to the description rules, which flesh out the descriptive phrases needed for the question. For example, starting from "rude" and "guest," it might try to find a way to build a phrase describing or meaning the same as "rude guest," such as "unmannered visitor," or it might select two items that can be "crossed" to produce "rude guest," such as "boor" and "visitor."
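As a rough illustration of a description rule, the sketch below builds candidate question phrases by swapping each word of the answer for a synonym. The synonym table here is a hand-picked stand-in for the WordNet-derived relations the real system consults.

# Building paraphrases of the answer phrase from synonym relations.
SYNONYMS = {
    "rude":  ["unmannered", "impolite"],
    "guest": ["visitor", "caller"],
}

def describe(answer_words):
    """Return candidate paraphrases, swapping each word for a synonym."""
    phrases = []
    for first in SYNONYMS.get(answer_words[0], []):
        for second in SYNONYMS.get(answer_words[1], []):
            phrases.append(first + " " + second)
    return phrases

print(describe(["rude", "guest"]))
# ['unmannered visitor', 'unmannered caller', 'impolite visitor', 'impolite caller']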

For the third stage, text templates contain canned strings such as "What do you get when you cross a ... and a ..." or "What do you call a ..." alongside labeled slots into which the template-handling software slots the words and phrases provided by the schemas and the description rules, thus producing the final text.
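A minimal version of this template stage might look as follows; the template strings echo the ones quoted above, while the slot names and article handling are simplifications of our own, not STANDUP's internals.

# Slotting the question phrase and punning answer into canned templates.
TEMPLATES = {
    "what-call": "What do you call {question}? {answer}.",
    "difference": ("How is {question1} different from {question2}? "
                   "One is {answer1}, the other is {answer2}."),
}

riddle = TEMPLATES["difference"].format(
    question1="an unmannered visitor", question2="a beneficial respite",
    answer1="a rude guest", answer2="a good rest",
)
print(riddle)

which prints the rude guest/good rest riddle in the form shown at the start of this article.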

The user can control this process via the graphical interface by imposing constraints on answer building (the schema), question building (the description rules), or both. For example, the user might specify that the joke must contain a particular word, be on a particular topic, or be a particular type of riddle.
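In code, such a constraint can be as simple as a filter over the candidate matches; the sketch below (our own illustration, not STANDUP's interface to the back end) keeps only jokes built around a word the child has picked.

# Imposing a user constraint: keep only schema matches that involve the
# child's chosen word.
def filter_by_word(matches, required_word):
    """Keep (X, Y) pairs in which the chosen word appears."""
    return [pair for pair in matches if required_word in pair]

matches = [("current", "currant"), ("boor", "visitor")]
print(filter_by_word(matches, "currant"))  # [('current', 'currant')]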

Joke creation depends heavily on the dictionary, which contains information from numerous sources in a relational database. We took syntactic categories (such as noun and verb) and semantic relations (synonymy, being a subclass of, and so on) from the public-domain lexicon WordNet,5 which contains approximately 200,000 entries. For information about the sound of words, we used the Unisyn pronunciation dictionary to turn a word or phrase's ordinary spelling into a standard phonetic notation. For our final user trials, we attached pictures to as many of the words as possible, using proprietary symbol sets that two companies involved in creating software aids for disabled users lent to us. The resulting lexical database of around 130,000 entries contains several tables, each representing one important relationship (such as between a word and its pronunciation or a word and its synonyms). In this way, we were able to implement dictionary searches as SQL database queries.
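Because each table encodes a single relationship, many lexical searches reduce to joins. As a hedged sketch (the table and column names below are guesses, not STANDUP's actual schema), a homophone lookup over an SQLite version of such a database might look like this:

import sqlite3

# One table per relationship, as described above; names are invented.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE pronunciation (word TEXT, phonetic TEXT);
    CREATE TABLE synonym       (word TEXT, other TEXT);
    INSERT INTO pronunciation VALUES ('current', 'k V r @ n t'),
                                     ('currant', 'k V r @ n t');
    INSERT INTO synonym VALUES ('rude', 'unmannered'), ('guest', 'visitor');
""")

# A homophone search is then a self-join on the pronunciation table.
rows = db.execute("""
    SELECT a.word, b.word
    FROM pronunciation a JOIN pronunciation b ON a.phonetic = b.phonetic
    WHERE a.word < b.word
""").fetchall()
print(rows)  # [('currant', 'current')]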


Figure 3. The STANDUP user interface.

Figure 4. The structure of the STANDUP back end: guided by user constraints, the system draws on the dictionary and schemas to choose related items for the answer, on description rules to choose appropriate items for the question, and on text templates to insert the items into textual forms, yielding the riddle.


Although the joke-generation ideas are relatively simple and have already been tested in principle in the JAPE project, designing and implementing a large-scale, efficient, robust, easily usable system involved considerable work. We tried to make our designs as general as possible and to automate as much of the dictionary creation as possible, so that it should be relatively easy to create revised versions of the dictionary (for example, from a new version of WordNet). We also hope to make the lexical database available to other projects.

How will children use it?

We're about to start our final evaluations of the full system with users. This will involve visiting schools in the surrounding area, both special-needs establishments and mainstream schools. There we shall see how children, both with and without language-impairing disabilities, use the system. Beforehand, we'll assess certain aspects of each child's literacy to give us a context for interpreting what we observe. The time available to us (a few months) isn't sufficient for a long-term study of the software's effects. However, we'll carry out some tests at the end of a child's exploration of the system to see if the sessions have been beneficial in any way.

This project is very much an exploration of possibilities, and we don't know what we'll find out. However, we hope that the work will help move computational humor from tentative research to practical applications. In particular, a software language playground like this could well be of wider use in educational settings.

Acknowledgments

Grants GR/S15402/01 and GR/R83217/01 from the UK's Engineering and Physical Sciences Research Council supported this work. We're grateful to Widgit Software and Mayer-Johnson LLC for their help.

References

1. G. Ritchie, The Linguistic Analysis of Jokes, Routledge, 2004.

2. K. Binsted, H. Pain, and G. Ritchie, "Children's Evaluation of Computer-Generated Punning Riddles," Pragmatics and Cognition, vol. 5, no. 2, 1997, pp. 309–358.

3. R. Manurung et al., "Facilitating User Feedback in the Design of a Novel Joke Generation System for People with Severe Communication Impairment," Proc. 11th Int'l Conf. Human-Computer Interaction (HCI 05), CD-ROM, Lawrence Erlbaum, 2005.

4. D. O'Mara et al., "The Role of Assisted Communicators as Domain Experts in Early Software Design," Proc. 11th Biennial Conf. Int'l Soc. for Augmentative and Alternative Communication, CD-ROM, ISAAC, 2004.

5. C. Fellbaum, WordNet: An Electronic Lexical Database, MIT Press, 1998.



Kim Binsted is an assistant professor in Information and Computer Sciences at the University of Hawaii. Contact her at [email protected].

Benjamin Bergen is an assistant professor in the Linguistics Department at the University of Hawaii, Manoa, where he heads the Language and Cognition Laboratory. Contact him at [email protected].

Seana Coulson is an associate professor in the Cognitive Science Department at the University of California, San Diego, where she heads the Brain and Cognition Laboratory. Contact her at [email protected].

Anton Nijholt is a professor and chair of human-media interaction in the University of Twente's Department of Computer Science. Contact him at [email protected].

Oliviero Stock is a senior fellow at and former director of ITC-irst (Center for Scientific and Technological Research). Contact him at [email protected].

Carlo Strapparava is a senior researcher at ITC-irst in the Communication and Technologies Division. Contact him at [email protected].

Graeme Ritchie is a senior research fellow in the University of Aberdeen's Department of Computing Science. Contact him at [email protected].

Ruli Manurung is a research fellow in the University of Edinburgh's School of Informatics. Contact him at [email protected].

Helen Pain is a senior lecturer in the University of Edinburgh's School of Informatics. Contact her at [email protected].

Annalu Waller is a lecturer in the University of Dundee's Division of Applied Computing. Contact her at [email protected].

Dave O'Mara is a postdoctoral research assistant in the University of Dundee's Division of Applied Computing. Contact him at [email protected].