CHAPTER 18

Metaphor and Artificial Intelligence

Why They Matter to Each Other

John A. Barnden

Introduction

Why is Artificial Intelligence concerned with metaphor, and what special contributions can AI offer to metaphor research? This chapter will indicate why AI needs to study metaphor and will outline what AI has been contributing to the illumination of metaphor, whether it is processed by artefacts or by the human mind.

Specific contributions of AI research on metaphor that one can already point to, and that will be addressed to varying extents in this chapter, include the following: creation of detailed mechanisms for reasoning within the terms of the source domain in a metaphor, in order to expand the relevance of known source-target mappings; increased emphasis on uncertainty and gradedness in metaphorical reasoning; a richer view of overriding (source-over-target as well as target-over-source); mechanisms for exploiting context; important steps towards integration with metonymy interpretation; some emphasis on disanalogy and a limitation of the role of parallelism between source and target; the usefulness of reversed transfers (transfers from target domain to source domain); the importance of non-assertional metaphor; increased doubt about whether the notion of a "domain" is actually important and well-founded; and clarification of ways in which literal meaning can be involved in metaphor interpretation.

The plan of the chapter is as follows. The next section will make some observations about AI, explain why metaphor is important to applications-oriented aspects of AI, and indicate why, in general terms, AI can make distinctive contributions to the study of cognition as a whole, metaphor included. Then a new section will sketch five different, relatively recent AI research works on metaphor. This is to set the scene for the following section, which will discuss specific contributions of AI to metaphor research. The issues will be summarized in a brief concluding section. The chapter does not attempt to survey AI research on metaphor completely or to provide a history of this work, despite the fact that AI has long had an interest in metaphor (cf. e.g. Carbonell, 1980, 1982; Norvig, 1989; Russell, 1976, 1985; Way, 1991; Weber, 1989; Weiner, 1984; Wilks, 1978), and also simile (e.g. Winston, 1979) and analogy (see Hall, 1989, for a review). Readers interested in AI work not covered here may also wish to look at Martin's (1996) and Russell's (1986) reviews and the extensive review in Fass (1997, chap. 11). Also, we omit description of work on theoretical approaches to metaphor that, while being interesting and important in themselves, do not address processing issues to any large extent, such as the approaches of Asher and Lascarides (1995), Hintikka and Sandu (1990), Indurkhya (1991, 1992), van Genabith (2001), and Vogel (2001). For reasons of space we omit description of computational study of metaphor in corpora (e.g. Mason, 2004) despite some close connections to AI. The chapter makes some mention of metonymy because of the close connection of metaphor and metonymy and because, as we will see, some major AI work on metaphor also addresses metonymy.

Artificial Intelligence

AI has at least three separate, though interrelated, aims:

An "engineering" aim: To engineer, or provide computational principles and methods for engineering, useful artefacts that are arguably intelligent, without necessarily having any mechanistic similarity to human or animal minds/brains. The usefulness may be in an industrial domain or an everyday, practical domain, but may also be in other domains such as art or mathematical theorem proving.

A "psychological" aim: To devise computational principles, computationally detailed theories, or running computational systems that provide a basis for possible testable accounts of cognition in human or animal minds/brains.

A "general/philosophical" aim: To devise computational principles, computationally detailed theories, or running computational systems that serve as or suggest possible accounts of cognition in general, whether it be in human-made artefacts, in naturally occurring organisms, or in cognizing organisms yet to be discovered, or that illuminate philosophical issues such as the nature of mind, thought, intelligence, consciousness, perception, language, representation, learning, rationality, society, and so on ... not forgetting computation itself.

On top of this multiplicity of aims, the word "intelligence" is usually taken very broadly in the field, to cover not only pure rational thought but also almost anything that could come under the heading of "cognition," "perception," "language use," "emotion" and so forth. Thus, the name "artificial intelligence" has always been somewhat of a nom de plume, with each part of the name hinting at only one aspect of the nature of the actual endeavour.

The three aims are often inextricably combined in a given piece of research. For one thing, an individual researcher may subscribe to more than one of the aims. But also, of course, developments in pursuit of any one of the aims could happen to inspire advances towards one of the others, and endeavours towards any one of the aims can proactively look for inspiration from research towards the others.

Before going on, it is useful to explain why metaphor is important for the Engineering aim of AI. Many intelligent artefacts that need to communicate well with people using human language will need to be able to cope with metaphor. Metaphor is prevalent in human linguistic discourse, even when it is just mundane conversation. Slightly more indirectly, some intelligent artefacts need to understand linguistic communication between people, for instance for the purpose of understanding newspaper articles written by people for other people. Indeed, metaphor is becoming an increasingly looming obstacle for Engineering AI, as attempts are made to bring better automated human-language processing into commercial products, to develop ever more advanced computer interfaces and virtual reality systems, to develop automated understanding and production of emotional expression given that this is often conveyed explicitly or implicitly by metaphor (Delfino & Manea, 2005; Emanatian, 1995; Fainsilber & Ortony, 1987; Fussell & Moss, 1998; Kovecses, 2000; Thomas, 1969; Yu, 1995), and also of gesture and sign language given that these forms of communication have strong metaphorical aspects (McNeill, 1992; P. P. Wilcox, 2004; S. Wilcox, 2004; Woll, 1985).

To return to the set of aims overall, their multiplicity, and their nature taken individually, cause problems in the evaluation of developments in AI. Engineering developments can clearly be evaluated on the basis of actual usefulness or promise of such, but the nature of evaluation is more difficult for the other aims. Evaluation can be on criteria such as coherence, simplicity, computational efficiency and so forth, and on whether the development in question does in principle achieve the intended cognitive ends, but beyond that the evaluation must be in the indirect, long-term, and subjective sense of the extent to which the development contributes eventually to other fields such as Philosophy or Psychology, or is at least perceived as embodying interesting and inspiring ideas for these fields. Since Psychology is currently the locus of intensive research on metaphor, it is worth stressing that within the Psychological aim there is not necessarily any goal to produce an immediately testable psychological theory. Rather, the aim is creatively to provide computationally well-founded and well-designed bases from which psychologists or others could proceed to develop testable theories.

I hope that in the descriptions of the three aims above the reader will have observed the hedging about whether the AI developments are actually "implemented" (that is, realized in the form of computer software or hardware). Hence the mention of computational principles, methods and computationally detailed theories, not just working computational systems. A product of AI research does not have to be a working computer program or piece of computer hardware. Rather, it can be a system description or formal logical account that is detailed and specific enough that software or hardware could readily, if laboriously, be developed from it. It can also be a description of new types of representation, inferencing or other processing that could form part of an AI system (implemented or otherwise).

Such products of AI may be left without implementation not through neglect but rather because they can be assessed, to a useful degree, in terms of their coherence, effectiveness, efficiency, interest, distinctiveness, and so on without being implemented. Also, the act of creating the product can uncover problems and issues that would be unlikely to arise in less detailed and specific theorizing. Much of the point of creating even a working AI system is not so much to use it in practice but to serve just such ends as uncovering problems and gaps, studying the relationship to other proposed systems, and so on. In short, much of the point of developing a detailed computational account, implemented or not, is to aid in the development of principles, methods and theories in more detail and with greater security than would otherwise be likely.

These explanations about AI could be paralleled to some extent by observations about Computer Science in general. Much research in Computer Science is not directly about producing working software or hardware. For example, much of the field is mathematical theory directed at the nature of computation, the complexity of algorithms, the abstract meaning of computer programs, and the well-founded design of programming languages and computer systems.

Given that the Psychological and General/Philosophical aims of AI impinge on the concerns of other disciplines, the question arises as to whether AI research has anything special to offer to such disciplines over and above what they can do by themselves. There are several reasons for a positive answer. First, AI has special expertise in a wide variety of different forms of computation, in putting them on a proper, well-thought-out foundation and, importantly, in finding complicated combinations of or compromises between different forms. The hope is that a strong Computer Science background or context enables many AI researchers to come up with suggestions that are, in computational ways, more advanced, richer, more subtle, more complex, more formally coherent, and/or more extensively and securely developed than is generally possible in other disciplines, with their own demands and pressures concerning other matters such as proper experimental design.

Pressure towards developing effective compromises and combinations comes from the applications focus within the Engineering aim, and from the focus in all three aims on the production of working artefacts or at least detailed computational schemes and methods. These foci can also provide a useful "sanity check," helping for example to uncover unwelcome but difficult-to-discern interactions between parts of a theory, to avoid vagueness in descriptions of representations and processes, to avoid oversimplification, and to ensure greater coverage of underlying technical issues than in other fields.

AI Research on Metaphor: An Illustrative Review of Recent Work

In outlining the nature of AI above we looked at some general reasons why AI is in a position to make helpful contributions to the study of cognition, or, at least, why it is in a better position to make certain types of advance than other disciplines are. As for specific metaphor research issues on which AI is in a relatively good position to be helpful, we will examine some of them after reviewing, in this section, a handful of particular metaphor research works within AI.

Hobbs

Important work on metaphor in AI was done by Hobbs (1990, 1992). The ideas do not seem to have met with a substantial implementation effort, but Hobbs has devised a detailed computational account from which implementations could be developed reasonably readily as an extension to the implemented TACITUS system (Hobbs et al., 1993). We can divide the work into the following three strands:

1. Unmodified-property transfer: When X is metaphorically described as Y, this method can attribute to X a property P of Y, provided P also makes sense for X without modification. A simple example is interpreting "John is an elephant" to mean that John is clumsy, given that clumsiness is (let us assume) a property of elephants, and given that it can also be applied to people.

2. Transfer by known mappings within inference: This method uses known mappings between aspects of the source domain and aspects of the target domain. Importantly, unmapped aspects of the source domain can be used in a metaphor by virtue of their source-domain inferential connections to the source-domain elements that are mapped by known mappings. Also, the mappings are themselves cast as inference rules (see below). Thus, uses of mappings are just inference steps along with any other.

3. Mapping discovery by analogy: hypothesizing mappings between complex situations in source and target from scratch, by means of structural matching, in order to handle metaphor that is novel (to the understander).

All three strands are placed (in Hobbs, 1992) within a general inferential framework for natural language understanding, which, in particular, also handles metonymy. This framework has abduction as its guiding principle and its central means of inference. In essence, linguistic expressions are regarded as outward signs that are given rise to by underlying situations that are conveyed by the expressions, and the understander's task is to abductively move from the outward signs to the underlying situations. A crucial aspect of Hobbs's overall abductive approach is that it is, thereby, an approach founded on uncertain inference.

In the first strand, unmodified-property transfer, Hobbs has an appealing, context-driven view of how the properties are selected in a given case. As he says, "John is an elephant" cannot be precisely interpreted outside of context (Hobbs, 1990, 59). But he claims that, given suitable context, coherence considerations can lead to a precise interpretation. Thus, Hobbs asserts that "Mary is graceful, but John is an elephant" suggests the interpretation that John has a property that contrasts with gracefulness. If it is known that elephants are clumsy, and this is the only elephant property that is an opposite of gracefulness, then the clumsiness interpretation is secured. This context-driven approach to the choice of properties to transfer contrasts with approaches that use selection principles relying on, for example, context-insensitive notions of salience of properties (as in Ortony, 1979).
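The context-driven selection just described can be pictured with a small sketch. It is not Hobbs's mechanism, which is abductive inference over logical forms; it only illustrates the idea that a contrast in the context narrows the candidate properties. The property sets, the opposites table, and the function name are all assumptions made up for the illustration.

```python
# Toy knowledge, invented for illustration only.
PROPERTIES = {"elephant": {"large", "clumsy", "long-memoried"}}
OPPOSITES = {"graceful": {"clumsy"}}
# Properties that make sense for a person without modification.
APPLICABLE_TO_PEOPLE = {"large", "clumsy", "long-memoried", "graceful"}

def contrastive_properties(source, contrast_property):
    """Return the source's properties that transfer unmodified to a person
    and that contrast with the property stated in the preceding clause."""
    candidates = PROPERTIES[source] & APPLICABLE_TO_PEOPLE
    return candidates & OPPOSITES.get(contrast_property, set())

# "Mary is graceful, but John is an elephant"
interpretation = contrastive_properties("elephant", "graceful")
```

With this toy knowledge only one elephant property is an opposite of gracefulness, so the contrast singles out clumsiness; with no contrasting context the candidate set would stay ambiguous.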

In the third strand, mapping discovery by analogy, Hobbs proposes something not much different in broad outline from other researchers (e.g. Falkenhainer, Forbus, & Gentner, 1989; Gentner, 1983) who propose analogical structure-matching as the way to deal with (some) metaphor. Arguably the second strand, transfer by known mappings within inference, is the most interesting of the three. Hobbs provides as a prime example the use of spatial metaphor in Computer Science. One talks of a variable in a computer program as being "at" a number, say 100, as a way of saying that the variable's value is 100. Hobbs proposes therefore that a communicating agent that is familiar with this way of talking (or thinking) could have an inferential rule that can be glossed in English as

IF in some situation a variable's value is V THEN in that situation the variable is [spatially-]at V.

Thus, this rule embodies a known mapping link between the source domain of space and the target domain of computer-science entities. The rule has the same status as any other inferential rule, and can be used at any convenient point during an overall process of inference-based understanding. It may look strange that the rule has the IF/THEN going from target to source rather than source to target. This is because rules are used abductively in Hobbs's approach: the variable's being spatially-at V leads to the abductive hypothesis that the variable's value is V.

The displayed rule acquires an indefinite amount of extra power in the following way. One talks of a variable "going" from say 100 to 200, as a way of stating a value change; of a variable being "between" two numbers; of a variable "keeping one step behind" another; and so forth: productively using an indefinitely large part of the domain of space. Hobbs argues that such talk can be handled without the need to have separate mappings for "go," "between," and so on: rather, it suffices to use inferential connections within the source domain, such as one between going and being-spatially-at, and thereby to be able to connect going to the mapping displayed above from being-spatially-at to having-as-value. Thus, a variable's "going" from 100 to 200 is ultimately interpreted as a change from a situation of having value 100 to a situation of having value 200.
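The interplay of source-domain inference and the single mapping rule can be sketched as follows. This is a toy illustration under strong assumptions, not Hobbs's framework: the tuple encoding of facts, the "before"/"after" situation labels, and all function names are hypothetical, and real weighted abduction over uncertain hypotheses is replaced by a direct backward use of one rule.

```python
def source_inferences(fact):
    """Source-domain (spatial) knowledge: going from a to b implies being
    at a in the earlier situation and at b in the later one."""
    if fact[0] == "go":
        _, thing, start, end = fact
        return [("at", thing, start, "before"), ("at", thing, end, "after")]
    return []

def abduce_value(fact):
    """Use the rule 'IF value is V THEN spatially-at V' backwards: an
    'at' fact leads to the hypothesis of a matching 'value' fact."""
    if fact[0] == "at":
        _, thing, place, when = fact
        return ("value", thing, place, when)
    return None

def interpret(fact):
    """Chain source-domain inference with the one known mapping rule."""
    hypotheses = []
    for derived in [fact] + source_inferences(fact):
        hypothesis = abduce_value(derived)
        if hypothesis is not None:
            hypotheses.append(hypothesis)
    return hypotheses

# "N goes from 100 to 200"
result = interpret(("go", "N", 100, 200))
```

No mapping for "go" is stored anywhere in the sketch; the value change is recovered only because going is inferentially connected to being-spatially-at inside the source domain.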

Finally, Hobbs (1990) regards metaphor as crossing over between different domains, but fully accepts that domains have fuzzy boundaries and that the notion of domain is difficult. He therefore propounds that the exact scope of the notion of metaphor is theory-relative, in depending on decisions about what domains exist: there is no objective, theory-free fact of the matter about the boundaries of metaphor. In any case, Hobbs's actual computational approach does not impose or operationally rely upon any domain divisions at all. Therefore, mapping rules could in principle link concepts that, intuitively, are arbitrarily close.

Martin

An implemented computer program, called MIDAS, for metaphor interpretation and generation was produced by James Martin (1990, 2000). The acronym MIDAS stands for Metaphor Interpretation, Denotation, and Acquisition System. MIDAS was designed in part as a supplement to the Unix Consultant system, a computer program for automatically answering users' questions about the Unix operating system.



MIDAS incorporates knowledge of a set of roughly Lakovian conceptual metaphors. The specific set included can be changed and is not itself the interesting side of the system. We will assume here, for expositional purposes, that MIDAS knows the conceptual metaphor USING A COMPUTER PROCESS IS BEING PHYSICALLY INSIDE A REGION. The system's knowledge base consists of a network of concepts. Among the concepts are the concept of using a computer process and the concept of being inside a region. These two concepts are linked by a "metaphor map." The metaphor map will be notated here in the following way, although the real structure is much more complex:

being-inside-a-region ↔ using-a-computer-process

Also, the two concepts have roles (or "slots") within them. Correspondingly there are two additional metaphor maps, this time crossing between roles:

the-enclosing-region ↔ the-used-process
the-thing-enclosed ↔ the-process-user.

As a result of knowing the conceptual metaphor, the system can easily understand a statement such as "I am in Emacs" to mean that the speaker is using Emacs, given that the word "in" accesses the being-inside-a-region concept. The literal interpretation that the speaker is physically in Emacs is rejected, because Emacs is not represented in the system as being a region. By contrast, the metaphorical interpretation is accepted because Emacs is represented as being a computer process. Importantly, though, the literal interpretation does not need to be rejected before the metaphorical one is accepted. The system tries to apply the possibly relevant conceptual metaphors it knows of, irrespective of whether the literal interpretation is acceptable.
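A minimal sketch of how a metaphor map with role maps might be applied to "I am in Emacs" is given below. As the text stresses, the real MIDAS structures are much more complex; the dictionary layout, the type table, and the function name here are assumptions made up for the illustration and are not Martin's representation.

```python
# Hypothetical type knowledge about role fillers.
CONCEPT_TYPES = {"Emacs": "computer-process", "kitchen": "region"}

# One concept-level metaphor map plus its two role-level maps.
METAPHOR_MAP = {
    "source": "being-inside-a-region",
    "target": "using-a-computer-process",
    "roles": {
        "the-enclosing-region": "the-used-process",
        "the-thing-enclosed": "the-process-user",
    },
    # Type constraint on a target role (an assumption of this sketch).
    "target_role_types": {"the-used-process": "computer-process"},
}

def interpretations(enclosed, enclosing):
    """Collect the acceptable readings of '<enclosed> is in <enclosing>'.
    Literal and metaphorical readings are tried independently, so the
    metaphorical one never waits for the literal one to be rejected."""
    source_fillers = {
        "the-enclosing-region": enclosing,
        "the-thing-enclosed": enclosed,
    }
    readings = []
    # Literal reading: acceptable only if the enclosing thing is a region.
    if CONCEPT_TYPES.get(enclosing) == "region":
        readings.append((METAPHOR_MAP["source"], source_fillers))
    # Metaphorical reading: carry role fillers across the role maps.
    target_fillers = {
        METAPHOR_MAP["roles"][role]: filler
        for role, filler in source_fillers.items()
    }
    if all(
        CONCEPT_TYPES.get(target_fillers[role]) == required
        for role, required in METAPHOR_MAP["target_role_types"].items()
    ):
        readings.append((METAPHOR_MAP["target"], target_fillers))
    return readings

emacs_readings = interpretations("speaker", "Emacs")
```

Because Emacs is typed as a computer process rather than a region, only the using-a-computer-process reading survives in this sketch.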

MIDAS can also interpret metaphorical utterances that do not immediately fit its known mappings. Such utterances are handled by the MES (Metaphor Extension System) component of MIDAS. It uses two different techniques: similarity-extension and core-extension.

Suppose the system knows that conversations are similar to computer processes, in the sense that they are both special cases of a more general concept of a process. Then the system can interpret the sentence "I am in a conversation" by using its known conceptual metaphor USING A COMPUTER PROCESS IS BEING PHYSICALLY INSIDE A REGION. Because of the known similarity between COMPUTER PROCESSES and CONVERSATIONS, the system has a mechanism for coming up with the new conceptual metaphor BEING ENGAGED IN A CONVERSATION IS BEING PHYSICALLY INSIDE A REGION.

This similarity-extension method is powerful, but core-extension is yet more so. The system can interpret the sentence "How do I get into Emacs?" just on the basis of knowing the conceptual metaphor USING A COMPUTER PROCESS IS BEING PHYSICALLY INSIDE A REGION and knowing some simple things about regions. The system is unable to find an acceptable interpretation using that conceptual metaphor directly. However, through knowing about a result relationship between the concept of moving-into (accessed by the phrase "get into") and the concept of being physically-in, and knowing that a usage of a process by a user is a result of the user invoking that process, the system can conjecture that the speaker is asking, in effect, "How do I invoke Emacs?" The system will create a new conceptual metaphor INVOKING A COMPUTER PROCESS IS PHYSICALLY MOVING INTO A REGION. The term "core-extension" is used because the concepts involved, such as moving-into and being-physically-in, must be "core-related." This somewhat complex notion covers only rather direct relationships such as the result relationship involved above.
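The core-extension step for "How do I get into Emacs?" can be sketched as a lookup across parallel result relationships. This is only an illustration of the reasoning pattern; the flat dictionaries and the function name are hypothetical and do not reflect how MIDAS actually stores or searches its concept network.

```python
# Known metaphor: source concept -> target concept.
KNOWN_METAPHORS = {"being-physically-in": "using-a-process"}
# Result relationships (action -> resulting state) in each domain;
# the contents are assumptions for this illustration.
SOURCE_RESULT = {"moving-into": "being-physically-in"}
TARGET_RESULT = {"invoking-a-process": "using-a-process"}

def core_extend(source_concept):
    """Return a target concept for source_concept, creating a new mapping
    when the concept is linked by a result relationship to a mapped one."""
    if source_concept in KNOWN_METAPHORS:
        return KNOWN_METAPHORS[source_concept]
    resulting_state = SOURCE_RESULT.get(source_concept)
    mapped_state = KNOWN_METAPHORS.get(resulting_state)
    if mapped_state is None:
        return None  # not core-related to any mapped concept
    # Find a target action whose result is the mapped target state.
    for action, state in TARGET_RESULT.items():
        if state == mapped_state:
            KNOWN_METAPHORS[source_concept] = action  # record new metaphor
            return action
    return None

new_target = core_extend("moving-into")
```

The recorded entry plays the part of the newly created conceptual metaphor INVOKING A COMPUTER PROCESS IS PHYSICALLY MOVING INTO A REGION, so a later occurrence of "get into" would be handled by direct lookup.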

Martin seeks to avoid having a literal-first account and thereby to obey the "total time constraint" (Gerrig, 1989) that conventional metaphors should take no longer to process than superficially similar literal language. MIDAS certainly avoids being literal-first in the sense that it avoids the need to reject literal interpretations before considering metaphorical ones. However, it does need to construct literal interpretations before considering metaphorical ones.

As Fass (1997, 316) points out, MIDAS is to be applauded for being able to prefer a metaphorical reading of "McEnroe killed Connors" to a literal reading, even though the latter is itself semantically acceptable. It turns out that the scoring mechanisms in the system, which knows that McEnroe and Connors are sportsmen, cause it to regard a SPORTIVE DEFEATING AS KILLING interpretation as more tightly fitting the sentence than a literal interpretation does, because sports-defeat requires its role-fillers to be competitors whereas killing has a much less specific requirement.

Martin does not make any use of the notion of a domain in his account of MIDAS, and there are no explicit domain divisions in MIDAS. Metaphor maps can in principle join arbitrarily close concepts, and what metaphor amounts to for the system is therefore entirely dependent on what maps happen to be included and how existing conceptual metaphors can be extended.

Fass

A second major implemented AI system for metaphor processing is that of Dan Fass (1997), indirectly related to the research of Wilks (1978). Fass's system is called meta5 (punningly, a step beyond metaphor). The system proceeds entirely by discovering analogies between source and target structures from scratch, with the process being guided by a relevance criterion explained below. It should be mentioned at once that the analogies discovered are of a very simple sort. However, the processing needed to discover them can be complex and subtle. Also, the system is unusual in measuring the degree of disanalogy between source and target structures, and using this measure in rating the aptness of the metaphor.

One standard-bearing example of meta5's processing is provided by

(4) My car drinks gasoline

taken from Wilks (1978). The system can interpret this as meaning "My car uses gasoline" essentially by finding an analogical match between the prior knowledge the system has that animals drink drinkable stuff and the prior knowledge that cars in general use gasoline. As a consequence, in constructing the internal meaning representation of the sentence, a use word-sense is employed as the right sense for the verb "drink" in the sentence.

In somewhat more detail, we can explain the process as follows, assuming the system only has one lexical sense for the verb "drink," namely the normal sense of an animal imbibing a liquid. We notate this sense here as drink. That the agent must be an animal and the patient must be a liquid is encoded as "preferences" (or "selection restrictions") in the permanent representation of the lexical sense in the system. The system finds, though, that the actual agent according to the sentence, the car, is not an animal. At some point the system will therefore look for a possible metaphorical way of interpreting the car-drink relationship in the sentence. It does this by seeing whether its knowledge about animals contains an item that is relevant to the sentence. The approach here is simple: from the sentence it takes only the drink word-sense, notes this sense's preference for an animal agent, and sees whether in the knowledge about animals there is information that they take part in a relationship that is either drinking or a word-sense-wise ancestor of drinking. Indeed, the system finds the knowledge item that animals drink drinkable-stuff. No other knowledge item for animal is relevant.

The system then looks for knowledge items within its prior knowledge of cars that match that animal knowledge item. It finds the following matching item: cars use gasoline. It determines that there is a match because the use word-sense and the drink word-sense are "sisters": they both have the same immediate parent sense, namely expend. Equally, the senses drinkable-stuff and gasoline are sisters, with liquid as parent. Such a pair of sister relationships between two knowledge items is necessary for them to match. The system has now found a metaphorical relationship between "car" and "drinks" in the sentence, and can build a sentence meaning representation tantamount to "My car uses gasoline."
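The relevance step and the sister-based match can be sketched in a few lines. This is a simplification under stated assumptions, not Fass's implementation: the parent table and knowledge items are toy data, relevance is reduced to an exact match on the verb sense (the ancestor case is left out), and all names are hypothetical.

```python
# Toy sense hierarchy (child -> immediate parent), for illustration only.
PARENT = {
    "drink": "expend", "use": "expend",
    "drinkable-stuff": "liquid", "gasoline": "liquid",
    "wheels": "part",
}
# Toy knowledge items per sense: (relation, object) pairs.
KNOWLEDGE = {
    "animal": [("drink", "drinkable-stuff")],
    "car": [("use", "gasoline"), ("have", "wheels")],
}

def sisters(a, b):
    """Two distinct senses are sisters if they share an immediate parent."""
    return a != b and PARENT.get(a) is not None and PARENT.get(a) == PARENT.get(b)

def metaphorical_match(verb_sense, preferred_agent, actual_agent):
    """Find a knowledge item of the actual agent whose relation and object
    are both sisters of those in a relevant item of the preferred agent."""
    relevant = [item for item in KNOWLEDGE[preferred_agent]
                if item[0] == verb_sense]
    for rel_source, obj_source in relevant:
        for rel_target, obj_target in KNOWLEDGE[actual_agent]:
            if sisters(rel_source, rel_target) and sisters(obj_source, obj_target):
                return (rel_target, obj_target)
    return None

# "My car drinks gasoline": drink prefers an animal agent, but gets a car.
match = metaphorical_match("drink", "animal", "car")
```

The returned pair corresponds to the knowledge item "cars use gasoline", from which a meaning representation using the use sense could be built.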

The system also looks at its non-relevant knowledge items of animal and car, in the above sense of relevance, and measures both how many other matching knowledge items there are and how many knowledge items for each of those two word-senses are not matched by a knowledge item for the other. The extra matches contribute to the strength of the metaphor, but the difference counts are inspired by the claim of Tourangeau and Sternberg (1982) that the greater the conceptual distance between source and target the more apt the metaphor. The counts can be used to choose between competing metaphorical interpretations, in other examples.

A point on which meta5 can be criticized, and is indeed criticized by Fass (1997, sect. 10.3.1.1), is that there is no coordination between a metaphorical relation found between the agent and verb ("car" and "drink") and a metaphorical or other relation found between verb and patient ("drink" and "gasoline"). Thus, the system does not look holistically at the sentence in determining the presence of analogies. This creates a problem with a sentence such as "My car drinks coffee," which Fass wishes his system to regard as anomalous and not metaphorical, and therefore not to settle on a metaphorical relation between the car and the drinking. Fass suggests a detailed way in which this problem could be fixed.

The incremental semantic construction approach in the (unfixed) system is in itself interesting because it means that the system does not first even construct a literal interpretation of the whole sentence before investigating metaphoricity, let alone reject a literal interpretation. But it is important to note that in the investigation by the system of a part of the sentence, such as "my car" together with "drinks," the system does adopt a fully literal-first approach: a metaphorical relation is only sought if an acceptable literal interpretation cannot be found for that part. Although it can be argued that this is a wrong approach even for sentence-parts, it does the service of showing us that the question of processing order in metaphorical sentence interpretation is much more complex than that of how literal and metaphorical interpretations of the whole sentence are ordered.

The system includes a complex numerical scoring mechanism to choose between competing interpretations of sentence parts as it goes along. This is largely based on lengths of paths in the semantic network. Fass (1997, sect. 10.2.2) has implemented a system extension in which the match scoring aspects of the system are enriched. The enrichment adds a diagnostic-salience measure on knowledge items that is dependent on how much inheritance was involved in finding them: e.g. that a car has a definite physical boundary is inherited from further away in the semantic network than that a car has wheels, and is therefore less salient. Differences of salience could then be used to refine the comparative evaluation of discovered analogies.

On the other hand, there are major problems with the simplistic requirement that metaphorical analogies require sister relationships between cell components. For instance, it appears that the metaphorical interpretation above could not be found if, instead of gasoline being a direct descendant of liquid, there were a liquid-fuel sense interposed. However, given that the system already includes complex distance-based scoring, it would be straightforward to adjust the system to allow generalized cousin relationships rather than sister relationships, and to downplay or discount relationships that involved excessively long paths.
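One way to picture the suggested relaxation from sisters to generalized cousins is to measure the path length between two senses through a common ancestor and to reject paths that are too long. The sketch below is an assumption-laden illustration, not part of meta5: the hierarchy (with a liquid-fuel sense interposed), the threshold, and the function names are all hypothetical.

```python
# Toy hierarchy with liquid-fuel interposed between gasoline and liquid.
PARENT = {
    "drinkable-stuff": "liquid",
    "liquid-fuel": "liquid",
    "gasoline": "liquid-fuel",
    "liquid": "substance",
}

def ancestors(sense):
    """Map sense and each of its ancestors to its distance in links."""
    distances, steps = {}, 0
    while sense is not None:
        distances[sense] = steps
        sense = PARENT.get(sense)
        steps += 1
    return distances

def cousin_distance(a, b):
    """Length of the shortest path from a to b via a common ancestor,
    or None if the two senses share no ancestor."""
    dist_a, dist_b = ancestors(a), ancestors(b)
    common = set(dist_a) & set(dist_b)
    if not common:
        return None
    return min(dist_a[c] + dist_b[c] for c in common)

def cousins(a, b, max_path=3):
    """Accept distinct senses whose connecting path is short enough.
    Sisters have path length 2; larger max_path admits looser cousins.
    Note that this simple test also admits parent-child pairs."""
    distance = cousin_distance(a, b)
    return distance is not None and 0 < distance <= max_path

path_length = cousin_distance("drinkable-stuff", "gasoline")
```

Here the two senses are no longer sisters, but they are connected by a path of length 3 through liquid, so they would still match under a threshold of 3 and would be rejected under the sister-only threshold of 2. A graded scheme could use the path length as a penalty instead of a hard cut-off.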

Finally, meta5 is interesting in being a fully implemented system that performs complex metonymic understanding as well as metaphorical understanding. It has knowledge of some conventional metonymic relationships such as ARTIST FOR ART PRODUCT and can therefore interpret sentences such as "John reads Shakespeare." Indeed, the system can handle arbitrarily long chains of metonymy. A limitation of the system is that metonymic interpretation is tried strictly before metaphorical, restricting the possibilities of interaction. The system can nevertheless obtain some forms of mixed metaphorical/metonymic interpretation.

There is no notion of domain in the design of the system, and word-senses are not sorted by domain. Indeed, as the sister relationship (above) is the core of analogy in meta5, metaphorical relationships can be between structures that are conceptually arbitrarily close up to sisterhood. Gasoline could have kerosene as a sister.

Finally, Iverson and Helmreich (1992) implemented a system, Metallel, that can be viewed as a substantially modified version of meta5, correcting some of its deficiencies. The system is ably summarized by Fass (1997, sect. 10.1). Metallel views metonymy and metaphor as being on a par, rather than metonymy having precedence as in meta5. Once Metallel has found some potential available metonymic and metaphorical interpretations by a somewhat loose form of path search, it selects between them on the basis of a "grounding" process, which incorporates a type of analogical matching much like meta5's but that takes into account the whole sentence, not just parts of it in succession as meta5 does.

Barnden: ATT-Meta, Map-Transcendence and Pretence

The present author has implemented an approach, called ATT-Meta, for performing a type of reasoning that is arguably often necessary for metaphor interpretation. The approach is described in Barnden (1998, 2001a), Barnden, Glasbey, Lee, and Wallington (2004), Barnden, Helmreich, Iverson, and Stein (1994), Barnden and Lee (1999, 2001), and Lee and Barnden (2001a). The implemented ATT-Meta program is only a reasoning system and does not take linguistic strings as input, but, rather, logical forms derived from sentences by initial processing. For now the reader can take these logical forms to encode the literal meanings of the sentences, but we will refine this point below.

The metaphorical utterances of main interest in the ATT-Meta project are those that are conceptually related to known conceptual metaphors but that transcend them by involving source-domain elements not directly handled by the mappings in those metaphors. In ATT-Meta parlance these utterances are map-transcending. For instance, going back to the Hobbs examples, the sentence “N leaps from 1 to 100” is map-transcending for an understander if he/she/it only knows a physically-leap lexical sense for the verb “leap” but does not know a mapping for that sense into the target domain of variables and values, even though he/she/it does know a mapping from, say, spatially-at to have-as-value. Similarly, if an understander knows a metaphorical mapping from physically-in to using-a-process (see the Martin case) but has no mapping for physically-enter, then the sentence “How do I enter Emacs?” is map-transcending.

Clearly, map-transcendence is a fuzzy concept that is relative to particular understanders and particular conceptual metaphors the understander knows, and to our intuitive perceptions as to what is conceptually related to what (e.g. physically-leaping to being-spatially-at). Nevertheless, it is a useful intuitive characterization of a phenomenon that lies along a broad sector of the spectrum between conventional metaphor on the one hand and, on the other hand, entirely novel metaphor where no relevant mapping is known at all. Map-transcendence is strongly related to the phenomenon of unused parts of the source domain as discussed in Lakoff & Johnson (1980).

Very broadly speaking, ATT-Meta’s approach is similar to Hobbs’s second strand (Transfer by Known Mappings within Inference): ATT-Meta is based on rules encapsulating known metaphorical correspondences such as between physically-at and has-as-value, and on an integrated inferential framework which, in particular, allows arbitrarily rich source-domain reasoning to connect sentence components to source-domain concepts that can be mapped by known mappings. So, both systems can infer that a variable N has value 100 from any sentence couched in spatial terms that implies that N is physically-at 100, as long as the systems have the necessary knowledge about physical space to infer that N is physically-at 100 from the sentence. The inference can be arbitrarily indirect and complex in principle. To make the point, a vivid example would be a sentence such as “N started a circuitous route towards 100 but didn’t complete the journey until after M fell to 0.” This implies, among other things, that N (at some point) had value 100.

However, there is a fundamental difference of approach, as well as many technical differences of representation and reasoning, between ATT-Meta and Hobbs’s scheme. The difference is that ATT-Meta avoids placing internal propositions such as N is physically-at 100, which are not statements about reality, on a par with statements such as N has value 100, which are. Hobbs’s approach does maintain them on a par: there is nothing in his internal representation to say that the former proposition is merely a metaphorical pretence or fiction.

Instead, ATT-Meta creates a special computational “mental space” in which such propositions and inferences arising from them are kept aside from propositions and reasoning about reality. We call this space a metaphorical pretence cocoon. Thus, the internal proposition N physically-leaps from 1 to 100 arising directly from the sentence “N leaps from 1 to 100” is placed in the cocoon, and the inference result that (say) N is spatially-at 100 afterwards, together with the inference chain itself, lies within the cocoon. A metaphorical mapping rule that takes spatially-at to has-as-value can then give the result that, in reality, N has value 100 afterwards.
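The separation between cocoon and reality can be pictured with a small sketch. The Python below is illustrative only and is not the ATT-Meta program: the tuple encoding of propositions, the rule tables `SOURCE_RULES` and `MAPPING_RULES`, and the predicate name `physically-leaps-to` are assumptions made for the example.

```python
# Illustrative sketch only; not the ATT-Meta implementation.
# A proposition is a tuple: (predicate, subject, value).

# Hypothetical source-domain rule: leaping to a place results in being at it.
SOURCE_RULES = {"physically-leaps-to": "spatially-at"}

# Hypothetical known metaphorical mapping from pretence to reality.
MAPPING_RULES = {"spatially-at": "has-as-value"}


def interpret(direct_meaning):
    """Return (cocoon, reality) after within-pretence inference and mapping."""
    cocoon = {direct_meaning}  # pretence propositions, kept apart from reality
    reality = set()            # propositions about the actual situation

    # Within-pretence inference: every conclusion stays inside the cocoon.
    changed = True
    while changed:
        changed = False
        for pred, subj, val in list(cocoon):
            if pred in SOURCE_RULES:
                derived = (SOURCE_RULES[pred], subj, val)
                if derived not in cocoon:
                    cocoon.add(derived)
                    changed = True

    # Only propositions covered by a known mapping cross into reality.
    for pred, subj, val in cocoon:
        if pred in MAPPING_RULES:
            reality.add((MAPPING_RULES[pred], subj, val))
    return cocoon, reality


cocoon, reality = interpret(("physically-leaps-to", "N", 100))
print(reality)
```

In this sketch the pretend proposition about leaping never appears in `reality`; only the mapped conclusion about N's value does.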

By clearly marking some propositions as being pretences, the use of a cocoon ensures that the system is not misled by the propositions directly derived from metaphorical utterances, that is, propositions like N physically-leaps from 1 to 100. Notice that in the case of “McEnroe killed Connors,” the understander needs to be clear that the directly derived proposition McEnroe biologically killed Connors is not a statement about reality. But, in addition, if the understander knows that McEnroe definitely did not biologically kill Connors in reality, we do not want to let that information defeat the pretend information that McEnroe did biologically kill Connors. Thus, pretence cocoons prevent pretences from infecting reality but equally protect the integrity of pretences.

The use of cocoons has another benefit. Lee and Barnden (2001a) studied mixed metaphor of various types, and showed how ATT-Meta deals with them. The main distinction studied was between serial mixing (commonly called chaining), where A is viewed as B and B is viewed as C, and parallel mixing, where A is used simultaneously as B and as C (see also Wilks, Barnden, & Wang, 1991). Serial mixing is viewed as having the B material in a cocoon directly embedded in the reality space, whereas the C material is in a cocoon embedded within the B cocoon. Thus, there is a pretence within a pretence. In parallel mixing, on the other hand, the B and C material is either combined in a single cocoon or is in two separate cocoons both directly embedded within the reality space. Thus, we have two pretences either side by side or blended with each other. There are unresolved issues about how to decide between these two possibilities, but in any case different dispositions of pretence cocoons allow important differences between types of mixing of metaphor to be reflected in the processing.
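The different dispositions of cocoons can be pictured as a nesting structure. The following is a minimal sketch, assuming a hypothetical `Space` class with a `parent` link; it shows only the embedding relationships for serial mixing and for the separate-cocoon variant of parallel mixing, not any reasoning.

```python
# Illustrative sketch only; the Space class is a hypothetical construct.
class Space:
    """A reasoning space: reality itself, or a cocoon embedded in a parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def depth(self):
        """Number of pretence layers between this space and reality."""
        return 0 if self.parent is None else 1 + self.parent.depth()


reality = Space("reality")

# Serial mixing (A as B as C): the C cocoon sits inside the B cocoon.
b_serial = Space("B", parent=reality)
c_serial = Space("C", parent=b_serial)

# Parallel mixing, separate-cocoon variant: both embedded directly in reality.
b_parallel = Space("B", parent=reality)
c_parallel = Space("C", parent=reality)

print(c_serial.depth(), c_parallel.depth())
```

The depth of the C cocoon is what distinguishes a pretence within a pretence from two pretences side by side.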

We have indicated that what is initially inserted in the pretence cocoon in the case of “N leaps from 1 to 100” is the proposition N physically-leaps from 1 to 100, and what is inserted in the case of “McEnroe killed Connors” is McEnroe biologically killed Connors. This reflects a general assumption in the ATT-Meta approach that what is inserted in the cocoon is a “direct” meaning of the metaphorical sentence (or of some metaphorical sentence-component such as a clause). A direct meaning is a logical form derived compositionally from the “direct” senses of lexical units in sentences. A direct sense is just any sense listed for the lexical unit in the understander’s lexicon, so that it is directly accessible from the lexical unit. In particular, we have been assuming that the verbs “leap” and “kill” have as direct senses the concepts of physically leap and biologically kill respectively.

Clearly, a given lexical unit could actually have more than one direct sense, and indeed some of the direct senses could be metaphorical or special in some other way. We simply embrace such possibilities, saying that if, for instance, “leap” had something like change-value as a direct sense, then “N leaps from 1 to 100” could be understood without use of the inferential pretence mechanism outlined above, although in principle the mechanism could still be redundantly used as well. Equally, a direct sense may be figurative in some way but still lead to the construction of a proposition in the pretence cocoon. For instance, suppose the word “star” has astronomical-star and prominent-movie-actor as its only direct senses, and that we regard the latter as a figurative sense. Then “Mike is a star of the department” could be understood via the pretence mechanism using Mike is a prominent movie actor in the department in the cocoon. (Another option could be to use the astronomical-star sense.)

Thus, in the ATT-Meta approach, the pretence mechanism is potentially useful if direct meanings of sentences lead by within-pretence reasoning to within-pretence propositions that can be mapped by known mapping rules. It is irrelevant whether the direct meaning is dubbed as “literal” or not. We may or may not wish to regard physically leap as a literal sense of “leap” and prominent-movie-actor as a literal sense of “star”, but such terminological decisions have no bearing in themselves on whether the pretence mechanism could be fruitful.

Another fundamental reason for not relying on a notion of literal meaning arises from serial mixing (A as B as C). In such a case, some of the phrasing in the utterance refers to the C domain, and this can cause material to arise in the B domain by C-to-B transfer. Therefore, B-to-A transfers may be working on non-literal material derived by transfer from C. For this reason alone, it is misguided to think of metaphorical mapping as a matter of transforming literal meanings. The consequences of this point have hardly been explored in metaphor research.

Insofar as direct meanings of sentences can often be regarded as literal meanings, ATT-Meta is in the class of systems that rely on constructing a literal meaning first (not necessarily from a whole sentence, but perhaps from a component such as a prepositional phrase or clause). Still, there is no reliance on rejecting that literal meaning before proceeding to metaphorical processing.

Before proceeding further in this description of ATT-Meta we must also explain that its reasoning is entirely query-directed. Query-directed reasoning – more usually called goal-directed reasoning – is a powerful technique much used in AI (see e.g. Russell & Norvig, 2002). In this form of reasoning, the process of reasoning starts with a query – an externally supplied or internally arising question as to whether something holds. Queries are compared to known propositions and/or used to generate further queries by some means. In a rule-based system such as ATT-Meta, queries are compared to the result parts of rules, and then new queries arise from the condition parts. For example, in the case of a rule that says if someone is a student then he or she is presumably poor, a query as to whether John is poor would give rise to a subquery as to whether John is a student.
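The student example can be written out as a minimal backward-chaining sketch. This is a generic illustration of query-directed reasoning, not ATT-Meta’s rule engine: the one-condition rule format, the names `RULES`, `FACTS` and `holds`, and the depth limit are assumptions made for brevity, and the uncertainty implied by “presumably” is ignored.

```python
# Minimal query-directed (backward-chaining) sketch; illustrative only.
# A rule (condition, result) reads: if X is <condition> then X is <result>.
RULES = [("student", "poor")]

# Known propositions, as (predicate, individual) pairs.
FACTS = {("student", "John")}


def holds(query, depth=0, max_depth=10):
    """Answer a query by matching it to facts or to the result parts of rules."""
    if query in FACTS:
        return True
    if depth >= max_depth:  # guard against circular rule sets
        return False
    predicate, individual = query
    for condition, result in RULES:
        if result == predicate:
            # The condition part of the matching rule becomes a subquery.
            if holds((condition, individual), depth + 1, max_depth):
                return True
    return False


print(holds(("poor", "John")))
```

Reasoning here starts from the question asked and works backwards, so rules irrelevant to the query are never touched.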

The system’s metaphor-based reasoning is thoroughly integrated into a general-purpose rule-based framework for uncertain reasoning using qualitative uncertainty measures. ATT-Meta’s reasoning both in source-domain terms and in target-domain terms is generally uncertain. Rules and propositions are annotated with qualitative certainty levels. There is a heuristic conflict-resolution mechanism that attempts to adjudicate between conflicting lines of reasoning, by considering their relative specificity.

We are now ready to look in more detail at an example. Consider:

“In the far reaches of her mind, Anne believed that Kyle was having an affair.”

This is slightly adapted from a real-discourse example (Gross, 1994). We assume ATT-Meta is given knowledge of conceptual metaphors MIND AS PHYSICAL SPACE and IDEAS AS PHYSICAL OBJECTS. We also assume that “far reaches” only has a spatial sense for the system and that the notion is not mapped to the mental domain by any conceptual metaphor known to the system. The most important mapping known to ATT-Meta is the following, and is part of ATT-Meta’s knowledge of IDEAS AS PHYSICAL OBJECTS:

degree of (in)ability of an agent’s conscious self to operate physically on an idea that is a physical object, in the pretence cocoon, corresponds to degree of (in)ability of the agent to operate in a conscious mental way on the idea, in the reality space.

A given metaphorical mapping link such as this is implicit in a set of transfer rules that we will not detail here.

In the example as we run it using ATT-Meta, the system is given an initial target-domain query (IQ) that is, roughly speaking, of the form To what exact degree is Anne able to consciously operate mentally on the idea that Kyle had an affair? In Barnden and Lee (2001) we justify this as a reasonable query that could arise out of the surrounding context. The query is reverse-transferred from target terms to source terms via the above mapping to become a query of the form To what degree is Anne’s conscious self able to operate physically on the idea?

ATT-Meta can then reason that that degree of physical operability is very low, using the source-domain information gleaned from the mention of “far reaches” in the utterance and from common-sense knowledge about physical spaces and objects. Once this very low degree is established in the source domain, it is forward-transferred via the mapping to give a very low degree of conscious mental operability as the answer to the initial query (IQ). The program’s reasoning for this example is treated in more detail in Barnden and Lee (2001). A variety of other examples are also computationally treated in that report and Barnden (2001c), Barnden et al. (2002), and Lee and Barnden (2001b).
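The two-way use of the mapping in this example (reverse transfer of the query, then forward transfer of the answer) can be sketched as follows. The code is a toy illustration, not ATT-Meta’s transfer rules: the relation names, the table `OPERABILITY_BY_LOCATION`, and the “centre” entry are hypothetical stand-ins for the common-sense spatial reasoning the system actually performs.

```python
# Illustrative sketch only; names and tables are hypothetical.

# Pairs a reality-side relation with its pretence-side counterpart.
MAPPING = {"mentally-operate-on": "physically-operate-on"}

# Stand-in for common-sense knowledge about physical spaces and objects.
OPERABILITY_BY_LOCATION = {"far reaches": "very low", "centre": "high"}


def reverse_transfer(target_query):
    """Recast a target-domain query in source-domain (pretence) terms."""
    relation, agent, idea = target_query
    return (MAPPING[relation], f"{agent}'s conscious self", idea)


def source_reasoning(source_query, idea_location):
    """Within-pretence reasoning: degree of physical operability on the idea."""
    relation, _agent, _idea = source_query
    if relation != "physically-operate-on":
        raise ValueError("no source-domain knowledge for this relation")
    return OPERABILITY_BY_LOCATION[idea_location]


def forward_transfer(source_degree):
    """The mapping carries the qualitative degree over unchanged."""
    return source_degree


def answer(target_query, idea_location):
    source_query = reverse_transfer(target_query)
    degree = source_reasoning(source_query, idea_location)
    return forward_transfer(degree)


IQ = ("mentally-operate-on", "Anne", "Kyle had an affair")
print(answer(IQ, "far reaches"))
```

The same mapping serves in both directions: once to pose the question in source terms, and once to bring the answer back to the target.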

We must note a largely unimplemented aspect of the ATT-Meta approach: “view-neutral mapping adjuncts” (VNMAs) (Barnden & Lee, 2001; Barnden et al., 2003). With partial inspiration from Carbonell’s (1982) AI work on metaphor, we view certain aspects of source-domain information such as attitudes, value judgments, beliefs, functions, rates, gradedness, uncertainty and event structure as carrying over to the target domain by default (the results can be overridden). For instance:

• We assume that the ordering of events and their qualitative rates and durations carry over by default, whatever the nature of the particular metaphorical mapping being used, thus avoiding the need for individual mapping rules to deal with them.
• If an agent A in the pretence has an attitude X (mental or emotional) to a proposition P, and A and P correspond, respectively, to an agent B and a proposition Q in reality, then B has attitude X to Q.
• As for gradedness, if a property P in a pretence corresponds to a property Q in reality, then a degree of holding of P should map to the same degree of holding of Q (unless there is additional evidence about Q).

We have produced an experimental implementation that handles rates and durations as VNMAs, but much work remains to be done on other VNMAs. In particular, gradedness is currently handled directly in individual rules – notice the degrees in the metaphorical correspondence used above. In place of this handling, we would like to have simpler mapping rules that do not mention degree, relying on a separate, general mechanism for the degree transfer.
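The desired division of labour, with degree-free mapping rules plus one general default for degrees, might look roughly like this. The sketch is speculative and does not describe the experimental implementation: `PROPERTY_MAP`, `transfer_degree` and `transfer` are hypothetical names, and “additional evidence about Q” is modelled crudely as an optional override value.

```python
# Speculative sketch of a gradedness VNMA; not the ATT-Meta implementation.

# Hypothetical mapping rules that mention properties only, never degrees.
PROPERTY_MAP = {"physically-operable": "mentally-operable"}


def transfer_degree(pretence_degree, reality_evidence=None):
    """General default: the degree carries over unless reality overrides it."""
    return pretence_degree if reality_evidence is None else reality_evidence


def transfer(pretence_fact, reality_evidence=None):
    """Map a (property, degree) pair from the pretence to reality."""
    prop, degree = pretence_fact
    return (PROPERTY_MAP[prop], transfer_degree(degree, reality_evidence))


print(transfer(("physically-operable", "very low")))
print(transfer(("physically-operable", "very low"), reality_evidence="medium"))
```

Because the degree logic lives in one place, adding a new property mapping requires no degree-specific rule.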

Finally, the ATT-Meta approach does not rely on domain distinctions, even theoretically, let alone enshrine them in some way in the implemented system. Although in this article we generally adopt the common practice of saying that metaphor transfers information from a source domain to a target domain, the ATT-Meta approach has a different stance: metaphor is a matter of transferring from a pretence to reality (or to a surrounding pretence, in the case of serial mixing). Notice that in the mapping rule set out above, reference is made to pretence and reality, not to domains. It does not matter what domains the information used in the pretence comes from, and this means that it does not matter how we may intuitively circumscribe the source and target domains in the metaphor. In particular, it does not matter how close, difficult to distinguish, or overlapping those domains are. In practice, it will often be the case that we can theoretically identify a source domain in which the direct meaning of the sentence lies, and that inferences from this meaning also lie within that domain. However, this has no bearing on the course of processing, and the reasoning within the pretence is not limited by any consideration of domains.

Narayanan

Srini Narayanan has implemented a metaphor-understanding system (Narayanan, 1997, 1999) that has mostly been applied to interpreting metaphorical statements about economic policy, where the source domain is that of everyday physical movement activities such as walking, as in the headline “Liberalization plan stumbling.” However, it would appear reasonably straightforward to apply a modified version of the system to other source and target domains, and Narayanan (1999) mentions using a health-based source domain.

The system has been applied to many utterances about economics from newspaper articles, and has powerful facilities for addressing subtle aspects of such utterances. However, much as in the case of ATT-Meta, the system does not take sentences as such as input, but rather simple feature-value representations that could result from initial processing of sentences or other discourse fragments. The system is based on knowing a set of conceptual metaphor maps such as ACTING IS MOVING, OBSTACLES ARE DIFFICULTIES, and FAILING IS FALLING.

Examples of fragments successfully handled include “Liberalization plan stumbling,” “European Giant falls sick,” “taking a cautious step in the right direction” and “Economic reform is like crossing a river by feeling for the stones.” Narayanan is especially concerned to deal with aspect, i.e. the internal temporal structure of events. The system can deal with, for instance, the intermittent nature of an action such as rubbing, the aspect conveyed by the perfect tense, and aspect conveyed in phrases such as “start to pull out,” “on the verge of” and “back on track.”

Both the source domain and the target domain are represented as fixed network structures, of rather different types. The target domain representation is a “belief network” (Pearl, 1986), in which nodes stand for economic variables needed for depicting the economic situations of interest. The variables include economic actors (example value: Indian Government), economic policy (example value: capitalism), status of a policy, gross domestic product, geographical location, rate of progress, level of difficulty (e.g. of implementing a policy), and goals of actors. Each node is repeated across a small sequence of time slices (up to four), so that for instance there is a policy node for time 1, a policy node for time 2, and so on. Nodes are linked together to represent probabilistic relationships between variables. For instance, the links allow the conditional probability of policy being such-and-such at time 2, given that it is so-and-so at time 1 and a policy failure happens at time 1, to be represented. When the belief network is used for inference, particular probability values at nodes are fixed on the basis of input and metaphorical transfer, and then the links cause posterior probabilities for particular variable values at nodes to be calculated. In this way, the network can probabilistically model a complex unfolding economic situation.
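A two-slice fragment of such a network can be worked through by hand. The sketch below is a generic illustration of time-sliced belief-network inference, not Narayanan’s model: all probability values are invented, the policy values `liberalization` and `protectionism` are placeholders, and the policy and failure nodes at time 1 are assumed independent for simplicity.

```python
# Illustrative two-slice belief-network fragment; all numbers are invented.
from itertools import product

# Hypothetical beliefs at time 1.
P_POLICY_1 = {"liberalization": 0.8, "protectionism": 0.2}
P_FAILURE_1 = {True: 0.3, False: 0.7}

# Hypothetical conditional table:
# P(policy at time 2 = liberalization | policy at time 1, failure at time 1).
P_LIB_2 = {
    ("liberalization", True): 0.4,
    ("liberalization", False): 0.9,
    ("protectionism", True): 0.5,
    ("protectionism", False): 0.1,
}


def prob_liberalization_at_2(p_policy_1=P_POLICY_1, p_failure_1=P_FAILURE_1):
    """Marginalize over the time-1 nodes to get the time-2 belief."""
    return sum(
        p_policy_1[policy] * p_failure_1[failure] * P_LIB_2[(policy, failure)]
        for policy, failure in product(p_policy_1, p_failure_1)
    )


# Belief before any input is fixed.
print(prob_liberalization_at_2())

# Fixing the failure node, e.g. after a metaphorical transfer from "falling".
print(prob_liberalization_at_2(p_failure_1={True: 1.0, False: 0.0}))
```

Fixing the failure node at time 1 lowers the computed belief that the same policy persists at time 2, which is the kind of posterior update the links are there to produce.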

The source-domain representation is, roughly speaking, a type of marker passing network in which (the main type of) nodes represent states that can occur in activities such as walking, falling and getting up. Links between these nodes show how states can (stochastically) be caused by predecessor states, and markers passing along these links simulate the progress of activities.

The state nodes in the source domain include a subset that serve as the inputs to the system’s metaphorical maps. For instance, the DIFFICULTIES ARE OBSTACLES map responds to the presence of a marker in the bump node in the source-domain network and contributes to the setting of the probability level at the difficulty node in the target network. One type of map, “parameter” maps, handles gradedness. For instance, velocity in the source domain is mapped to rate of progress of a policy in the target domain, or distance travelled in walking to degree of completion of an economic plan.

The processing within the source-domain network allows rich examples of map-transcendence to be handled. For instance, consider any discourse fragment that mentions an economic policy approaching a cliff edge. Recall that falling maps over to failing. Provided that the source-domain network has the right structures to predict falling from walking to the cliff’s edge, the system can infer the target domain conclusion that the economic policy will fail.
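The cliff-edge inference can be mimicked with a toy marker-passing simulation. The sketch is deliberately simplified and is not Narayanan’s network: the state graph `NEXT_STATES` is invented, transitions are deterministic rather than stochastic, and the map fires on the mere presence of a marker instead of setting a probability in a belief network.

```python
# Toy marker-passing sketch; deterministic, whereas the real links are stochastic.

# Hypothetical source-domain state graph: state -> possible successor states.
NEXT_STATES = {
    "approaching-cliff-edge": ["at-cliff-edge"],
    "at-cliff-edge": ["falling"],
    "falling": [],
    "strolling-on-flat-ground": ["strolling-on-flat-ground"],
}

# Known map from a source-domain state to a target-domain conclusion.
MAPS = {"falling": "failing"}


def simulate(start, max_steps=5):
    """Pass markers along links from the start state; return all marked states."""
    marked = {start}
    frontier = [start]
    for _ in range(max_steps):
        frontier = [
            successor
            for state in frontier
            for successor in NEXT_STATES.get(state, [])
            if successor not in marked
        ]
        if not frontier:
            break
        marked.update(frontier)
    return marked


def target_conclusions(marked):
    """Fire every map whose input node carries a marker."""
    return {MAPS[state] for state in marked if state in MAPS}


print(target_conclusions(simulate("approaching-cliff-edge")))
```

No mapping for “cliff edge” is needed: the simulation reaches `falling`, and the single known map does the rest.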

Clearly, the system makes strong use of source-domain inference, if we regard the mental simulation of activities within the source domain as inference. Furthermore, it is uncertain inference, because of the stochastic nature of marker passing between state nodes. It is clear also from the above that the system places great weight on gradedness.

As for the role of literal meaning, consider the sentence “Economic reform is like crossing a river by feeling for the stones.” This will be input to the system in the form of a setting of the source-domain network that depicts a fictional entity, corresponding to economic reform, crossing a river, and so on. In this sense, the system constructs a whole literal interpretation first. However, the system does not itself evaluate whether economic reform can itself cross a river, so, as with Hobbs’s approach, MIDAS and ATT-Meta, there is no sense in which the system itself rejects a literal meaning before computing a metaphorical one.

The system is, clearly, strongly founded on domain distinctions, which are explicit in the structure of the system. Given the intuitive, qualitative distance between economics and bodily movement, this might not, superficially, appear to be a problem. However, various types of extension or enrichment of the system could soon run into problems. For one thing, mental processes are important both for physical activities in the world (e.g. reasoning about what to do at a crossroads) and in the economic domain, and this is already weakly evident in Narayanan’s work. A more detailed treatment of mental processing in the two domains would require separate and differently organized network structures to handle mental states, whereas intuitively the two domains simply overlap on the matter of mental processes, which themselves could just as much be viewed as forming a domain.

Veale: The Sapper System

Tony Veale (Veale, 1998; Veale & Keene, 1997) has constructed Sapper, an implemented hybrid symbolic/connectionist model for finding structural analogies. It is based on a semantic network framework in which nodes stand for concepts and between which activation values can flow. The work on Sapper appears to be largely separate from Veale’s work on a “conceptual scaffolding” theory of metaphor (Veale & Keene, 1992).

Sapper does not take linguistic input as such, but rather attempts to find a metaphorical mapping between any two concepts S and T in its network that are from different domains, for instance composer and [military] general. In this example, the system comes up with a rich metaphorical mapping, involving component correspondences such as orchestra corresponds to army, musician to soldier and musical-instrument to musket. In this way it is similar in orientation to analogy-finding systems in Cognitive Psychology, such as SME and ACME. Indeed, Veale has shown in much detail, both theoretical and experimental, that his system can find analogies similar to those found by SME and ACME, while performing less processing.

Sapper has a long-term “bridge”-forming aspect and a short-term structure-matching aspect. The former is done in advance of any analogy-finding, and finds potential analogical correspondences between concepts. It does so by means of purely symbolic processing over the semantic network, based on certain simple heuristics (a “Triangulation” rule and a “Squaring” rule). Such a potential correspondence is called a bridge and is implemented as a special link between the nodes.

Analogy-finding per se in a particular case, such as for composer and general, consists of the short-term structure-matching aspect. This aspect exploits the long-term bridges via activation-spread in a way to be described shortly, and thereby constructs overall, coherent mappings containing component correspondences such as orchestra to army in the example above.

Structure-matching works in outline as follows, given two nodes S and T, thought of as the source and target nodes respectively. Activation is sent out from S and T, to a prespecified distance (“horizon”) in the network. If the two waves of activation meet at a bridge between two nodes S′ and T′, respectively, then the system sees if there is a chain of links from S to S′ that is isomorphic to a chain of links from T to T′. That is, the two chains consist of links of the same types in the same directions. Then for each pair of corresponding nodes on the chains the system considers them to be mapped to each other, and takes the overall mapping thus defined by the chains to be a partial interpretation of the T-is-S metaphor. Now the system takes the “richest” partial interpretation found by this method, and considers the remaining ones in descending order of richness, attempting to combine them consistently with the richest one. The final result is Sapper’s overall metaphorical interpretation of T-is-S.
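The chain-matching step can be sketched in a few lines. This is a rough illustration rather than Sapper itself: the network `EDGES`, the link types `controls` and `part`, and the single bridge are invented for the composer/general example, activation spreading is replaced by plain enumeration of chains up to the horizon, links are followed in one direction only, and the final step of merging partial interpretations is omitted.

```python
# Rough sketch of isomorphic-chain matching; not Veale's implementation.

# Hypothetical semantic network: node -> [(link_type, neighbour), ...].
EDGES = {
    "composer": [("controls", "orchestra")],
    "orchestra": [("part", "musician")],
    "general": [("controls", "army")],
    "army": [("part", "soldier")],
}

# Hypothetical bridge formed in advance by the long-term process.
BRIDGES = {("musician", "soldier")}


def chains(node, horizon):
    """All link chains from node, up to `horizon` links, as (types, nodes)."""
    results = [((), (node,))]
    frontier = list(results)
    for _ in range(horizon):
        extended = []
        for types, nodes in frontier:
            for link_type, neighbour in EDGES.get(nodes[-1], []):
                if neighbour not in nodes:  # avoid cycles
                    extended.append((types + (link_type,), nodes + (neighbour,)))
        results.extend(extended)
        frontier = extended
    return results


def partial_interpretations(source, target, horizon=3):
    """Pair up same-typed chains that end at a bridge; richest first."""
    found = []
    for s_types, s_nodes in chains(source, horizon):
        for t_types, t_nodes in chains(target, horizon):
            if s_types == t_types and (s_nodes[-1], t_nodes[-1]) in BRIDGES:
                found.append(list(zip(s_nodes, t_nodes)))
    return sorted(found, key=len, reverse=True)


print(partial_interpretations("composer", "general"))
```

Here the one bridge between `musician` and `soldier` is enough to yield the intermediate correspondence of `orchestra` to `army`, because the two chains leading to the bridge have the same link types.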

The theory behind Sapper places important, explicit weight on domains, and domain distinctions are used in the structure-matching process. A domain in Sapper is relative to a given “root” node. The domain for the node is the region of the semantic network that is reachable from the node via network links in a particular way. However, Veale does not appear to address the difficulties that arise with source and target domains that intuitively overlap, which would require that activation flow during structure-matching not be domain-confined as he assumes it to be. For instance, drums are used in bands in armies, not just in ordinary orchestras.

It appears that the processing in Sap-per is entirely symmetrical between sourceand target, so that for instance “a com-poser is a general” creates the same the samemetaphorical correspondences as “a generalis a composer.” This may look as thoughit goes against claims in the metaphor lit-erature (e.g. Ortony, 1979, 197) about theasymmetry of metaphor. However, it is notdifficult to bias the processing in Sapper inways that would asymmetrically affect theactivation flow and thus ensure asymmet-rical results. Also, Barnden (2001d) arguesthat asymmetry is a more subtle and delicatematter than it is usually portrayed as being;for example, the true asymmetry betweenS-is-T and T-is-S can reside in which par-ticular mapping links are used in interactingwith the overall discourse rather than withwhether the links themselves differ betweenS-is-T and T-is-S. Indeed, on his websiteVeale describes how Sapper does structuraltransfer, in a way roughly similar to otheranalogy systems. Structure on the source sidethat is not paralleled on the target side canbe transferred as “candidate inferences” tothe target side. Structural transfer fromsource to target involves different pieces ofdomain information from those involved intransfer from target to source, even when thesame metaphorical linkages are involved.

Sapper could be said to perform source-domain inference in using activation flow within that domain. Activation levels represent gradedness, for instance the degree to which the property denoted by the node holds. The levels therefore do not represent degrees of certainty, as they do in many connectionist systems.

Discussion: Contributions of AI to Metaphor Research

Here we examine some specific issues on which AI is being helpful to metaphor research. We will draw heavily on the preceding review of particular AI approaches, but will also make additional observations.

Mundaneness

Non-AI research such as that of Lakoff (Lakoff, 1993; Lakoff & Johnson, 1980) and of many researchers in Corpus Linguistics and Applied Linguistics has shown us that metaphor is an aspect of ordinary, everyday language, not just of literary or other heightened forms. AI is in a peculiar position to add both to the appreciation of the variety and complexity of metaphor as it arises in practical discourse and to the question of how to process real metaphor in practical contexts, because of the inclusion within AI of applications-oriented research. One of the AI systems reviewed above (MIDAS, by James Martin) concentrated on metaphor arising in question-and-answer sessions between users and an automated Unix help system. Narayanan’s research used the domain of economics as an application area. A research project led by the present author, not reviewed above but drawing upon the ATT-Meta research, is looking at the metaphorical expression of affect (emotion, value judgments, etc.) in the context of an e-drama system that supports virtual dramatic improvisation by users sitting at computer terminals (Zhang, Barnden, & Hendley, 2005). Improvisations can be on any topic, but the system has in particular been used for improvisations concerning school bullying and embarrassing illnesses.

Non-Assertional Metaphor

One consequence of looking at applications is as follows. In describing MIDAS we cited a metaphorical question as an example – “How do I get into Emacs?” It is remarkable, though not generally remarked upon, that the vast bulk of writing on metaphor has concentrated on assertions. Yet, metaphor is just as appropriate in questions, commands, and so on, as it is in assertions, and often occurs in non-assertions in real discourse. Non-assertional metaphor raises special issues. Questions and commands are not centrally about conveying new information about the target or making the understander appreciate the target in a special way, yet existing theorizing on the meanings or connotations of metaphorical utterances presupposes that some new information or special view of the target is being communicated. In particular, whereas with an assertional metaphorical utterance an incompatibility between one potential interpretation and the target domain may indicate that the interpretation is incorrect, in the case of a metaphorical question the incompatibility may mean simply that a negative response is needed or the speaker has an incorrect supposition about the target domain, so that an answer could be directed at countering this. It could turn out that particular existing theories based on assertion could be smoothly generalized to deal with non-assertional metaphor, but the issue needs at least to be explicitly addressed.

Details of Mappings

Much work on metaphor outside AI has specified particular mappings between sources and targets. The mappings are often backed up by discursive accounts of how they could help in the understanding of particular example utterances or types of utterance. However, without their being embedded in a detailed computational system it is difficult to determine whether, on the one hand, the mappings really do achieve all the effects they are credited with, and whether, on the other hand, they successfully avoid interacting to produce unwanted side-effects. In other words, mappings proposed in non-AI literature on metaphor are typically only vaguely evaluated as to coverage, coherence and effectiveness. In contrast, systems such as MIDAS and ATT-Meta provide a framework within which to do extensive experimentation with alternative sets of mappings.

Source-Domain Reasoning and Pretence Reasoning

Several of the reviewed AI systems (by Hobbs, Martin, Narayanan) make crucial use of online source-domain inference: inference that is in terms of the source-domain subject matter and that is made at the time of trying to understand a metaphorical utterance. Source-domain reasoning was also briefly advocated in the seminal work of Carbonell (1982) on metaphor in AI. The ATT-Meta system is centred on the closely related notion of within-pretence reasoning.

Now, source-domain inference has arisen quite frequently in the non-AI literature. For example, comments in Lakoff (1993) and Lakoff and Turner (1989, 62, 64, 94) suggest the use of source-domain inference. The discussion of metaphorical inference patterns in Turner (1987) appears to allow for online source-domain inference. The work of Ruiz de Mendoza Ibanez (1999) on interactions between metonymy and metaphor includes mention of metonymy occurring within the source domain of a metaphor, and this amounts to a type of online source-domain inference. As for online within-pretence reasoning, Levin’s (1988) work on metaphor in literature implies the use of it, and van Dijk (1980) provides a tentative account of metaphor in terms of counterfactuals. The “blending” (“conceptual integration”) approach in Cognitive Linguistics (Fauconnier & Turner, 1998), when applied to metaphor, makes inference within the blend-space central. A blend-space is similar to a pretence cocoon in ATT-Meta, though the latter concept is more computationally specific while being unconstrained by notions of domain.

But the study of source-domain reasoning and within-pretence reasoning in AI research on metaphor has given flesh to and clarified the somewhat schematic and limited discussion of such reasoning in the non-AI literature. What AI can distinctively contribute is detailed, effective mechanisms for performing it. Complex technical matters of representation, reasoning and evidence-comparison are involved here, especially when uncertainty and gradedness are brought in.

The reason for the intense attention to source-domain and within-pretence reasoning in AI may be that, in concentrating on real examples of metaphor in mundane contexts, the researchers concerned have been affected by the fact that truly novel metaphor is far from predominant in real discourse, and have concentrated on the rich, open-ended exploitation of already-known mappings. Source-domain or within-pretence inference enables the map-transcending aspects of the utterance – the aspects not directly handled by known mappings – to be linked to the aspects that are so handled. Map-transcendence is a central problem of metaphor that has not been adequately treated, although the Hobbs, Martin, Narayanan, and Barnden approaches are important developments.

Economizing on Parallelism, and Use of Disanalogies

Hobbs, Narayanan, and Barnden all recognize that much or all of what one needs to get out of a map-transcending metaphorical utterance can often or perhaps usually be got without finding target-domain correspondents for the map-transcending items. This stance is against the idea that the fundamental task in metaphor understanding is to establish new mappings and, indeed, to establish as much parallelism as possible between the two domains. Rather, the three approaches seek to exploit as far as possible the already known mappings. In particular, Barnden, Helmreich, Iverson, and Stein (1996) explicitly championed the thesis that it is often misguided to think that map-transcending source-domain elements should be expected to have a parallel in the target, let alone to think that it is profitable to look for it. For example, it seems excessive to expect the “dim recesses” mentioned in “The idea was in the dim recesses of Tony’s mind” to actually correspond to any identifiable components of the mind in reality, rather than serving merely to connote physical inaccessibility within the metaphorical pretence. On the other hand, there are certainly situations where one needs to find some target-domain correspondents. The question of which these situations are is an outstanding research issue, on which a start is made in Barnden and Lee (2001).

Relatedly, the benefits of attending to disanalogies between source and target in metaphor deserve more study. Fass’s system (meta5) is unusual, and unique among the systems reviewed, in regarding disanalogies between source and target as a source of useful information.

Dissolving Metaphorical Transfers into the Overall Processing

The Hobbs and Barnden approaches achieve great flexibility in allowing target-domain (or within-reality) reasoning steps, source-domain (or within-pretence) reasoning steps, and metaphorical transfer steps to be arbitrarily mixed together in a completely uniform and task-dependent way. This flexibility is a contribution to conceptions of how the different types of processing in metaphor can fit together. Most discussions of metaphor appear to assume that transfer steps occur in some special phase of processing.

The flexibility of mixing is aided by casting mappings as inference rules that are applied in just the same way as other rules. Usually in metaphor research, whether in AI or elsewhere, mappings are a different sort of entity, which inhibits even the realization that a uniform treatment would be liberating and beneficial.
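The idea of treating a mapping as just another inference rule can be illustrated with a minimal sketch. It is not the mechanism of Hobbs's system or of ATT-Meta: the `Rule` type, the `forward_chain` function, the fact strings and the three rules are all hypothetical, and the "pretence:" and "reality:" prefixes are only labels. The point is simply that the loop applies the mapping in exactly the same way as the source-domain and target-domain rules, with no special transfer phase.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    name: str
    premises: tuple      # facts that must all be known
    conclusion: str      # fact added when they are

# Hypothetical rules for "The idea was in the dim recesses of Tony's mind".
RULES = [
    # ordinary source-domain (within-pretence) rule
    Rule("recesses are hard to reach",
         ("pretence: idea is in dim recesses",),
         "pretence: idea is physically hard to reach"),
    # a metaphorical mapping, written as just another rule
    Rule("MAPPING physical access -> mental access",
         ("pretence: idea is physically hard to reach",),
         "reality: Tony can hardly use the idea consciously"),
    # ordinary target-domain (within-reality) rule
    Rule("unusable ideas stay unspoken",
         ("reality: Tony can hardly use the idea consciously",),
         "reality: Tony is unlikely to mention the idea"),
]

def forward_chain(facts, rules):
    """Apply every rule in the same way until nothing new is derived.

    Returns the set of known facts and the names of the rules fired, in order.
    """
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conclusion not in known and all(p in known for p in rule.premises):
                known.add(rule.conclusion)
                fired.append(rule.name)
                changed = True
    return known, fired

known, fired = forward_chain({"pretence: idea is in dim recesses"}, RULES)
for name in fired:
    print(name)
```

In this toy run a source-domain step, a transfer step and a target-domain step are chained in one pass; reordering the list or adding further rules of any kind would not require any change to the loop.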

Context and Extent

It is often pointed out that the information conveyed by a metaphorical utterance can be highly sensitive to context, and a considerable amount of psychological experimentation and philosophical theorizing has addressed this (e.g. Giora, 1997; Leezenberg, 1995; Stern, 2000). Context is important for the understanding of much non-metaphorical language as well, but metaphor heightens its effect.

The sentence “Mike is a rock” is highly indeterminate as to what it might convey, absent any specific context. Perhaps the speaker is intending to convey that Mike can be relied upon. However, in “Mike’s friends are very upset by criticism, but he’s a rock” the contribution of “rock” is much more definite. It is probably not getting at Mike’s reliability: the sentence is arguably saying that Mike is highly tolerant of criticism, and if so it is presumably exploiting a correspondence between invulnerability of rocks to physical assault and tolerance of criticism by people.

In this example the disambiguating context about Mike’s friends and criticism is near to the metaphorical clause, but in other cases the necessary contextual information might arise from further afield, and might have to be derived from the surrounding passage or other information by subtle or knowledge-intensive processes of inference. Thus, a full approach to metaphor must deal with possibly complex, extensive passages of discourse, and complex inference.

Although AI work on metaphor has yet to address context fully, some of the systems reviewed above give context a crucial guiding role and are at least in a position to accommodate its effects smoothly. Hobbs and Barnden place much weight on reasoning goals derived from context as a crucial driver of what metaphorical interpretations are drawn, and their approaches are unusual amongst detailed metaphor-processing schemes in this respect. Contextual-goal drivenness is a powerful tool not only against the often-noted indeterminacy of metaphorical meaning (see e.g. Stern, 2000) but also against the problem of inappropriate or irrelevant aspects of the source domain getting in the way (such as the shape of a pig’s tail when classifying a person as a pig). In a contextual-goal driven approach, those irrelevant aspects will simply tend not to be queried.
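Goal drivenness of this kind can be sketched as backward chaining from a query supplied by context. The sketch below is an illustration under stated assumptions, not the procedure of Hobbs's or Barnden's systems: the `RULES` table, the `FACTS` set, the `prove` function and the "sinks in water" rule are hypothetical, and the example reuses the “Mike is a rock” case with the contextual question of whether Mike tolerates criticism. The `queried` list records every goal the reasoner actually examines.

```python
# Hypothetical knowledge: each goal maps to alternative lists of premises
# that would establish it.
RULES = {
    # mapping used backwards: a target-domain goal becomes a source-domain goal
    "reality: Mike tolerates criticism": [["pretence: Mike resists physical assault"]],
    # source-domain rule relevant to the contextual goal
    "pretence: Mike resists physical assault": [["pretence: Mike is a rock"]],
    # source-domain rule that is irrelevant to the contextual goal
    "pretence: Mike sinks in water": [["pretence: Mike is a rock"]],
}
FACTS = {"pretence: Mike is a rock"}

def prove(goal, queried):
    """Backward-chain on `goal`, appending each examined goal to `queried`."""
    queried.append(goal)
    if goal in FACTS:
        return True
    return any(all(prove(premise, queried) for premise in premises)
               for premises in RULES.get(goal, []))

queried = []
answer = prove("reality: Mike tolerates criticism", queried)
print(answer)
print(queried)
```

Because reasoning starts from the contextual goal, the rule about sinking in water is never consulted, which is the sense in which irrelevant source-domain aspects "tend not to be queried". Note also that using the mapping backwards turns a target-domain query into a source-domain query.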

As we have made clear, many authors outside AI have discussed the importance of context. What AI can contribute is detailed, computationally tested mechanisms by which it can be brought to bear.

Uncertainty

The information gained from metaphor is generally uncertain. The indeterminateness of the import of “Mike is a rock” without a sufficiently specific context is itself a type of uncertainty. But even with the context shown above, we cannot be certain that Mike is tolerant of criticism (according to the speaker). Perhaps, after all, the speaker is intending to convey that Mike can be relied upon to give support to his colleagues when they are upset by criticism.

But even if an interpretation in terms of Mike’s tolerance of criticism is correct, we cannot be certain about the degree of tolerance: perhaps the speaker is merely trying to say that Mike has a normal level of tolerance, in contrast to his colleagues’ marked lack of tolerance. After all, different types of rock have different degrees of vulnerability to physical assault, and, without further information, it can merely be a presumption that a rock has a high degree of invulnerability.

Therefore, as well as the uncertainty arising from there being qualitatively different possible interpretations (e.g. one appealing to reliability and one appealing to tolerance), there is also uncertainty arising from within the source domain itself. Another example of the latter phenomenon would arise from talking about someone “burying” an idea in his mind. In the physical world, once something is buried, at best it only normally stays buried. There can therefore be no certainty that the idea will not “pop up” again.

Most work on metaphor outside AI sidesteps detailed considerations of uncertainty, although systems such as SME and ACME, where there are scoring mechanisms, do provide some support for a restricted type of uncertainty handling. Amongst our reviewed AI systems, Narayanan, Hobbs, and Barnden all allow the system’s source-domain reasoning and target-domain reasoning to be uncertain. Uncertainty is important for making the overall processing do justice to people’s use of metaphor, but unfortunately greatly complicates the technical nature of the computational framework.

Source/Target Overrides

The uncertainty issue also reveals the importance of often allowing information transferred from source to target to override information about the target. This possibility is under-studied in metaphor research, because usually the information about a target domain is cast simplistically in the form of certainties which cannot be overridden. This practice has led to researchers, outside AI and within, almost exclusively concentrating on the fact that target-domain information must sometimes override what comes from the source. Of course, this is indeed appropriate in many cases: since it is certain that France and Germany are not cognitive agents and are therefore incapable of love, metaphorically casting the relationship of those countries within the EU as a “marriage” (Musolff, 2004) should not lead to the result that they love each other in reality.

But if a piece of target-domain knowledge is not certain, but let us say merely a default, there is no reason in principle why the information should not be overridden by transfers from the source. Thus, “SnakeByte Technologies nursed its competitor RabbitWare Inc. back to health” would override the default that competing companies do not normally deliberately help each other. The utterance “In the far reaches of her mind, Anne believed that Kyle had been unfaithful” defeats the normal presumption that people’s thoughts about their spouses’ possible affairs are central and conscious ones. It may even be that one important function of metaphor is to convey situations that are exceptions to target-domain defaults. The exception-expressing function of metaphor may be especially significant given that exceptional situations are less likely to be easily expressible using the resources native to the target domain.

It appears that only in the context of the ATT-Meta system has the process of source-over-target overriding been studied in computational detail, though see Indurkhya (1992, 85–86) for other comments on the importance of such overriding. In ATT-Meta, both directions of override are possible, depending on the fine detail of the reasoning lines involved in particular cases.
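The two directions of override can be illustrated with a deliberately crude sketch. It is not ATT-Meta's conflict-resolution procedure, which depends on the detail of the reasoning lines involved: the three-level certainty scale, the `Claim` type and the `resolve` function are hypothetical, and the tie-breaking rule (a transferred claim beats a target-domain default of equal strength, on the assumption that it is more specific to the situation described) is an assumption made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical ordinal scale of certainty.
LEVELS = {"suggested": 1, "presumed": 2, "certain": 3}

@dataclass(frozen=True)
class Claim:
    proposition: str
    holds: bool
    level: str    # a key of LEVELS
    origin: str   # "target" knowledge, or "transfer" from the source

def resolve(target: Claim, transfer: Claim) -> Claim:
    """Choose between conflicting claims about the same proposition."""
    if LEVELS[target.level] != LEVELS[transfer.level]:
        # The more certain claim wins, whichever side it comes from.
        return max(target, transfer, key=lambda c: LEVELS[c.level])
    # Assumption: on a tie the transferred claim wins, being specific
    # to the situation the utterance describes.
    return transfer

# Target-over-source: countries certainly cannot love each other.
love = resolve(
    Claim("France and Germany love each other", False, "certain", "target"),
    Claim("France and Germany love each other", True, "presumed", "transfer"),
)

# Source-over-target: "competitors do not help each other" is only a default.
help_ = resolve(
    Claim("SnakeByte deliberately helped RabbitWare", False, "presumed", "target"),
    Claim("SnakeByte deliberately helped RabbitWare", True, "presumed", "transfer"),
)

print(love.origin, love.holds)
print(help_.origin, help_.holds)
```

The first case keeps the target-domain certainty and discards the transferred conclusion; the second lets the transferred conclusion defeat the target-domain default, which is only possible because the default was not recorded as a certainty in the first place.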

Gradedness

The rock example above brings out the importance of matters of gradedness (degree) in metaphor. It is gradations, not black-and-white propositions, that metaphor is often getting at, a point that deserves greater emphasis in metaphor research. The interpretation suggested for “Mike’s friends are very upset by criticism, but he’s a rock” was not the bald proposition that Mike is tolerant of criticism but that he is highly so. Equally, the sentence “The memory was hidden far back in the labyrinth of John’s memory” plausibly does not convey that the memory was completely inaccessible to John but rather that it was highly inaccessible, or very difficult to access. A range of specific examples of gradedness in metaphor interpretation can be found in Barnden (2001b, 2001c).

Once gradedness and uncertainty are considered it also becomes evident that a metaphorical utterance may not necessarily introduce totally new information but may rather change the degree of holding, and/or the certainty, of some existing piece of information. Gibbs and Tendahl (2006) discuss this under the heading of the “strengthening” of (and the opposite: contradiction of) existing assumptions, in the light of considerations of metaphor in Relevance Theory (Carston, 2002; Sperber & Wilson, 1995). In the rock example, other evidence may already have established that Mike may be somewhat insensitive to criticism, so the sentence is both strengthening the “may” to “presumably” and strengthening the “somewhat” to “highly.” Note also that such strengthening goes beyond the notion that metaphor can draw attention to or increase the salience of (Ortony, 1979) pieces of information about the target domain. We are talking instead about adjusting pieces of information about the target domain.
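The strengthening just described can be made concrete with a small sketch. The two ordered scales, their intermediate values and the `strengthen` function are hypothetical and are not taken from any of the systems reviewed; the sketch covers only strengthening, not contradiction of existing assumptions.

```python
# Hypothetical ordered scales, weakest value first.
CERTAINTY = ["may", "presumably", "certainly"]
DEGREE = ["somewhat", "moderately", "highly"]

def strengthen(existing, incoming):
    """Combine two (certainty, degree) qualifications of the same property.

    The stronger value is kept on each scale independently, so the result
    is never weaker than what was already held.
    """
    certainty = max(existing[0], incoming[0], key=CERTAINTY.index)
    degree = max(existing[1], incoming[1], key=DEGREE.index)
    return certainty, degree

# Prior evidence: Mike may be somewhat insensitive to criticism.
prior = ("may", "somewhat")
# Contribution of "... but he's a rock": presumably highly insensitive.
from_metaphor = ("presumably", "highly")

print(strengthen(prior, from_metaphor))
```

Here the utterance adds no new proposition about Mike; it only moves an existing one along both scales, which is the adjustment, as opposed to a mere increase in salience, that the text describes.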

It cannot be claimed that AI or any other field has developed generally accepted, comprehensive methods for handling gradedness. Nevertheless, Narayanan and Barnden place weight on the handling of gradedness and the transfer of graded information from source to target. Perhaps more important than the actual handling of gradedness in some recent AI metaphor systems is the sheer fact that the pressure in AI towards considering the details of processing practical examples in realistic contexts makes one more readily appreciate the central role that gradedness plays in metaphor (going beyond the obvious role of gradedness in scale-based conceptual metaphors such as MORE IS UP).

Domain Distinctions

Metaphor is frequently characterized as a matter of mappings or transfers between different “domains,” often to make a contrast with metonymy, which is often claimed to operate within a single domain. On the other hand, some authors have questioned the usefulness of the domain notion or the degree of distinctness that is required between the two domains in a metaphor (see e.g. Dirven & Porings, 2002; Kittay, 1989). For simplicity of discussion we have mostly used the notion of domain uncritically in this article. It is certainly true that in much metaphor there is an intuitive sense in which the source and target are qualitatively very different. The question is whether real sense can be made of this and whether it matters to metaphor processing anyway.

The present author has found, in his own AI work on the ATT-Meta approach, the detail and clarity required for well-founded computational implementation to be a major factor in his coming to doubt the usefulness of the concept of “domain” in studying metaphor (and metonymy). In trying to make decisions about what domains particular pieces of knowledge should be assigned to, he came to realize what a hopeless and arbitrary task it was. The resulting despair was relieved by an ultimate realization that having domain distinctions was not operationally useful in any case.

The nature of the other systems in the review above also throws doubt on the usefulness of the notion. Only Veale and Narayanan actually have domains affect how their systems are structured and how the processing works. Hobbs does believe that metaphor is a matter of mapping between qualitatively disparate domains, but this stance has no operational effect in his system. In contrast, Barnden regards this disparateness as merely being a common case and is happy for the two sides of a metaphor to be arbitrarily close in their qualitative nature. Metaphors such as “Thatcher was Britain’s Reagan” are common, and have source and target domains that are broadly similar in subject matter. For an example with even less qualitative distance between the two sides, one’s neighbour’s teenage children can act as a metaphor for one’s own: if one has a daughter Jenny and the neighbours have a son Jonathan who behaves similarly to Jenny, then one could say “Jenny is our family’s Jonathan.” Of course, it is open to someone to say that the Jenny family is qualitatively different from the Jonathan family, and that they are therefore different domains, but this is post hoc rationalization with no operational significance.

Despite the closeness between target and source in the Jenny/Jonathan example, the metaphorical utterance appears quite apt to the present author. If this impression is shared with others, it may appear to conflict with the evidence adduced by Tourangeau and Sternberg (1982) that the greater the conceptual distance between source and target the more apt the metaphor. However, note that the linguistic form of the metaphorical utterance and the presence of context are important factors. A bald statement that “Jenny is Jonathan” without much context might well not come over as apt.

Apart from considerations of overall qualitative closeness, there is often a considerable amount of overlap between the intuitive source and target domains in metaphor even when they otherwise differ a great deal. We noted some overlap between the economics (target) and health (source) domains in the Narayanan discussion – and we could also have pointed out that health services are part of the economy – and between the orchestra and army domains in the Veale discussion. With reference to the Fass discussion, the domain of cars involves the domain of animals because cars can carry people and other animals.

It is quite possible to maintain a fiction that domains do real work in metaphor as long as one only deals schematically with some isolated examples, and does not try to come up with a unified and processually detailed approach to metaphor that can work on a wide variety of metaphors on the basis of the same overall knowledge base.

Relationship to Metonymy

The relationship of metaphor to metonymy is highly contentious and complex (Dirven & Porings, 2002; Fass, 1997). It has proved difficult to distinguish clearly between the two phenomena, and they may be at ends of a spectrum within which many compromises are possible. Particular discourse examples are often hard to classify as to whether they exhibit metonymy or metaphor. Also, metaphor and metonymy often co-occur in richly interactive ways in discourse. However, there has been little work on processing accounts that handle both phenomena. As it happens, two of the AI approaches reviewed above – those of Hobbs and Fass – pay much attention to metonymy as well as to metaphor, and allow certain types of interaction. They complement work such as that of Ruiz de Mendoza Ibanez (1999) and Goossens (1990) outside AI. Hobbs’s approach is perhaps especially noteworthy in that, as in the case of metaphor, it embeds metonymy as just one type of inference within the system’s inferencing as a whole (Hobbs et al., 1993). Therefore, in principle, arbitrarily complex and diverse mixes of metaphor and metonymy should be able to be handled, and it is likely that compromises between metaphor and metonymy are possible.

If domains are abandoned as a well-founded underpinning for metaphor, then metaphor cannot be distinguished from metonymy on the usual ground of between-domain moves versus within-domain moves. Thus, any profound effect that metaphor research in AI and other disciplines may ultimately have on the fate of domains must be matched by a profound effect on the metaphor/metonymy relationship.

The Literal: Its Nature and Use

Strongly related to the domains issue is a theme that appears throughout the field of metaphor, and continues to be a matter of debate in the field (Gibbs & Tendahl, in press): the role, if any, of the literal meaning of metaphorical utterances or words in them in deriving their metaphorical meaning.

Of the systems reviewed, only Fass’s (meta5) has any use for the idea of having to reject a literal interpretation before considering a metaphorical one, and even in his case the incremental semantic processing (while problematic in itself) means that the rejection is by sentence-part rather than by whole sentence. See Lytinen, Burridge, and Kirtner (1992) for another system with a related incremental quality.

The approaches of Hobbs, Martin, and Narayanan do rely on constructing a literal interpretation of a metaphorical sentence, or sentence-like subunit such as a clause. Barnden’s approach is similar in this respect, though there it is a “direct” meaning that is constructed, with the question of whether it is necessarily to be called the literal interpretation being left as a terminological side issue. It should not be feared that there is necessarily any conflict between these approaches and psychological experimental results about metaphor processing being about as fast as, or sometimes faster than, literal-language processing under certain conditions (see e.g. Gibbs & Tendahl, in press, for a discussion of such results). This speediness does not of itself show that literal meanings are not being computed. The evidence on these matters from psychological experiment is mixed, because it is bound up with the nature of the context of the metaphorical utterance and the novelty or otherwise of its metaphorical elements: context could by itself suggest part or all of the meaning, and a piece of familiar metaphorical terminology could have its target-domain meaning listed in a lexicon. Also, the type of literal (or direct) meaning that is constructed in the aid of metaphor understanding is plausibly less fully fledged than that needed in cases where the linguistic string really should be interpreted literally. In the latter case, the literal meaning itself needs to involve integration with the context, whereas in the metaphor case it is instead the metaphorical meaning that needs to be fully integrated with context. It is possible that all the metaphorical processing is adding is the occasional hop from a complex source-domain (or pretence) scenario into a target-domain (or reality) scenario, and the time for such hops could be swamped by the time needed for all the other processing going on, such as anaphor resolution and semantic/pragmatic inferencing of many other types. AI can contribute here in clarifying the overall computations needed and how they can be imaginatively structured and optimized.

Finally, note that serial mixing (chaining) of metaphor complicates the role of literal meaning in metaphor, as noted in the discussion of ATT-Meta. What is transferred online in metaphor can already be a product of online metaphorical transfer.



Transfer of Attitudes and Value Judgments

A metaphorical utterance often conveys or instigates a mental or emotional attitude or a value judgment about the target subject matter. This is perhaps especially prevalent in metaphor used in political discourse (see e.g. Musolff, 2004). The attitude or judgment can be on the part of some person mentioned in the discourse, or it can be on the part of the speaker/hearer. For instance, talking about somebody’s mind as if it were a “cess-pit” may be intended to make the hearer have an emotional revulsion to, or negative value judgment of, the ideas of that person. On the other hand, saying that “The problem crushed Mike into the ground” primarily conveys something about Mike’s emotions, although of course it can also engender the meta-emotion of sorrow over Mike’s fate.

Although attitudes such as emotions and value judgments are of widely recognized importance for metaphor, it is important to have detailed accounts of how exactly they may be processed in metaphor understanding. The processing of attitudes interacts heavily with ordinary inferencing, rather than being an isolatable matter. In addition, emotions and value judgments are intrinsically graded, so the theme in this subsection interacts strongly with the general gradedness issue we identified above.

The description of the ATT-Meta project mentioned that mechanisms are being developed in that project for transferring attitudes and value judgments from source to target by default, whatever the particular conceptual metaphor involved, obviating the need for special mechanisms per conceptual metaphor.

Connections to Reasoning about Beliefs

Little research into metaphor has taken into account the fact that if a hearer wishes to understand what a speaker means by a particular metaphorical utterance, it is the speaker’s beliefs about the target and source domains that are important, not the hearer’s. In effect, the metaphorical processing should occur within the speaker’s “belief space” (as perceived by the hearer). Relatedly, metaphor can occur within the complement clauses of mental state verbs, as in “Mary believes that SnakeByte nursed RabbitWare back to health.” One interpretation of such a sentence is that the metaphorical conception of the target is Mary’s own (or rather, Mary’s own, as viewed by the speaker), not (directly) the speaker’s. In this case, metaphorical processing should be embedded within a belief space for Mary (within a belief space of the speaker). Stern (2000) and van Dijk (1980) are rare in metaphor research in having addressed these issues, albeit only in an abstract way.

The issue is important in the ATT-Meta project. As well as handling metaphor, the ATT-Meta system can perform reasoning about agents’ beliefs and reasoning. Methods are being developed for processing metaphor within the context of a specific agent’s beliefs rather than within the system’s own view of reality. This involves embedding a pretence cocoon within a belief space for the agent.

Conversely, in personification metaphor, it can be necessary to reason about the beliefs and reasoning of the entity that is metaphorically viewed as a person. This involves, in ATT-Meta terms, embedding a belief space within a pretence cocoon.
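The two directions of embedding can be pictured as nested reasoning contexts. The following sketch is only a data-structure illustration and does not reproduce ATT-Meta's implementation: the `Space` class, its fields, the `path` method and the example facts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Space:
    """A reasoning context nested inside an optional parent context."""
    kind: str                          # "reality", "belief" or "pretence"
    owner: str                         # whose view the space represents
    parent: Optional["Space"] = None
    facts: set = field(default_factory=set)

    def path(self):
        """List the nesting from the outermost space inwards."""
        outer = self.parent.path() if self.parent else []
        return outer + [f"{self.kind}({self.owner})"]

# The system's own view of reality is the outermost space.
reality = Space("reality", "system")

# "Mary believes that SnakeByte nursed RabbitWare back to health":
# a pretence cocoon embedded within a belief space for Mary.
mary = Space("belief", "Mary", parent=reality)
cocoon_in_belief = Space("pretence", "Mary", parent=mary,
                         facts={"SnakeByte is a nurse", "RabbitWare is a patient"})

# Personification: reasoning about what the company, viewed as a person,
# believes requires a belief space embedded within a pretence cocoon.
cocoon = Space("pretence", "system", parent=reality,
               facts={"SnakeByte is a person"})
belief_in_cocoon = Space("belief", "SnakeByte", parent=cocoon)

print(cocoon_in_belief.path())
print(belief_in_cocoon.path())
```

Keeping each space's facts separate from those of its parent is what would let within-pretence conclusions stay out of Mary's (or the system's) view of reality until a transfer explicitly carries them across.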

Reversed Transfers

The ATT-Meta approach is unusual in advocating that “reverse transfers” – transfers of information from target to source domain (more properly, reality to pretence) – are useful in metaphor understanding. One reason is the reverse transfer of reasoning queries that arise (notionally) from context. A query in target-domain terms can be reverse-transferred to become a query in source-domain terms, and an example was given in the review of ATT-Meta above. This and two other reasons for doing reverse transfers are discussed at length in Barnden et al. (2004). One of them is based on an argument that, in the case of a conceptual metaphor being used in a distributed way across multiple utterances, it may be easier and more effective to form a coherent scenario in source-domain terms than to do so directly in target-domain terms by translating each metaphorical utterance into target-domain terms. This approach can instead involve “metaphorizing” the literal sentences in the relevant discourse segment: translating the information in them into source-domain terms. We present this possibility as a potentially fruitful topic for future research into metaphor.

Conclusion

AI is not just about the engineering of “intelligent” artefacts for useful purposes but also about mapping out the space of possible principles and mechanisms of cognition, whether artificial or natural. For the Engineering aim, metaphor is an important challenge, and AI can draw here on insights on the problem from many other disciplines. Conversely, through its non-Engineering aims, various features of AI – its partial applications focus, its input from Computer Science, its need or ambition to produce detailed processing accounts – put AI in a good position to help metaphor research. The help can consist of facilitating certain types of advance, identifying certain types of neglected problem, or effecting salutary changes of emphasis. This is not to say that these advances, problem identifications, and emphasis shifts could not arise from other disciplines, but just that AI is especially well-placed to generate them.

Specific helpful things that one can point to already as coming out of AI research on metaphor – whether they are advances, problem identifications or emphasis shifts – include the working out of detailed mechanisms for source-domain reasoning, the detailed elaboration of the alternative notion of within-pretence reasoning for metaphor, the casting of mappings as inference rules, the emphasis on and inclusion of gradedness in metaphor interpretation, mechanisms for exploiting context, the thorough inclusion of uncertainty into metaphorical reasoning, a richer view of overriding (source-over-target as well as target-over-source), important steps towards integration with metonymy interpretation, some emphasis on disanalogy, the usefulness of reversed transfers, steps towards mechanisms for handling the default transfer of attitudes and value judgments, the importance of non-assertional metaphor, enriched doubt about domains, and clarification and specification of ways in which literal meaning can be involved in metaphor interpretation.

All these matters require much further research, within AI and outside. But let us celebrate the fact that metaphor is, par excellence, an area for truly interdisciplinary investigation!

Acknowledgments

This chapter draws on research supported by grants GR/M64208 and EP/C538943/1 from the Engineering and Physical Sciences Research Council of the United Kingdom. I am indebted to the metaphor research group in my department – Sheila Glasbey, Mark Lee, and especially Alan Wallington – for their help with the development of this article, and to Ray Gibbs for comments and guidance.

References

Asher, N., & Lascarides, A. (1995, March). Metaphor in discourse. In Proceedings of the AAAI Spring Symposium Series: Representation and Acquisition of Lexical Knowledge: Polysemy, Ambiguity, and Generativity (pp. 3–7). Stanford, CA.

Barnden, J. A. (1998). Combining uncertain belief reasoning and uncertain metaphor-based reasoning. In M. A. Gernsbacher & S. J. Derry (Eds.), Proceedings of the Twentieth Annual Meeting of the Cognitive Science Society (pp. 114–119). Mahwah, NJ: Lawrence Erlbaum Associates.

Barnden, J. A. (2001a). Uncertainty and conflict handling in the ATT-Meta context-based system for metaphorical reasoning. In V. Akman, P. Bouquet, R. Thomason, & R. A. Young (Eds.), Proceedings of the Third International Conference on Modeling and Using Context: Vol. 2116. Lecture Notes in Artificial Intelligence (pp. 15–29). Berlin: Springer.

Barnden, J. A. (2001b). Application of the ATT-Meta metaphor-understanding approach to selected examples from Goatly (Technical Report CSRP-01–01). Birmingham, UK: School of Computer Science, University of Birmingham.

Barnden, J. A. (2001c). Application of the ATT-Meta metaphor-understanding approach to various examples in the ATT-Meta project databank (Technical Report CSRP-01–02). Birmingham, UK: School of Computer Science, University of Birmingham.

Barnden, J. A. (2001d). The utility of reversed transfers in metaphor. In J. D. Moore & K. Stenning (Eds.), Proceedings of the Twenty-Third Annual Meeting of the Cognitive Science Society (pp. 57–62). Mahwah, NJ: Lawrence Erlbaum Associates.

Barnden, J. A., Glasbey, S. R., Lee, M. G., & Wallington, A. M. (2002). Application of the ATT-Meta metaphor-understanding system to examples of the metaphorical view of TEACHERS AS MIDWIVES (Technical Report CSRP-02–10). Birmingham, UK: School of Computer Science, University of Birmingham.

Barnden, J. A., Glasbey, S. R., Lee, M. G., & Wallington, A. M. (2003). Domain-transcending mappings in a system for metaphorical reasoning. In Conference Companion to the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2003) (pp. 57–61). Association for Computational Linguistics.

Barnden, J. A., Glasbey, S. R., Lee, M. G., & Wallington, A. M. (2004). Varieties and directions of interdomain influence in metaphor. Metaphor and Symbol, 19(1), 1–30.

Barnden, J. A., Helmreich, S., Iverson, E., & Stein, G. C. (1994). An integrated implementation of simulative, uncertain and metaphorical reasoning about mental states. In J. Doyle, E. Sandewall, & P. Torasso (Eds.), Principles of knowledge representation and reasoning: Proceedings of the Fourth International Conference (pp. 27–38). San Mateo, CA: Morgan Kaufmann.

Barnden, J. A., Helmreich, S., Iverson, E., & Stein, G. C. (1996). Artificial intelligence and metaphors of mind: within-vehicle reasoning and its benefits. Metaphor and Symbolic Activity, 11(2), 101–123.

Barnden, J. A., & Lee, M. G. (1999). An implemented context system that combines belief reasoning, metaphor-based reasoning and uncertainty handling. In P. Bouquet, P. Brezillon, & L. Serafini (Eds.), Second International and Interdisciplinary Conference on Modeling and Using Context: Vol. 1688. Lecture Notes in Artificial Intelligence (pp. 28–41). Berlin: Springer.

Barnden, J. A., & Lee, M. G. (2001). Understanding usages of conceptual metaphors: An approach and artificial intelligence system (Technical Report CSRP-01–05). Birmingham, UK: School of Computer Science, University of Birmingham.

Carbonell, J. G. (1980). Metaphor – key to extensible semantic analysis. In Proceedings of the 18th Annual Meeting of the Association for Computational Linguistics (pp. 17–21).

Carbonell, J. G. (1982). Metaphor: An inescapable phenomenon in natural-language comprehension. In W. Lehnert & M. Ringle (Eds.), Strategies for natural language processing (pp. 415–434). Hillsdale, NJ: Lawrence Erlbaum Associates.

Carston, R. (2002). Thoughts and utterances: The pragmatics of explicit communication. Oxford: Blackwell.

Delfino, M., & Manea, S. (2005, April). Figurative language expressing emotion and motivation in a web based learning environment. In Proceedings of the Symposium on Agents That Want and Like: Motivational and Emotional Roots of Cognition and Action (pp. 37–40), at AISB’05 Convention, University of Hertfordshire, UK. Brighton, UK: Society for the Study of Artificial Intelligence and Simulation of Behaviour.

Dirven, R., & Porings, R. (Eds.). (2002). Metaphor and metonymy in comparison and contrast. Berlin: Mouton de Gruyter.

Emanatian, E. (1995). Metaphor and the experience of emotion: The value of cross-cultural perspectives. Metaphor and Symbolic Activity, 10(3), 163–182.

Fainsilber, L., & Ortony, A. (1987). Metaphorical uses of language in the expression of emotions. Metaphor and Symbolic Activity, 2(4), 239–250.

Falkenhainer, B., Forbus, K. D., & Gentner, D. (1989). The structure-mapping engine: Algorithm and examples. Artificial Intelligence, 41(1), 1–63.


Fass, D. (1997). Processing metaphor and metonymy. Greenwich, CT: Ablex.

Fauconnier, G., & Turner, M. (1998). Conceptual integration networks. Cognitive Science, 22(2), 133–187.

Fussell, S. R., & Moss, M. M. (1998). Figurative language in descriptions of emotional states. In S. R. Fussell & R. J. Kreuz (Eds.), Social and Cognitive Approaches to Interpersonal Communication. Hillsdale, NJ: Lawrence Erlbaum Associates.

Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7(2), 95–119.

Gerrig, R. J. (1989). Empirical constraints on computational theories of metaphor: Comments on Indurkhya. Cognitive Science, 13(2), 235–241.

Gibbs, R., & Tendahl, M. (2006). Cognitive effort and effects in metaphor comprehension: Relevance theory and psycholinguistics. Mind & Language, 21, 379–403.

Giora, R. (1997). Understanding figurative and literal language: The graded salience hypothesis. Cognitive Linguistics, 8(3), 183–206.

Goossens, L. (1990). Metaphtonymy: the interaction of metaphor and metonymy in expressions for linguistic action. Cognitive Linguistics, 1, 323–340.

Gross, L. (1994). Facing up to the dreadful dangers of denial [U.S. ed.]. Cosmopolitan, 216(3).

Hall, R. P. (1989). Computational approaches to analogical reasoning: a comparative analysis. Artificial Intelligence, 39, 39–120.

Hintikka, J., & Sandu, G. (1990). Metaphor and the varieties of lexical meaning. Dialectica, 44(1–2), 55–78.

Hobbs, J. R. (1990). Literature and cognition (CSLI Lecture Notes No. 21). Stanford, CA: Stanford University.

Hobbs, J. R. (1992). Metaphor and abduction. In A. Ortony, J. Slack, & O. Stock (Eds.), Communication from an Artificial Intelligence perspective: Theoretical and applied issues (pp. 35–58). Berlin: Springer-Verlag.

Hobbs, J. R., Stickel, M. E., Appelt, D. E., & Martin, P. (1993). Interpretation as abduction. Artificial Intelligence, 63, 69–142.

Indurkhya, B. (1991). Modes of metaphor. Metaphor and Symbolic Activity, 6(1), 1–27.

Indurkhya, B. (1992). Metaphor and cognition: An interactionist approach. Dordrecht: Kluwer.

Iverson, E., & Helmreich, S. (1992). Metallel: An integrated approach to non-literal phrase interpretation. Computational Intelligence, 8(3), 477–493.

Kittay, E. F. (1989). Metaphor: Its cognitive force and linguistic structure (Paperback ed.). Oxford: Clarendon Press.

Kovecses, Z. (2000). Metaphor and emotion: Language, culture, and body in human feeling. New York: Cambridge University Press.

Lakoff, G. (1993). The contemporary theory of metaphor. In A. Ortony (Ed.), Metaphor and thought (2nd ed., pp. 202–251). Cambridge: Cambridge University Press.

Lakoff, G., & Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.

Lakoff, G., & Turner, M. (1989). More than cool reason: A field guide to poetic metaphor. Chicago: University of Chicago Press.

Lee, M. G., & Barnden, J. A. (2001a). Reasoning about mixed metaphors with an implemented AI system. Metaphor and Symbol, 16(1&2), 29–42.

Lee, M. G., & Barnden, J. A. (2001b). Mental metaphors from the master metaphor list: Empirical examples and the application of the ATT-Meta system (Technical Report CSRP-01–03). Birmingham, UK: School of Computer Science, University of Birmingham.

Leezenberg, M. (1995). Contexts of metaphor. ILLC Dissertation Series, 1995–17, Institute for Language, Logic and Computation, University of Amsterdam, The Netherlands.

Levin, S. R. (1988). Metaphoric worlds. New Haven, CT: Yale University Press.

Lytinen, S. L., Burridge, R. R., & Kirtner, J. D. (1992). The role of literal meaning in the comprehension of non-literal constructions. Computational Intelligence, 8(3), 416–432.

Martin, J. H. (1990). A computational model of metaphor interpretation. San Diego, CA: Academic Press.

Martin, J. H. (1996). Computational approaches to figurative language. Metaphor and Symbolic Activity, 11, 85–100.

Martin, J. H. (2000). Representing UNIX domain metaphors. Artificial Intelligence Review, 14(4–5), 377–401.

Mason, Z. J. (2004). CorMet: A computational, corpus-based conventional metaphor extraction system. Computational Linguistics, 30(1), 23–44.

McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.



Musolff, A. (2004). Metaphor and political discourse: Analogical reasoning in debates about Europe. Basingstoke, UK: Palgrave Macmillan.

Narayanan, S. (1997). KARMA: Knowledge-based action representations for metaphor and aspect. Unpublished PhD thesis, Computer Science Division, EECS Department, University of California, Berkeley.

Narayanan, S. (1999). Moving right along: A computational model of metaphoric reasoning about events. In Proceedings of the National Conference on Artificial Intelligence (AAAI ’99, pp. 121–128). AAAI Press.

Norvig, P. (1989). Marker passing as a weak method of text inferencing. Cognitive Science, 13(4), 569–620.

Ortony, A. (1979). The role of similarity in similes and metaphors. In A. Ortony (Ed.), Metaphor and thought (pp. 186–201). Cambridge: Cambridge University Press.

Pearl, J. (1986). Fusion, propagation, and structuring in belief networks. Artificial Intelligence, 29(3), 241–288.

Ruiz de Mendoza Ibanez, F. J. (1999). From semantic underdetermination via metaphor and metonymy to conceptual interaction (Series A: General & Theoretical Papers No. 492). LAUD Linguistic Agency.

Russell, S. W. (1976). Computer understanding of metaphorically used verbs. American Journal of Computational Linguistics, 2, 15–28.

Russell, S. W. (1985). Conceptual analysis of partial metaphor. In L. Steels & J. Campbell (Eds.), Progress in Artificial Intelligence (pp. 193–201). Chichester, UK: Ellis Horwood.

Russell, S. W. (1986). Information and experience in metaphor: A perspective from computer analysis. Metaphor and Symbolic Activity, 1(4), 227–270.

Russell, S., & Norvig, P. (2002). Artificial Intelligence: A modern approach. Englewood Cliffs, NJ: Prentice Hall.

Sperber, D., & Wilson, D. (1995). Relevance: Communication and cognition (2nd ed.). Oxford: Blackwell.

Stern, J. (2000). Metaphor in context. Cambridge, MA: Bradford Books/MIT Press.

Thomas, O. (1969). Metaphor and related subjects. New York: Random House.

Tourangeau, R., & Sternberg, R. J. (1982). Understanding and appreciating metaphors. Cognition, 11, 203–244.

Turner, M. (1987). Death is the mother of beauty: Mind, metaphor, criticism. Chicago: University of Chicago Press.

van Dijk, T. A. (1980). Formal semantics of metaphorical discourse. In M. K. L. Ching, M. C. Haley, & R. F. Lunsford (Eds.), Linguistic perspectives on literature (pp. 115–138). London: Routledge and Kegan Paul.

van Genabith, J. (2001). Metaphors, logic, and type theory. Metaphor and Symbol, 16(1&2), 43–57.

Veale, T. (1998). “Just in Time” analogical mapping, an iterative-deepening approach to structure-mapping. In Proceedings of the Thirteenth European Conference on Artificial Intelligence (ECAI ’98). New York: John Wiley.

Veale, T., & Keane, M. T. (1992). Conceptual scaffolding: a spatially founded meaning representation for metaphor comprehension. Computational Intelligence, 8(3), 494–519.

Veale, T., & Keane, M. T. (1997). The competence of sub-optimal structure mapping on ‘hard’ analogies. In Proceedings of the International Joint Conference on Artificial Intelligence. Nagoya, Japan.

Vogel, C. (2001). Dynamic semantics for metaphor. Metaphor and Symbol, 16(1&2), 59–74.

Way, E. C. (1991). Knowledge representation and metaphor. Dordrecht: Kluwer.

Weber, S. H. (1989). Figurative adjective-noun interpretation in a structured connectionist network. In Proceedings of the Eleventh Annual Conference of the Cognitive Science Society (pp. 204–211). Hillsdale, NJ: Lawrence Erlbaum Associates.

Weiner, J. (1984). A knowledge representation approach to understanding metaphors. Computational Linguistics, 10(1), 1–14.

Wilcox, P. P. (2004). A cognitive key: Metonymic and metaphorical mappings in ASL. Cognitive Linguistics, 15(2), 197–222.

Wilcox, S. (2004). Cognitive iconicity: Conceptual spaces, meaning, and gesture in signed language. Cognitive Linguistics, 15(2), 119–147.

Wilks, Y. (1978). Making preferences more active. Artificial Intelligence, 11, 197–223.

Wilks, Y., Barnden, J., & Wang, J. (1991, August). Your metaphor or mine: Belief ascription and metaphor interpretation. In Proceedings of the 12th International Joint Conference on Artificial Intelligence, Sydney, Australia (pp. 945–950). San Mateo, CA: Morgan Kaufmann.

Winston, P. H. (1979). Learning by creating and justifying transfer frames. In P. H. Winston & D. Brown (Eds.), Artificial Intelligence: An MIT Perspective (Vol. 1, pp. 347–374). Cambridge, MA: MIT Press.

Woll, B. (1985). Visual imagery and metaphor in British Sign Language. In W. Paprotté & R. Dirven (Eds.), The ubiquity of metaphor: Metaphor in language and thought (pp. 601–628). Amsterdam: John Benjamins.

Yu, N. (1995). Metaphorical expressions of anger and happiness in English and Chinese. Metaphor and Symbolic Activity, 10(2), 59–92.

Zhang, L., Barnden, J. A., & Hendley, R. J. (2005, July). Affect detection and metaphor in e-drama: The first stage. In Proceedings of the Workshop on Narrative Learning Environments, 12th International Conference on Artificial Intelligence in Education (AIED-2005), Amsterdam.
