
  • Compositional and lexical semantics

    Compositional semantics: the construction of meaning (generally
    expressed as logic) based on syntax. This lecture:

     semantics with FS grammars

    Lexical semantics: the meaning of individual words. This lecture:

     lexical semantic relations and WordNet
     one technique for word sense disambiguation

    1

  • Simple compositional semantics in feature structures

    Semantics is built up along with syntax

    Subcategorization slot filling instantiates syntax

    Formally equivalent to logical representations (below: predicate
    calculus with no quantifiers)

    Alternative FS encodings possible

    2

  • Objective: obtain the following semantics for they like fish:

    pron(x) ∧ (like_v(x, y) ∧ fish_n(y))

    Feature structure encoding:

    [ PRED and
      ARG1 [ PRED pron
             ARG1 [1] ]
      ARG2 [ PRED and
             ARG1 [ PRED like_v
                    ARG1 [1]
                    ARG2 [2] ]
             ARG2 [ PRED fish_n
                    ARG1 [2] ] ] ]

    3
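
    A minimal sketch of this encoding in Python (my own illustration, not
    from the slides): nested dicts stand in for feature structures, and
    reentrancy (the shared indices [1] and [2]) is modelled by sharing
    the same variable object.

    # Hypothetical encoding: shared Var objects model reentrant indices.
    class Var:
        count = 0
        def __init__(self):
            Var.count += 1
            self.name = f"x{Var.count}"

    x, y = Var(), Var()

    sem = {"PRED": "and",
           "ARG1": {"PRED": "pron", "ARG1": x},
           "ARG2": {"PRED": "and",
                    "ARG1": {"PRED": "like_v", "ARG1": x, "ARG2": y},
                    "ARG2": {"PRED": "fish_n", "ARG1": y}}}

    def to_logic(fs):
        """Flatten a SEM feature structure into a predicate-calculus string."""
        if fs["PRED"] == "and":
            return f"({to_logic(fs['ARG1'])} ∧ {to_logic(fs['ARG2'])})"
        args = [fs[k].name for k in ("ARG1", "ARG2") if k in fs]
        return f"{fs['PRED']}({', '.join(args)})"

    print(to_logic(sem))  # (pron(x1) ∧ (like_v(x1, x2) ∧ fish_n(x2)))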

  • Noun entry

    fish

    [ HEAD [ CAT noun
             AGR [ ] ]
      COMP filled
      SPR filled
      SEM [ INDEX [1]
            PRED fish_n
            ARG1 [1] ] ]

    Corresponds to fish(x), where the INDEX points to the characteristic
    variable of the noun (that is, x). The INDEX is unambiguous here, but
    consider e.g. picture(x, y) ∧ sheep(y) for picture of sheep.

    4

  • Verb entry

    like

    HEAD

    CAT verbAGR pl

    COMP

    HEADCAT noun

    COMP filledSEM

    INDEX 2

    SPR

    HEADCAT noun

    SEM INDEX 1

    SEM

    PRED like vARG1 1ARG2 2

    Linking syntax and semantics: ARG1 linkedto the SPRs index; ARG2 linked to theCOMP index.

    5
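
    Continuing the hypothetical dict encoding from the earlier sketch,
    this linking can be expressed by letting the verb's SEM arguments
    share the same Var objects as the SPR and COMP INDEX features:

    x, y = Var(), Var()  # fresh variables for this lexical entry

    like = {"HEAD": {"CAT": "verb", "AGR": "pl"},
            "COMP": {"HEAD": {"CAT": "noun"},
                     "COMP": "filled",
                     "SEM": {"INDEX": y}},   # [2]: ARG2 = COMP's index
            "SPR":  {"HEAD": {"CAT": "noun"},
                     "SEM": {"INDEX": x}},   # [1]: ARG1 = SPR's index
            "SEM":  {"PRED": "like_v", "ARG1": x, "ARG2": y}}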

  • COMP filling rule

    [ HEAD [1]
      COMP filled
      SPR [3]
      SEM [ PRED and
            ARG1 [4]
            ARG2 [5] ] ]
    →
    [ HEAD [1]
      COMP [2]
      SPR [3]
      SEM [4] ] ,  [2] [ COMP filled
                         SEM [5] ]

    As last time: the object of the verb (DTR2) fills the COMP slot

    New: the semantics on the mother is the and of the semantics of
    the dtrs

    6
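
    A toy rendering of this rule in the running Python sketch (a deliberate
    simplification: rather than full unification, the object's INDEX is
    destructively merged into the verb's COMP slot):

    def comp_filling(verb, obj):
        """DTR2 (obj) fills the COMP slot; mother SEM = and of the dtr SEMs."""
        assert obj["COMP"] == "filled"
        # Identify the verb's COMP index with the object's INDEX variable;
        # since the verb's SEM shares that variable, ARG2 is instantiated.
        verb["COMP"]["SEM"]["INDEX"].name = obj["SEM"]["INDEX"].name
        return {"HEAD": verb["HEAD"], "COMP": "filled", "SPR": verb["SPR"],
                "SEM": {"PRED": "and",
                        "ARG1": verb["SEM"], "ARG2": obj["SEM"]}}

    f = Var()
    fish = {"HEAD": {"CAT": "noun"}, "COMP": "filled", "SPR": "filled",
            "SEM": {"INDEX": f, "PRED": "fish_n", "ARG1": f}}
    vp = comp_filling(like, fish)
    print(to_logic(vp["SEM"]))  # like_v and fish_n now share one variable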

  • Apply to like

    [ HEAD [1]
      COMP filled
      SPR [3]
      SEM [ PRED and
            ARG1 [4]
            ARG2 [5] ] ]
    →
    [ HEAD [1] [ CAT verb
                 AGR pl ]
      COMP [2] [ HEAD [ CAT noun ]
                 COMP filled
                 SEM [5] [ INDEX [6] ] ]
      SPR [3] [ HEAD [ CAT noun ]
                SEM [ INDEX [7] ] ]
      SEM [4] [ PRED like_v
                ARG1 [7]
                ARG2 [6] ] ] ,  [2]

    7

  • Apply to like fish

    [ HEAD [1]
      COMP filled
      SPR [3]
      SEM [ PRED and
            ARG1 [4]
            ARG2 [5] ] ]
    →
    [ HEAD [1] [ CAT verb
                 AGR pl ]
      COMP [2] [ HEAD [ CAT noun
                        AGR [ ] ]
                 COMP filled
                 SPR filled
                 SEM [5] [ INDEX [6]
                           PRED fish_n
                           ARG1 [6] ] ]
      SPR [3] [ HEAD [ CAT noun ]
                SEM [ INDEX [7] ] ]
      SEM [4] [ PRED like_v
                ARG1 [7]
                ARG2 [6] ] ] ,  [2]

    like_v(x, y) ∧ fish_n(y)

    8

  • Logic in semantic representation

    Meaning representation for a sentence is called the logical form

    Standard approach to composition in theoretical linguistics is
    lambda calculus, building FOPC or higher-order representation

    Representation above is impoverished but can build FOPC in FSs

    Theorem proving

    Generation: starting point is logical form, not string.

    9
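
    A small illustration of the lambda-calculus route (my own, with
    quantifiers omitted as in the representation above), composing
    they like fish with Python lambdas:

    # Each word denotes a function; application mirrors syntactic structure.
    pron = lambda x: f"pron({x})"
    fish = lambda y: f"fish_n({y})"
    like = lambda noun: lambda y: lambda x: f"(like_v({x}, {y}) ∧ {noun(y)})"

    vp = like(fish)("y")               # "like fish": awaits a subject index
    s = f"({pron('x')} ∧ {vp('x')})"   # combine with the subject
    print(s)                           # (pron(x) ∧ (like_v(x, y) ∧ fish_n(y)))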

  • Meaning postulates

    e.g., ∀x [bachelor(x) → man(x) ∧ unmarried(x)]

    usable with compositional semantics and theorem provers

    e.g., from Kim is a bachelor, we can construct the LF

      bachelor(Kim)

    and then deduce

      unmarried(Kim)

    OK for narrow domains, but classical lexical semantic relations are
    more generally useful

    10
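
    A minimal sketch of this deduction as forward chaining over ground
    facts (illustrative only; predicate names as above):

    # Meaning postulates: antecedent predicate -> consequent predicates.
    postulates = {"bachelor": ["man", "unmarried"]}

    def deduce(facts):
        """Apply meaning postulates until no new facts can be added."""
        facts, changed = set(facts), True
        while changed:
            changed = False
            for pred, arg in list(facts):
                for new_pred in postulates.get(pred, []):
                    if (new_pred, arg) not in facts:
                        facts.add((new_pred, arg))
                        changed = True
        return facts

    print(deduce({("bachelor", "Kim")}))
    # {('bachelor', 'Kim'), ('man', 'Kim'), ('unmarried', 'Kim')}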

  • Lexical semantic relations

    Hyponymy: IS-A:

      (a sense of) dog is a hyponym of (a sense of) animal
      animal is a hypernym of dog
      hyponymy relationships form a taxonomy
      works best for concrete nouns

    Meronymy: PART-OF, e.g., arm is a meronym of body, steering wheel is
    a meronym of car (piece vs part)

    Synonymy, e.g., aubergine/eggplant

    Antonymy, e.g., big/little

    11
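
    These relations can be queried with, e.g., NLTK's WordNet interface
    (a sketch; assumes nltk and its wordnet data are installed):

    from nltk.corpus import wordnet as wn

    dog = wn.synsets('dog')[0]                    # first-listed sense
    print(dog.hypernyms())                        # IS-A parents (hypernymy)
    print(dog.hyponyms()[:3])                     # IS-A children (hyponymy)
    print(wn.synsets('body')[0].part_meronyms())  # PART-OF (meronymy)
    print(wn.synsets('aubergine')[0].lemma_names())  # synonyms in the synset
    big = wn.synsets('big', pos=wn.ADJ)[0]
    print(big.lemmas()[0].antonyms())             # antonymy (a lemma relation)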

  • WordNet

    large scale, open source resource for English

    hand-constructed

    wordnets being built for other languages

    organized into synsets: synonym sets (near-synonyms)

    Overview of adj red:

    1. (43) red, reddish, ruddy, blood-red, carmine, cerise, cherry,
       cherry-red, crimson, ruby, ruby-red, scarlet --
       (having any of numerous bright or strong colors reminiscent
       of the color of blood or cherries or tomatoes or rubies)
    2. (8) red, reddish --
       ((used of hair or fur) of a reddish brown color; "red deer";
       "reddish hair")

    12

  • Hyponymy in WordNet

    Sense 6
    big cat, cat
       => leopard, Panthera pardus
          => leopardess
          => panther
       => snow leopard, ounce, Panthera uncia
       => jaguar, panther, Panthera onca, Felis onca
       => lion, king of beasts, Panthera leo
          => lioness
          => lionet
       => tiger, Panthera tigris
          => Bengal tiger
          => tigress
       => liger
       => tiglon, tigon
       => cheetah, chetah, Acinonyx jubatus
       => saber-toothed tiger, sabertooth
          => Smiledon californicus
          => false saber-toothed tiger

    13
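
    A tree like this can be printed with a short recursive walk over
    hyponyms (a sketch using NLTK; the synset name is assumed):

    from nltk.corpus import wordnet as wn

    def print_tree(synset, depth=0):
        """Print a synset and its hyponyms in WordNet's => style."""
        print('   ' * depth + '=> ' + ', '.join(synset.lemma_names()))
        for child in synset.hyponyms():
            print_tree(child, depth + 1)

    print_tree(wn.synset('big_cat.n.01'))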

  • Some uses of lexical semantics

    Semantic classification: e.g., for selectional restrictions (e.g.,
    the object of eat has to be something edible) and for named entity
    recognition

    Shallow inference: X murdered Y implies X killed Y etc.

    Back-off to semantic classes in some statistical approaches

    Word-sense disambiguation

    Machine Translation: if you can't translate a term, substitute a
    hypernym

    Query expansion: if a search doesn't return enough results, one
    option is to replace an over-specific term with a hypernym

    14
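
    For example, hypernym-based query expansion might look like this
    (a sketch; expand_query is a hypothetical helper, not a standard API):

    from nltk.corpus import wordnet as wn

    def expand_query(term):
        """Return the term plus hypernym lemmas of its first-listed sense."""
        senses = wn.synsets(term)
        if not senses:
            return [term]
        hypernyms = [lemma for h in senses[0].hypernyms()
                     for lemma in h.lemma_names()]
        return [term] + hypernyms

    print(expand_query('dog'))  # e.g. ['dog', 'canine', 'canid', ...]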

  • Word sense disambiguation

    Needed for many applications; problematic for large domains.
    May depend on:

    frequency: e.g., diet: the food sense (or senses) is much more
    frequent than the parliament sense (Diet of Worms)

    collocations: e.g., striped bass (the fish) vs bass guitar:
    syntactically related or in a window of words (the latter sometimes
    called cooccurrence). Generally one sense per collocation.

    selectional restrictions/preferences: e.g., in Kim eats bass,
    bass must refer to the fish

    15

  • WSD techniques

    supervised learning: cf. POS tagging from lecture 3. But sense-tagged
    corpora are difficult to construct, and algorithms need far more data
    than POS tagging

    unsupervised learning (see below)

    Machine readable dictionaries (MRDs)

    selectional preferences: don't work very well by themselves, but are
    useful in combination with other techniques

    16

  • WSD by (almost) unsupervised learning

    Disambiguating plant (factory vs vegetation senses):

    1. Find contexts in training corpus:

       sense   training example
       ?       company said that the plant is still operating
       ?       although thousands of plant and animal species
       ?       zonal distribution of plant life
       ?       company manufacturing plant is in Orlando
       etc.

    2. Identify some seeds to disambiguate a few uses: e.g., plant life
       for the vegetation sense (A) and manufacturing plant for the
       factory sense (B):

       sense   training example
       ?       company said that the plant is still operating
       ?       although thousands of plant and animal species
       A       zonal distribution of plant life
       B       company manufacturing plant is in Orlando
       etc.

    3. Train a decision list classifier on the Sense A/Sense B examples.

    17

  • reliability   criterion                          sense

    8.10          plant life                         A
    7.58          manufacturing plant                B
    6.27          animal within 10 words of plant    A
    etc.

    4. Apply the classifier to the training set and add reliable examples
       to the A and B sets:

       sense   training example
       ?       company said that the plant is still operating
       A       although thousands of plant and animal species
       A       zonal distribution of plant life
       B       company manufacturing plant is in Orlando
       etc.

    5. Iterate steps 3 and 4 until convergence.

    6. Apply the classifier to the unseen test data.

    one sense per discourse: can be used as an additional refinement

    18
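
    A compact sketch of this bootstrapping loop (my own simplification:
    the decision list is reduced to single-word cues scored by a smoothed
    log-likelihood ratio, and the threshold is an assumed value):

    import math
    from collections import Counter

    contexts = [
        "company said that the plant is still operating".split(),
        "although thousands of plant and animal species".split(),
        "zonal distribution of plant life".split(),
        "company manufacturing plant is in orlando".split(),
    ]
    labels = [None, None, "A", "B"]  # step 2: seed labels

    THRESHOLD = 0.5                  # reliability cut-off (assumed)

    for _ in range(10):              # steps 3-5: train, relabel, iterate
        counts = {"A": Counter(), "B": Counter()}
        for ctx, lab in zip(contexts, labels):
            if lab:
                counts[lab].update(w for w in ctx if w != "plant")
        changed = False
        for i, ctx in enumerate(contexts):
            if labels[i]:
                continue
            # Score each unlabelled context by summed log-odds of its
            # cue words, with add-one smoothing.
            score = sum(math.log((counts["A"][w] + 1) / (counts["B"][w] + 1))
                        for w in ctx)
            if abs(score) > THRESHOLD:
                labels[i] = "A" if score > 0 else "B"
                changed = True
        if not changed:
            break                    # step 5: converged

    print(labels)                    # ['B', 'A', 'A', 'B']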

  • Yarowsky (1995): schematically

    [figure: scatter of training examples. "Initial state": every example
    is unlabelled (?). "Seeds": examples containing "life" are labelled A
    and examples containing "manu." are labelled B; the rest remain ?]

    19

  • Iterating and final state

    [figure, continued: "Iterating": the labels spread outward from the
    seeds, e.g. examples near "animal" become A and examples near
    "company" become B. "Final": every example is labelled A or B]

    20

  • Evaluation of WSD

    SENSEVAL and SENSEVAL-2 competitions

    evaluate against WordNet

    baseline: pick most frequent sense; hard to beat (but we don't always
    know the most frequent sense)

    human ceiling varies with words

    MT task: more objective but sometimes doesn't correspond to polysemy
    in source language

    21
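
    With WordNet, the most-frequent-sense baseline is straightforward,
    since NLTK lists synsets in roughly descending frequency order
    (a sketch):

    from nltk.corpus import wordnet as wn

    def mfs_baseline(word, pos=None):
        """Most-frequent-sense baseline: pick the first-listed synset."""
        senses = wn.synsets(word, pos=pos)
        return senses[0] if senses else None

    print(mfs_baseline('diet', pos=wn.NOUN))  # expected: the food sense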
