Post on 05-Jan-2016

Lecture 21: Computational Lexical Semantics

Topics: Features in NLTK III, Computational Lexical Semantics, Semantic Web

USC

Readings: NLTK book Chapter 10, Text Chapter 20

April 3, 2013

CSCE 771 Natural Language Processing

– 2 – CSCE 771 Spring 2013

Overview

Last Time (Programming): Features in NLTK, NL queries to SQL, NLTK support for Interpretations and Models, Propositional and predicate logic support, Prover9

Today: Last lecture's slides 25-29, Features in NLTK, Computational Lexical Semantics

Readings: Text 19, 20; NLTK Book: Chapter 10

Next Time: Computational Lexical Semantics II

Model Building in NLTK - Chapter 10 continued

Mace model builder

lp = nltk.LogicParser()
# install Mace4
config_mace4('c:\Python26\Lib\site-packages\prover9')
a3 = lp.parse('exists x.(man(x) & walks(x))')
c1 = lp.parse('mortal(socrates)')
c2 = lp.parse('-mortal(socrates)')
mb = nltk.Mace(5)
print mb.build_model(None, [a3, c1])
True
print mb.build_model(None, [a3, c2])
True
print mb.build_model(None, [c1, c2])
False

>>> a4 = lp.parse('exists y. (woman(y) & all x. (man(x) -> love(x,y)))')
>>> a5 = lp.parse('man(adam)')
>>> a6 = lp.parse('woman(eve)')
>>> g = lp.parse('love(adam,eve)')
>>> mc = nltk.MaceCommand(g, assumptions=[a4, a5, a6])
>>> mc.build_model()
True

10.4   The Semantics of English Sentences

Principle of compositionality --

Representing the λ-Calculus in NLTK

(33)

a. (walk(x) ∧ chew_gum(x))

b. λx.(walk(x) ∧ chew_gum(x))

c. \x.(walk(x) & chew_gum(x)) -- the NLTK way!

Lambda0.py

import nltk
from nltk import load_parser
lp = nltk.LogicParser()
e = lp.parse(r'\x.(walk(x) & chew_gum(x))')
print e
\x.(walk(x) & chew_gum(x))
e.free()
print lp.parse(r'\x.(walk(x) & chew_gum(y))')
\x.(walk(x) & chew_gum(y))

Simple β-reductions

>>> e = lp.parse(r'\x.(walk(x) & chew_gum(x))(gerald)')
>>> print e
\x.(walk(x) & chew_gum(x))(gerald)
>>> print e.simplify()
(walk(gerald) & chew_gum(gerald))

Predicate reductions

>>> e3 = lp.parse('\P.exists x.P(x)(\y.see(y, x))')
>>> print e3
(\P.exists x.P(x))(\y.see(y,x))
>>> print e3.simplify()
exists z1.see(z1,x)

Figure 19.7 Inheritance of Properties

∃e,x,y Eating(e) ∧ Agent(e, x) ∧ Theme(e, y)

“hamburger edible?” from WordNet

Copyright ©2009 by Pearson Education, Inc., Upper Saddle River, New Jersey 07458. All rights reserved.

Speech and Language Processing, Second Edition, Daniel Jurafsky and James H. Martin

Chapter 20 – Word Sense Disambiguation (WSD)

Figure 20.1 Possible sense tags for bass

Machine translation

Supervised vs unsupervised learning

Semantic concordance – corpus with words tagged with sense tags

Feature Extraction for WSD

Feature vectors

Collocation

$[w_{i-2}, \mathrm{POS}_{i-2}, w_{i-1}, \mathrm{POS}_{i-1}, w_i, \mathrm{POS}_i, w_{i+1}, \mathrm{POS}_{i+1}, w_{i+2}, \mathrm{POS}_{i+2}]$

Bag-of-words – unordered set of neighboring words

Represent sets of most frequent content words with membership vector

[0,0,1,0,0,0,1] – the 3rd and 7th most frequent content words are present

Window of nearby words/features
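A minimal sketch of the two feature types on this slide, assuming a POS-tagged sentence given as (word, tag) pairs. The function names, the padding token, the toy vocabulary, and the example sentence are illustrative assumptions, not part of the lecture.

```python
from typing import List, Tuple

# Hypothetical padding entry for positions that fall outside the sentence.
PAD = ("<pad>", "<pad>")


def collocation_features(tagged: List[Tuple[str, str]], i: int, window: int = 2) -> List[str]:
    """Return [w_{i-2}, POS_{i-2}, ..., w_{i+2}, POS_{i+2}] for the target at index i."""
    features = []
    for k in range(i - window, i + window + 1):
        word, pos = tagged[k] if 0 <= k < len(tagged) else PAD
        features.extend([word.lower(), pos])
    return features


def bag_of_words_vector(tagged: List[Tuple[str, str]], i: int,
                        vocabulary: List[str], window: int = 3) -> List[int]:
    """Binary membership vector: 1 if the vocabulary word occurs within the window."""
    lo, hi = max(0, i - window), min(len(tagged), i + window + 1)
    neighbors = {w.lower() for k, (w, _) in enumerate(tagged[lo:hi], start=lo) if k != i}
    return [1 if v in neighbors else 0 for v in vocabulary]


# Toy tagged sentence; the target word "bass" is at index 4.
sentence = [("An", "DT"), ("electric", "JJ"), ("guitar", "NN"), ("and", "CC"),
            ("bass", "NN"), ("player", "NN"), ("stand", "VB"), ("off", "RB")]
target = 4
# Toy stand-in for the "most frequent content words" list.
vocab = ["fishing", "big", "sound", "player", "fly", "rod", "guitar"]

colloc = collocation_features(sentence, target)
bow = bag_of_words_vector(sentence, target, vocab)
print(colloc)
print(bow)
```

The collocation vector keeps word order and position, while the bag-of-words vector only records which vocabulary words appear nearby.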

Naïve Bayes Classifier

w – word vector

s – sense tag vector

f – feature vector $[w_i, \mathrm{POS}_i]$ for $i = 1, \ldots, n$

\[ \hat{s} = \operatorname*{argmax}_{s \in S} P(s \mid f) \]

Approximate by frequency counts

But how practical?

Looking for Practical formula

\[ \hat{s} = \operatorname*{argmax}_{s \in S} P(s \mid f) = \operatorname*{argmax}_{s \in S} \frac{P(f \mid s)\, P(s)}{P(f)} \]

Still not practical

Naïve == Assume Independence

\[ P(f \mid s) \approx \prod_{j=1}^{n} P(f_j \mid s) \]

Now practical, but realistic?

\[ \hat{s} = \operatorname*{argmax}_{s \in S} P(s) \prod_{j=1}^{n} P(f_j \mid s) \]

Training = count frequencies

Maximum likelihood estimator (20.8)

\[ P(f_j \mid s) = \frac{\mathrm{count}(f_j, s)}{\mathrm{count}(s)} \]

\[ P(s_i) = \frac{\mathrm{count}(s_i, w_j)}{\mathrm{count}(w_j)} \]
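The counting estimators on this slide can be sketched as a small classifier. This is an illustration under stated assumptions: features are treated as present/absent per training example, add-alpha smoothing is added so that unseen features do not zero out a sense (the slide gives the unsmoothed estimate), log probabilities replace the raw product, and the training data is invented.

```python
import math
from collections import Counter, defaultdict


def train(examples, alpha=1.0):
    """examples: list of (feature_list, sense) pairs for one target word."""
    sense_counts = Counter()                # count(s)
    feature_counts = defaultdict(Counter)   # count(f_j, s)
    for features, sense in examples:
        sense_counts[sense] += 1
        for f in set(features):             # count each feature once per example
            feature_counts[sense][f] += 1
    return sense_counts, feature_counts, alpha


def classify(model, features):
    """Return argmax over senses of log P(s) + sum_j log P(f_j | s)."""
    sense_counts, feature_counts, alpha = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n_s in sense_counts.items():
        score = math.log(n_s / total)       # log P(s)
        for f in features:
            # Smoothed version of count(f_j, s) / count(s); smoothing is an assumption.
            p = (feature_counts[sense][f] + alpha) / (n_s + 2 * alpha)
            score += math.log(p)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Invented training contexts for the word "bass".
training = [
    (["fishing", "rod", "river"], "fish"),
    (["caught", "river", "fishing"], "fish"),
    (["guitar", "player", "band"], "music"),
    (["play", "guitar", "sound"], "music"),
]
model = train(training)
print(classify(model, ["guitar", "band"]))
print(classify(model, ["river"]))
```

Setting `alpha` close to zero approaches the unsmoothed maximum likelihood estimate, at the cost of `log(0)` problems for unseen features.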

Decision List Classifiers

Naïve Bayes: hard for humans to examine and understand its decisions

Decision list classifiers - like a “case” statement

sequence of (test, returned-sense-tag) pairs
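A decision list can be sketched as an ordered list of (test, sense) pairs where the first test that succeeds decides the sense. The rules, sense labels, and function names below are invented for illustration; a real list would be learned and ordered from training data.

```python
def make_decision_list():
    """Ordered (test, sense) pairs; each test takes the set of context words."""
    return [
        (lambda ctx: "fish" in ctx, "bass/fish"),
        (lambda ctx: "striped" in ctx, "bass/fish"),
        (lambda ctx: "guitar" in ctx, "bass/music"),
        (lambda ctx: "player" in ctx, "bass/music"),
        (lambda ctx: True, "bass/fish"),  # default rule, always fires
    ]


def classify(decision_list, context_words):
    """Return the sense of the first rule whose test is true, like a case statement."""
    ctx = {w.lower() for w in context_words}
    for test, sense in decision_list:
        if test(ctx):
            return sense
    return None


rules = make_decision_list()
print(classify(rules, "An electric guitar and bass player".split()))
print(classify(rules, "He caught a striped bass".split()))
```

Because the rules are checked in order, a person can read off exactly which test produced a given decision.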

Figure 20.2 Decision List Classifier Rules

WSD Evaluation, baselines, ceilings

Extrinsic evaluation - evaluating embedded NLP in end-to-end applications (in vivo)

Intrinsic evaluation – evaluating WSD by itself (in vitro)

Sense accuracy

Corpora – SemCor, SENSEVAL, SEMEVAL

Baseline - Most frequent sense (WordNet sense 1)

Ceiling – Gold standard – human experts with discussion and agreement

Figure 20.3 Simplified Lesk Algorithm

gloss/sentence overlap

Simplified Lesk example

The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable rate mortgage securities.
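A minimal sketch of gloss/sentence overlap applied to the sentence above. The two sense signatures are illustrative stand-ins for dictionary glosses and examples, not verbatim WordNet entries; the stopword list, the sense labels, and the function names are also assumptions.

```python
import re

# Small hand-picked stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "on", "in", "at", "and", "that", "he", "they",
             "it", "my", "up", "into", "will", "can", "because", "especially", "beside"}


def content_words(text, exclude=()):
    """Lowercased word set with stopwords and excluded words removed."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return words - STOPWORDS - set(exclude)


def simplified_lesk(target, sentence, signatures):
    """Pick the sense whose signature shares the most words with the sentence.

    Ties go to the first listed sense, standing in for a most-frequent-sense default.
    """
    context = content_words(sentence, exclude=[target])
    best_sense, best_overlap = None, -1
    for sense, signature_text in signatures.items():
        overlap = len(context & content_words(signature_text, exclude=[target]))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


signatures = {
    "bank1": ("a financial institution that accepts deposits and channels the money "
              "into lending activities; he cashed a check at the bank; "
              "that bank holds the mortgage on my home"),
    "bank2": ("sloping land, especially the slope beside a body of water; "
              "they pulled the canoe up on the bank; "
              "he sat on the bank of the river and watched the currents"),
}
sentence = ("The bank can guarantee deposits will eventually cover future tuition costs "
            "because it invests in adjustable rate mortgage securities.")

result = simplified_lesk("bank", sentence, signatures)
print(result)
```

With these signatures the financial sense wins because "deposits" and "mortgage" appear in both the sentence and its signature, while the river-bank signature shares no content words with the sentence.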

SENSEVAL competitions

http://www.senseval.org/

Check the Senseval-3 website.

Corpus Lesk

weights applied to overlap words

inverse document frequency

\[ \mathrm{idf}_i = \log\left(\frac{N_{\mathrm{docs}}}{\text{number of docs containing } w_i}\right) \]
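The idf weighting can be sketched as follows, assuming each "document" is a list of tokens. The tiny corpus, the function names, and the choice to give unseen words a weight of zero are assumptions made for illustration.

```python
import math


def idf(documents):
    """Return {word: log(N_docs / number of docs containing word)}."""
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        for w in set(doc):  # count each word once per document
            doc_freq[w] = doc_freq.get(w, 0) + 1
    return {w: math.log(n_docs / df) for w, df in doc_freq.items()}


def weighted_overlap(context, signature, idf_weights):
    """Sum the idf weights of the words shared by context and signature."""
    shared = set(context) & set(signature)
    return sum(idf_weights.get(w, 0.0) for w in shared)


# Invented four-document corpus.
docs = [
    ["the", "bank", "holds", "the", "mortgage"],
    ["the", "river", "bank"],
    ["the", "deposits", "at", "the", "bank"],
    ["the", "canoe", "on", "the", "river"],
]
weights = idf(docs)
score = weighted_overlap(["the", "mortgage", "deposits"],
                         ["the", "bank", "mortgage", "deposits"], weights)
print(weights["the"], weights["mortgage"], score)
```

A word such as "the" that occurs in every document gets weight log(1) = 0, so it contributes nothing to the overlap score, while rarer words count more.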

20.4.2 Selectional Restrictions and Preferences

WordNet Semantic classes of Objects

Minimally Supervised WSD: Bootstrapping

Yarowsky algorithm

Heuristics:

1. one sense per collocation

2. one sense per discourse
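The second heuristic can be sketched as a relabeling step: within one document, every occurrence of the target word takes the sense that the already-labeled occurrences favor. This is only one piece of a bootstrapping loop, and the function name, the sense labels, and the choice to override minority labels are assumptions made for illustration.

```python
from collections import Counter


def one_sense_per_discourse(labels):
    """labels: senses of each occurrence of the target word in one document.

    None marks an occurrence that has no label yet. Every occurrence is
    assigned the majority sense among the labeled ones.
    """
    known = [s for s in labels if s is not None]
    if not known:
        return list(labels)  # nothing to propagate
    majority, _ = Counter(known).most_common(1)[0]
    return [majority] * len(labels)


document_labels = ["plant/factory", None, "plant/factory", "plant/living", None]
relabeled = one_sense_per_discourse(document_labels)
print(relabeled)
```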

Figure 20.4 Two senses of plant

Figure 20.5

Figure 20.6 Path Based Similarity

\[ \mathrm{sim}_{\mathrm{path}}(c_1, c_2) = \frac{1}{\mathrm{pathlen}(c_1, c_2)} \]

where pathlen(c_1, c_2) is the path length + 1
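A small sketch of path-based similarity, reading "path length" as the number of edges on the shortest path between two concepts. The hierarchy is a toy is-a tree invented for this example, not actual WordNet structure, and the function names are assumptions.

```python
from collections import deque

# Toy is-a hierarchy as child -> parent links (hypothetical).
PARENT = {
    "nickel": "coin",
    "dime": "coin",
    "coin": "currency",
    "currency": "medium_of_exchange",
    "money": "medium_of_exchange",
}


def neighbors(node):
    """Concepts one edge away: the parent plus any children."""
    out = set()
    if node in PARENT:
        out.add(PARENT[node])
    out.update(child for child, parent in PARENT.items() if parent == node)
    return out


def path_edges(c1, c2):
    """Number of edges on the shortest path, found by breadth-first search."""
    queue = deque([(c1, 0)])
    seen = {c1}
    while queue:
        node, dist = queue.popleft()
        if node == c2:
            return dist
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two concepts are not connected


def sim_path(c1, c2):
    """1 / pathlen, with pathlen = number of edges + 1."""
    edges = path_edges(c1, c2)
    return None if edges is None else 1.0 / (edges + 1)


print(sim_path("nickel", "dime"))
print(sim_path("nickel", "money"))
```

Adding 1 to the edge count keeps the score finite when a concept is compared with itself, which then scores exactly 1.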

Information Content word similarity

Figure 20.7 WordNet with P(c) values

Figure 20.8

Figure 20.9

Figure 20.10

Figure 20.11

Figure 20.12

Figure 20.13

Figure 20.14

Figure 20.15

Figure 20.16
