personalized pagerank over wordnet for similarity and word...

66
Personalized PageRank over WordNet for Similarity and Word Sense Disambiguation Eneko Agirre [email protected] (joint work with Aitor Soroa, some slides from Enrique Alfonseca) University of the Basque Country (Currently visiting Stanford) Google, 2009 Agirre (UBC) Personalized PageRank over WordNet Google 2009 1 / 40

Upload: others

Post on 26-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Personalized PageRank over WordNetfor Similarity and Word Sense Disambiguation

Eneko [email protected]

(joint work with Aitor Soroa, some slides from Enrique Alfonseca)

University of the Basque Country(Currently visiting Stanford)

Google, 2009

Agirre (UBC) Personalized PageRank over WordNet Google 2009 1 / 40

Page 2: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Summary

Present an integrated software based on Knowledge Bases (e.g.WordNet) for:

Similarity of word pairsDisambiguate words with respect to knowledge base concepts (aka WordSense Disambiguation)

Excellent results (EACL, NAACL, IJCAI 2009)Open source: http://ixa2.si.ehu.es/ukb/

Agirre (UBC) Personalized PageRank over WordNet Google 2009 2 / 40

Page 3: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Outline

1 Introduction

2 WordNet, PageRank and Personalized PageRank

3 PPR for similarity [Agirre et al.2009b]

4 PPR for WSD [Agirre and Soroa2009]

5 PPR and WSD on specific domains [Agirre et al.2009a]

6 Conclusions

Agirre (UBC) Personalized PageRank over WordNet Google 2009 3 / 40

Page 4: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Similarity

Measuring semantic similarity and relatednessare well studied problems in lexical semantics:

Given two words or multiword-expressions,estimate how similar or related they are.Relatedness is a more general relationship,including topical relatedness or meronymy.Typically implemented as calculating a numeric value ofsimilarity/relatedness.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 4 / 40

Page 5: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Similarity examples

RG dataset WordSim353 datasetcord smile 0.02 king cabbage 0.23

rooster voyage 0.04 professor cucumber 0.31noon string 0.04 ...

... investigation effort 4.59glass jewel 1.78 smart student 4.62

magician oracle 1.82 ...... movie star 7.38

cushion pillow 3.84 ...cemetery graveyard 3.88 journey voyage 9.29

automobile car 3.92 midday noon 9.29midday noon 3.94 fuck sex 9.44

gem jewel 3.94 tiger tiger 10.00

Agirre (UBC) Personalized PageRank over WordNet Google 2009 5 / 40

Page 6: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Similarity

Two main approaches:Knowledge-based (Roget’s Thesaurus, WordNet, etc.)Corpus-based, also known as distributional similarity (co-occurrences)

Many potential applications, overcome brittleness (word match),specially in very short texts, information retrieval, textual entailment,machine translation.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 6 / 40

Page 7: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Similarity

Two main approaches:Knowledge-based (Roget’s Thesaurus, WordNet, etc.)Corpus-based, also known as distributional similarity (co-occurrences)

Many potential applications, overcome brittleness (word match),specially in very short texts, information retrieval, textual entailment,machine translation.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 6 / 40

Page 8: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Word Sense Disambiguation (WSD)

Goal: determine the senses of the words in a text.“. . . but the location on the south bank of the Thames estuary.”“. . . cash includes cheque payments, bank transfers . . . ”

Dictionary (e.g. WordNet):bank#1 sloping land, especially the slope beside a body of water.bank#2 a financial institution that accepts deposits and. . .bank#3 an arrangement of similar objects in row or in tiers.bank#4 a long ridge or pile.. . . (10 senses total)

Many potential applications, enable natural language understanding, linktext to knowledge base, deploy semantic web.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 7 / 40

Page 9: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Word Sense Disambiguation (WSD)

Goal: determine the senses of the words in a text.“. . . but the location on the south bank of the Thames estuary.”“. . . cash includes cheque payments, bank transfers . . . ”

Dictionary (e.g. WordNet):bank#1 sloping land, especially the slope beside a body of water.bank#2 a financial institution that accepts deposits and. . .bank#3 an arrangement of similar objects in row or in tiers.bank#4 a long ridge or pile.. . . (10 senses total)

Many potential applications, enable natural language understanding, linktext to knowledge base, deploy semantic web.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 7 / 40

Page 10: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Word Sense Disambiguation (WSD)

Goal: determine the senses of the words in a text.“. . . but the location on the south bank of the Thames estuary.”“. . . cash includes cheque payments, bank transfers . . . ”

Dictionary (e.g. WordNet):bank#1 sloping land, especially the slope beside a body of water.bank#2 a financial institution that accepts deposits and. . .bank#3 an arrangement of similar objects in row or in tiers.bank#4 a long ridge or pile.. . . (10 senses total)

Many potential applications, enable natural language understanding, linktext to knowledge base, deploy semantic web.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 7 / 40

Page 11: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Word Sense Disambiguation (WSD)

Supervised corpus-based WSD performs bestTrain classifiers on hand-tagged data (typically SemCor)Data sparseness, e.g. bank 48 examples (25,20,2,1,0. . . )Results decrease when train/test from different sources (even Brown, BNC)Decrease even more when train/test from different domains

Knowledge-based WSDUses information in a KB (WordNet)Performs close to but lower than Most Frequent SenseVocabulary coverageRelation coverageBut . . .

Agirre (UBC) Personalized PageRank over WordNet Google 2009 8 / 40

Page 12: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Word Sense Disambiguation (WSD)

Supervised corpus-based WSD performs bestTrain classifiers on hand-tagged data (typically SemCor)Data sparseness, e.g. bank 48 examples (25,20,2,1,0. . . )Results decrease when train/test from different sources (even Brown, BNC)Decrease even more when train/test from different domains

Knowledge-based WSDUses information in a KB (WordNet)Performs close to but lower than Most Frequent SenseVocabulary coverageRelation coverageBut . . .

Agirre (UBC) Personalized PageRank over WordNet Google 2009 8 / 40

Page 13: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Domain adaptation

Deploying NLP techniques in real applications is challenging, specially forWSD:

Sense distributions change across domainsData sparseness hurts moreContext overlap is reducedNew senses, new terms

But. . .Some words get less interpretations in domains:bank in finance, coach in sports

Agirre (UBC) Personalized PageRank over WordNet Google 2009 9 / 40

Page 14: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Domain adaptation

Deploying NLP techniques in real applications is challenging, specially forWSD:

Sense distributions change across domainsData sparseness hurts moreContext overlap is reducedNew senses, new terms

But. . .Some words get less interpretations in domains:bank in finance, coach in sports

Agirre (UBC) Personalized PageRank over WordNet Google 2009 9 / 40

Page 15: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Similarity and WSD

If using knowledge-bases, both WSD and Similarity are closely intertwined:Similarity between words based on similarity between senses (implicitlydoing disambiguation)WSD uses similarity of senses to context, or similarity between senses incontext

Agirre (UBC) Personalized PageRank over WordNet Google 2009 10 / 40

Page 16: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Introduction

Outline

1 Introduction

2 WordNet, PageRank and Personalized PageRank

3 PPR for similarity [Agirre et al.2009b]

4 PPR for WSD [Agirre and Soroa2009]

5 PPR and WSD on specific domains [Agirre et al.2009a]

6 Conclusions

Agirre (UBC) Personalized PageRank over WordNet Google 2009 11 / 40

Page 17: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

Outline

1 Introduction

2 WordNet, PageRank and Personalized PageRank

3 PPR for similarity [Agirre et al.2009b]

4 PPR for WSD [Agirre and Soroa2009]

5 PPR and WSD on specific domains [Agirre et al.2009a]

6 Conclusions

Agirre (UBC) Personalized PageRank over WordNet Google 2009 12 / 40

Page 18: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

Wordnet

Most widely used hierarchically organized lexical database for English(Fellbaum, 1998)Broad coverage of nouns, verbs, adjectives, adverbsMain unit: synset (concept)

depository financial institution, bank#2, banking companya financial institution that accepts deposits and. . .

Relations between concepts:synonymy (built-in), hyperonymy, antonymy, meronymy, entailment,derivation, glossClosely linked versions in several languages

Agirre (UBC) Personalized PageRank over WordNet Google 2009 13 / 40

Page 19: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

Wordnet

Example of hypernym relations:

bankfinancial institution, financial organization

organizationsocial group

group, groupingabstraction, abstract entity

entity

Representing WordNet as a graph:Nodes represent conceptsEdges represent relations (undirected)In addition, directed edges from words to corresponding concepts(senses)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 14 / 40

Page 20: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

PageRank

Given a graph, ranks nodes according to their relative structuralimportanceIf an edge from ni to nj exists, a vote from ni to nj is produced

Strength depends on the rank of niThe more important ni is, the more strength its votes will have.

PageRank can also be viewed as the result of a random walk processRank of ni represents the probability of a random walk over the graph endingon ni, at a sufficiently large time.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 15 / 40

Page 21: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

PageRank

G: graph with N nodes n1, . . . , nN

di: outdegree of node iM: N × N matrix

Mji =

1di

an edge from i to j exists

0 otherwise

PageRank equation:Pr = cMPr + (1− c)v

voting schemea surfer randomly jumping to any node without following any paths on thegraph

c: damping factor: the way in which these two terms are combined at eachstep

Agirre (UBC) Personalized PageRank over WordNet Google 2009 16 / 40

Page 22: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

PageRank

G: graph with N nodes n1, . . . , nN

di: outdegree of node iM: N × N matrix

Mji =

1di

an edge from i to j exists

0 otherwise

PageRank equation:Pr = cMPr + (1− c)v

voting schemea surfer randomly jumping to any node without following any paths on thegraph

c: damping factor: the way in which these two terms are combined at eachstep

Agirre (UBC) Personalized PageRank over WordNet Google 2009 16 / 40

Page 23: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

PageRank

G: graph with N nodes n1, . . . , nN

di: outdegree of node iM: N × N matrix

Mji =

1di

an edge from i to j exists

0 otherwise

PageRank equation:Pr = cMPr + (1− c)v

voting schemea surfer randomly jumping to any node without following any paths on thegraph

c: damping factor: the way in which these two terms are combined at eachstep

Agirre (UBC) Personalized PageRank over WordNet Google 2009 16 / 40

Page 24: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

PageRank

G: graph with N nodes n1, . . . , nN

di: outdegree of node iM: N × N matrix

Mji =

1di

an edge from i to j exists

0 otherwise

PageRank equation:Pr = cMPr + (1− c)v

voting schemea surfer randomly jumping to any node without following any paths on thegraph

c: damping factor: the way in which these two terms are combined at eachstep

Agirre (UBC) Personalized PageRank over WordNet Google 2009 16 / 40

Page 25: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

PageRank

G: graph with N nodes n1, . . . , nN

di: outdegree of node iM: N × N matrix

Mji =

1di

an edge from i to j exists

0 otherwise

PageRank equation:Pr = cMPr + (1− c)v

voting schemea surfer randomly jumping to any node without following any paths on thegraph

c: damping factor: the way in which these two terms are combined at eachstep

Agirre (UBC) Personalized PageRank over WordNet Google 2009 16 / 40

Page 26: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

Personalized PageRank

Pr = cMPr + (1− c)v

PageRank: v is a stochastic normalized vector, with elements 1N

Equal probabilities to all nodes in case of random jumps

Personalized PageRank, non-uniform v [Haveliwala2002]Assign stronger probabilities to certain kinds of nodesBias PageRank to prefer these nodes

For ex. if we concentrate all mass on node iAll random jumps return to niRank of i will be highHigh rank of i will make all the nodes in its vicinity also receive a high rankImportance of node i given by the initial v spreads along the graph

Agirre (UBC) Personalized PageRank over WordNet Google 2009 17 / 40

Page 27: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

WordNet, PageRank and Personalized PageRank

Personalized PageRank

Pr = cMPr + (1− c)v

PageRank: v is a stochastic normalized vector, with elements 1N

Equal probabilities to all nodes in case of random jumps

Personalized PageRank, non-uniform v [Haveliwala2002]Assign stronger probabilities to certain kinds of nodesBias PageRank to prefer these nodes

For ex. if we concentrate all mass on node iAll random jumps return to niRank of i will be highHigh rank of i will make all the nodes in its vicinity also receive a high rankImportance of node i given by the initial v spreads along the graph

Agirre (UBC) Personalized PageRank over WordNet Google 2009 17 / 40

Page 28: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

Outline

1 Introduction

2 WordNet, PageRank and Personalized PageRank

3 PPR for similarity [Agirre et al.2009b]

4 PPR for WSD [Agirre and Soroa2009]

5 PPR and WSD on specific domains [Agirre et al.2009a]

6 Conclusions

Agirre (UBC) Personalized PageRank over WordNet Google 2009 18 / 40

Page 29: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

PPR for similarity [Agirre et al.2009b]

Based on [Hughes and Ramage2007]Given a pair of words (w1, w2),

Initialize teleport probability mass on either w1 or w2.Run PPR

The similarity is given by the cosine of the two PPR vectors.Experiment settings:

Damping value c = 0.85Calculations finish after 30 iterations

Variations for Knowledge Base:MCR (WordNet 1.6, closely linked to Spanish WordNet) and WordNet 3.0All WordNet relations, All WN+gloss relations

Agirre (UBC) Personalized PageRank over WordNet Google 2009 19 / 40

Page 30: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

Datasets

Rubenstein and Goodenough (1965)80 word pairs, judged by 51 human subjectsScale 0 to 4 based on their similarityRedone for a subset by Miller and Charles (1991)

WordSim353 dataset:Finkelstein et al. (2002)353 word pairs, each with 13-16 human judgmentsAnnotators were asked to rate similarity and relatedness.

Results given by rank correlation of system output with human ratings(Spearman)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 20 / 40

Page 31: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

ResultsCompetition with 1.6Twords distributional thesaurus in Google.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 21 / 40

Page 32: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

Results

Unknown words in WordNet

Agirre (UBC) Personalized PageRank over WordNet Google 2009 22 / 40

Page 33: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

ResultsState-of-the-art on MC (subset of RG)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 23 / 40

Page 34: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

Results

State-of-the-art on WordSim 353

Method Source Spearman[Strube and Ponzetto2006] Wikipedia 0.19–0.48[Jarmasz2003] WordNet 0.33–0.35[Jarmasz2003] Roget’s 0.55[Hughes and Ramage2007] WordNet 0.55[Finkelstein et al.2002] Web corpus, WN 0.56[Gabrilovich and Markovitch2007] ODP 0.65[Gabrilovich and Markovitch2007] Wikipedia 0.75Personalized PageRank WordNet 0.66 (0.69)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 24 / 40

Page 35: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

Cross-lingual evaluation

Consider pairs of words from different languages.Can we predict the similarities?

WordNet-based method:English WordNet graph, crosslingual lexical entries in synsets.Personalized PageRank is calculated in the same way

Contextual method:Get the top 5 translations of the non-English word into English using theGoogle Machine Translation system.Generate the context vectors for those 5 translations separately.Add the vectors.The rest of the procedure is the same.

Evaluation:RG and WordSim353One of the words in each pair translated into Spanish

Agirre (UBC) Personalized PageRank over WordNet Google 2009 25 / 40

Page 36: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for similarity [Agirre et al.2009b]

Cross-lingual evaluation

Agirre (UBC) Personalized PageRank over WordNet Google 2009 26 / 40

Page 37: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Outline

1 Introduction

2 WordNet, PageRank and Personalized PageRank

3 PPR for similarity [Agirre et al.2009b]

4 PPR for WSD [Agirre and Soroa2009]

5 PPR and WSD on specific domains [Agirre et al.2009a]

6 Conclusions

Agirre (UBC) Personalized PageRank over WordNet Google 2009 27 / 40

Page 38: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Knowledge-based WSD

Use information in WordNet for disambiguation:“. . . cash includes cheque payments, bank transfers . . . ”

Traditional approach [Patwardhan et al.2007]:Compare each target sense of bank with those of the words in the contextUsing semantic relatedness between pairs of sensesCombinatorial explosion: each word disambiguated individually

sim(bank#1,cheque#1) + sim(bank#1,cheque#2) + sim(bank#1,payment#1) . . .sim(bank#2,cheque#1) + sim(bank#2,cheque#2) + sim(bank#2,payment#1) . . .. . .

Graph-based methodsExploit the structural properties of the graph underlying WordNetFind globally optimal solutionsDisambiguate large portions of text in one goPrincipled solution to combinatorial explosion

Agirre (UBC) Personalized PageRank over WordNet Google 2009 28 / 40

Page 39: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Knowledge-based WSD

Use information in WordNet for disambiguation:“. . . cash includes cheque payments, bank transfers . . . ”

Traditional approach [Patwardhan et al.2007]:Compare each target sense of bank with those of the words in the contextUsing semantic relatedness between pairs of sensesCombinatorial explosion: each word disambiguated individually

sim(bank#1,cheque#1) + sim(bank#1,cheque#2) + sim(bank#1,payment#1) . . .sim(bank#2,cheque#1) + sim(bank#2,cheque#2) + sim(bank#2,payment#1) . . .. . .

Graph-based methodsExploit the structural properties of the graph underlying WordNetFind globally optimal solutionsDisambiguate large portions of text in one goPrincipled solution to combinatorial explosion

Agirre (UBC) Personalized PageRank over WordNet Google 2009 28 / 40

Page 40: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Using PageRank for WSD

Given a graph representation of the LKBPageRank over the whole WordNet would get a context-independentranking of word sensesWe would like:

Given an input text, disambiguate all open-class words in the input taking therest as context

Two alternatives1 Create a context-sensitive subgraph and apply PageRank over it

[Navigli and Lapata2007, Agirre and Soroa2008]2 Use Personalized PageRank over the complete graph, initializing v with the

context words

Agirre (UBC) Personalized PageRank over WordNet Google 2009 29 / 40

Page 41: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Using PageRank for WSD

Given a graph representation of the LKBPageRank over the whole WordNet would get a context-independentranking of word sensesWe would like:

Given an input text, disambiguate all open-class words in the input taking therest as context

Two alternatives1 Create a context-sensitive subgraph and apply PageRank over it

[Navigli and Lapata2007, Agirre and Soroa2008]2 Use Personalized PageRank over the complete graph, initializing v with the

context words

Agirre (UBC) Personalized PageRank over WordNet Google 2009 29 / 40

Page 42: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Using Personalized PageRank (Ppr and Ppr w2w)

For each word Wi, i = 1 . . . m in the contextInitialize v with uniform probabilities over words WiContext words act as source nodes injecting mass into the concept graphRun Personalized PageRankChoose highest ranking sense for target word

Problem of PprSenses of the same word might be linkedThose senses would reinforce each other and receive higher ranks

Ppr w2w alternative:Let the surrounding words decide which concept associated to Wi has morerelevanceFor each target word Wi, concentrate the initial probability mass in wordssurrounding Wi, but not in Wi itselfRun Personalized PageRank for each word in turn (higher cost)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 30 / 40

Page 43: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Using Personalized PageRank (Ppr and Ppr w2w)

For each word Wi, i = 1 . . . m in the contextInitialize v with uniform probabilities over words WiContext words act as source nodes injecting mass into the concept graphRun Personalized PageRankChoose highest ranking sense for target word

Problem of PprSenses of the same word might be linkedThose senses would reinforce each other and receive higher ranks

Ppr w2w alternative:Let the surrounding words decide which concept associated to Wi has morerelevanceFor each target word Wi, concentrate the initial probability mass in wordssurrounding Wi, but not in Wi itselfRun Personalized PageRank for each word in turn (higher cost)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 30 / 40

Page 44: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Using Personalized PageRank (Ppr and Ppr w2w)

For each word Wi, i = 1 . . . m in the contextInitialize v with uniform probabilities over words WiContext words act as source nodes injecting mass into the concept graphRun Personalized PageRankChoose highest ranking sense for target word

Problem of PprSenses of the same word might be linkedThose senses would reinforce each other and receive higher ranks

Ppr w2w alternative:Let the surrounding words decide which concept associated to Wi has morerelevanceFor each target word Wi, concentrate the initial probability mass in wordssurrounding Wi, but not in Wi itselfRun Personalized PageRank for each word in turn (higher cost)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 30 / 40

Page 45: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Experiment setting

Two datasetsSenseval 2 All Words (S2AW)Senseval 3 All Words (S3AW)

Both labelled with WordNet 1.7 tagsCreate input contexts of at least 20 words

Adding sentences immediately before and after if original too short

PageRank settings:Damping factor (c): 0.85End after 30 iterations

Agirre (UBC) Personalized PageRank over WordNet Google 2009 31 / 40

Page 46: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Results and comparison to related work (S2AW)

(Mihalcea, 2005) Pairwise Lesk between senses, then PageRank.(Sinha & Mihalcea, 2007) Several similarity measures, voting, fine-tuning for

each PoS. Development over S3AW.(Tsatsaronis et al., 2007) Subgraph BFS over WordNet 1.7 and eXtended WN,

then spreading activation.* No statistical significance (small dataset).

Senseval-2 All Words datasetSystem All N V Adj. Adv.Mih05 54.2 57.5 36.5 56.7 70.9Sihna07 56.4 65.6 32.3 61.4 60.2Tsatsa07 49.2 – – – –Ppr 56.8 71.1 33.4 55.9 67.1Ppr w2w 58.6 70.4 38.9 58.3 70.1MFS 60.1 71.2 39.0 61.1 75.4

Agirre (UBC) Personalized PageRank over WordNet Google 2009 32 / 40

Page 47: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Results and comparison to related work (S2AW)

(Mihalcea, 2005) Pairwise Lesk between senses, then PageRank.(Sinha & Mihalcea, 2007) Several similarity measures, voting, fine-tuning for

each PoS. Development over S3AW.(Tsatsaronis et al., 2007) Subgraph BFS over WordNet 1.7 and eXtended WN,

then spreading activation.* No statistical significance (small dataset).

Senseval-2 All Words datasetSystem All N V Adj. Adv.Mih05 54.2 57.5 36.5 56.7 70.9Sihna07 56.4 65.6 32.3 61.4 60.2Tsatsa07 49.2 – – – –Ppr 56.8 71.1 33.4 55.9 67.1Ppr w2w 58.6 70.4 38.9 58.3 70.1MFS 60.1 71.2 39.0 61.1 75.4

Agirre (UBC) Personalized PageRank over WordNet Google 2009 32 / 40

Page 48: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Comparison to related work (S3AW)

(Mihalcea, 2005) Pairwise Lesk between senses, then PageRank.(Sinha & Mihalcea, 2007) Several simmilarity measures, voting, fine-tuning for

each PoS. Development over S3AW.(Navigli & Lapata, 2007) Subgraph DFS(3) over WordNet 2.0 plus proprietary

relations, several centrality algorithms.(Navigli & Velardi, 2005) SSI algorithm on WordNet 2.0 plus proprietary

relations. Uses MFS when undecided.

System All N V Adj. Adv.Mih05 52.2 - - - -Sihna07 52.4 60.5 40.6 54.1 100.0Nav07 - 61.9 36.1 62.8 -Ppr 56.1 62.6 46.0 60.8 92.9Ppr w2w 57.4 64.1 46.9 62.6 92.9MFS 62.3 69.3 53.6 63.7 92.9Nav05 60.4 - - - -

Agirre (UBC) Personalized PageRank over WordNet Google 2009 33 / 40

Page 49: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR for WSD [Agirre and Soroa2009]

Comparison to related work (S3AW)

(Mihalcea, 2005) Pairwise Lesk between senses, then PageRank.(Sinha & Mihalcea, 2007) Several simmilarity measures, voting, fine-tuning for

each PoS. Development over S3AW.(Navigli & Lapata, 2007) Subgraph DFS(3) over WordNet 2.0 plus proprietary

relations, several centrality algorithms.(Navigli & Velardi, 2005) SSI algorithm on WordNet 2.0 plus proprietary

relations. Uses MFS when undecided.

System All N V Adj. Adv.Mih05 52.2 - - - -Sihna07 52.4 60.5 40.6 54.1 100.0Nav07 - 61.9 36.1 62.8 -Ppr 56.1 62.6 46.0 60.8 92.9Ppr w2w 57.4 64.1 46.9 62.6 92.9MFS 62.3 69.3 53.6 63.7 92.9Nav05 60.4 - - - -

Agirre (UBC) Personalized PageRank over WordNet Google 2009 33 / 40

Page 50: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR and WSD on specific domains [Agirre et al.2009a]

Outline

1 Introduction

2 WordNet, PageRank and Personalized PageRank

3 PPR for similarity [Agirre et al.2009b]

4 PPR for WSD [Agirre and Soroa2009]

5 PPR and WSD on specific domains [Agirre et al.2009a]

6 Conclusions

Agirre (UBC) Personalized PageRank over WordNet Google 2009 34 / 40

Page 51: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR and WSD on specific domains [Agirre et al.2009a]

Dataset [Koeling et al.2005]

Examples from BNC, Sports and Finances sections Reuters41 nouns: salient in either domain or with senses linked to these domainsSense inventory: WordNet v. 1.7.1

300 examples for each of the 41 nounsRoughly 100 examples from each word and corpus

Freely available

Agirre (UBC) Personalized PageRank over WordNet Google 2009 35 / 40

Page 52: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR and WSD on specific domains [Agirre et al.2009a]

Methods

What would happen if we apply PPR-based WSD to specific domains?

Personalized PageRank over context“. . . has never won a league title as coach but took Parma tosuccess. . . ”

Personalized PageRank over related wordsGet related words from distributional thesaurus [Koeling et al.2005]coach: manager, captain, player, team, striker, . . .

Experiments on BNC, Sports, Finance dataset:Supervised: train MFS, SVM, k-NN on SemCor examplesStatic PageRankPPRank: Personalized PageRank (same damping factors, iterations)

Use context50 related words [Koeling et al.2005] (BNC, Sports, Finance)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 36 / 40

Page 53: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR and WSD on specific domains [Agirre et al.2009a]

Methods

What would happen if we apply PPR-based WSD to specific domains?

Personalized PageRank over context“. . . has never won a league title as coach but took Parma tosuccess. . . ”

Personalized PageRank over related wordsGet related words from distributional thesaurus [Koeling et al.2005]coach: manager, captain, player, team, striker, . . .

Experiments on BNC, Sports, Finance dataset:Supervised: train MFS, SVM, k-NN on SemCor examplesStatic PageRankPPRank: Personalized PageRank (same damping factors, iterations)

Use context50 related words [Koeling et al.2005] (BNC, Sports, Finance)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 36 / 40

Page 54: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR and WSD on specific domains [Agirre et al.2009a]

Methods

What would happen if we apply PPR-based WSD to specific domains?

Personalized PageRank over context“. . . has never won a league title as coach but took Parma tosuccess. . . ”

Personalized PageRank over related wordsGet related words from distributional thesaurus [Koeling et al.2005]coach: manager, captain, player, team, striker, . . .

Experiments on BNC, Sports, Finance dataset:Supervised: train MFS, SVM, k-NN on SemCor examplesStatic PageRankPPRank: Personalized PageRank (same damping factors, iterations)

Use context50 related words [Koeling et al.2005] (BNC, Sports, Finance)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 36 / 40

Page 55: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR and WSD on specific domains [Agirre et al.2009a]

Methods

What would happen if we apply PPR-based WSD to specific domains?

Personalized PageRank over context“. . . has never won a league title as coach but took Parma tosuccess. . . ”

Personalized PageRank over related wordsGet related words from distributional thesaurus [Koeling et al.2005]coach: manager, captain, player, team, striker, . . .

Experiments on BNC, Sports, Finance dataset:Supervised: train MFS, SVM, k-NN on SemCor examplesStatic PageRankPPRank: Personalized PageRank (same damping factors, iterations)

Use context50 related words [Koeling et al.2005] (BNC, Sports, Finance)

Agirre (UBC) Personalized PageRank over WordNet Google 2009 36 / 40

Page 56: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

PPR and WSD on specific domains [Agirre et al.2009a]

Results

Systems BNC Sports FinancesBaselines Random ∗19.7 ∗19.2 ∗19.5

SemCor MFS ∗34.9 ∗19.6 ∗37.1Static PRank ∗36.6 ∗20.1 ∗39.6

Supervised SVM ∗38.7 ∗25.3 ∗38.7k-NN 42.8 ∗30.3 ∗43.4

Context PPRank 43.8 ∗35.6 ∗46.9Related PPRank ∗37.7 51.5 59.3words [koeling et al. 2005] ∗40.7 ∗43.3 ∗49.7

Skyline Test MFS ∗52.0 ∗77.8 ∗82.3

Supervised (MFS, SVM, k-NN) very low (see test MFS)Static PageRank close to MFSPPRank on context: best for BNC (* for statistical significance)PPRank on related words: best for Sports and Finance and improvesover Koeling et al., who use pairwise WordNet similarity.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 37 / 40

Page 57: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

Conclusions

Knowledge-based method for similarity and WSDBased on Personalized PageRankExploits whole structure of underlying KB efficientlyPerformance:

Similarity: best WordNet, comparable with 1.6 Tword, slightly below ESAWSD: Best KB algorithm S2AW, S3AW, Domains datasetsWSD and domains:

Better than supervised WSD for domainsAcquisition of terms and ontology enrichment feasibleInterest in fields like biomedicine, where ontologies exist

Agirre (UBC) Personalized PageRank over WordNet Google 2009 38 / 40

Page 58: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

Conclusions

Knowledge-based method for similarity and WSDBased on Personalized PageRankExploits whole structure of underlying KB efficientlyPerformance:

Similarity: best WordNet, comparable with 1.6 Tword, slightly below ESAWSD: Best KB algorithm S2AW, S3AW, Domains datasetsWSD and domains:

Better than supervised WSD for domainsAcquisition of terms and ontology enrichment feasibleInterest in fields like biomedicine, where ontologies exist

Agirre (UBC) Personalized PageRank over WordNet Google 2009 38 / 40

Page 59: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

Conclusions

Easily ported to other languagesProvides cross-lingual similarityOnly requirement of having a WordNet

Publicly available at http://ixa2.si.ehu.es/ukbBoth programs and dataIncluding program to construct graphs from new KB (e.g. Wikipedia)GPL license, open source, free

Agirre (UBC) Personalized PageRank over WordNet Google 2009 39 / 40

Page 60: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

Conclusions

Easily ported to other languagesProvides cross-lingual similarityOnly requirement of having a WordNet

Publicly available at http://ixa2.si.ehu.es/ukbBoth programs and dataIncluding program to construct graphs from new KB (e.g. Wikipedia)GPL license, open source, free

Agirre (UBC) Personalized PageRank over WordNet Google 2009 39 / 40

Page 61: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

Personalized PageRank over WordNetfor Similarity and Word Sense Disambiguation

Eneko [email protected]

(joint work with Aitor Soroa, some slides from Enrique Alfonseca)

University of the Basque Country(Currently visiting Stanford)

Google, 2009

Agirre (UBC) Personalized PageRank over WordNet Google 2009 40 / 40

Page 62: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

E. Agirre and A. Soroa.2008.Using the multilingual central repository for graph-based word sensedisambiguation.In Proceedings of LREC ’08, Marrakesh, Morocco.

E. Agirre and A. Soroa.2009.Personalizing pagerank for word sense disambiguation.In Proceedings of EACL-09, Athens, Greece.

E. Agirre, O. Lopez de Lacalle, and A. Soroa.2009a.Knowledge-Based WSD on Specific Domains: Performing better thanGeneric Supervised WSD.In Proceedings of IJCAI, Pasadena, USA.

E. Agirre, A. Soroa, E. Alfonseca, K. Hall, J. Kravalova, and M Pas.2009b.A study on similarity and relatedness using distributional andWordNet-based approaches.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 40 / 40

Page 63: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

In Proceedings of annual meeting of the North American Chapter of theAssociation of Computational Linguistics (NAAC), Boulder, USA, June.

L. Finkelstein, E. Gabrilovich, Y. Matias, E. Rivlin, Z. Solan, G. Wolfman,and E. Ruppin.2002.Placing Search in Context: The Concept Revisited.ACM Transactions on Information Systems, 20(1):116–131.

E. Gabrilovich and S. Markovitch.2007.Computing Semantic Relatedness using Wikipedia-based ExplicitSemantic Analysis.Proceedings of the 20th International Joint Conference on ArtificialIntelligence, pages 6–12.

T. H. Haveliwala.2002.Topic-sensitive pagerank.In WWW ’02: Proceedings of the 11th international conference on WorldWide Web, pages 517–526, New York, NY, USA. ACM.

Thad Hughes and Daniel Ramage.Agirre (UBC) Personalized PageRank over WordNet Google 2009 40 / 40

Page 64: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

2007.Lexical semantic relatedness with random graph walks.In Proceedings of the 2007 Joint Conference on Empirical Methods inNatural Language Processing and Computational Natural LanguageLearning (EMNLP-CoNLL), pages 581–589.

M. Jarmasz.2003.Roget’s Thesuarus as a lexical resource for Natural LanguageProcessing.

R. Koeling, D. McCarthy, and J. Carroll.2005.Domain-specific sense distributions and predominant sense acquisition.In Proceedings of the Human Language Technology Conference andConference on Empirical Methods in Natural Language Processing.HLT/EMNLP, pages 419–426, Ann Arbor, Michigan.

R. Mihalcea.2005.Unsupervised large-vocabulary word sense disambiguation withgraph-based algorithms for sequence data labeling.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 40 / 40

Page 65: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

In Proceedings of HLT05, Morristown, NJ, USA.

R. Navigli and M. Lapata.2007.Graph connectivity measures for unsupervised word sensedisambiguation.In IJCAI.

S. Patwardhan, S. Banerjee, and T. Pedersen.2007.UMND1: Unsupervised Word Sense Disambiguation Using ContextualSemantic Relatedness.In Appears in the Proceedings of SemEval-2007: 4th InternationalWorkshop on Semantic Evaluations, pages 390–393.

R. Sinha and R. Mihalcea.2007.Unsupervised graph-based word sense disambiguation using measuresof word semantic similarity.In Proceedings of the IEEE International Conference on SemanticComputing (ICSC 2007), Irvine, CA, USA.

M. Strube and S.P. Ponzetto.Agirre (UBC) Personalized PageRank over WordNet Google 2009 40 / 40

Page 66: Personalized PageRank over WordNet for Similarity and Word ...ixa2.si.ehu.es/eneko/slides/google.v1.pdf · Similarity Measuring semantic similarity and relatedness are well studied

Conclusions

2006.WikiRelate! Computing Semantic Relatedness Using Wikipedia.In Proceedings of the AAAI-2006, pages 1419–1424.

Agirre (UBC) Personalized PageRank over WordNet Google 2009 40 / 40