align, disambiguate, and walk a unified approach for measuring semantic similarity

Align, Disambiguate, and WalkA Unified Approach for Measuring Semantic Similarity

Mohammad Taher PilehvarDavid JurgensRoberto Navigli

Semantic Similarity; how

similar are a pair of lexical items?

Semantic Similarity

Sentence Level Word Level Sense Level

Semantic Similarity

The worker was terminated

The boss fired him

Sentence level

Applications• Paraphrase recognition

(Tsatsaronis et al., 2010)

• MT evaluation (Kauchak and Barzilay, 2006)

• Question Answering (Surdeanu et al., 2011)

• Textual Entailment (Dagan et al., 2006)

Semantic Similarity

Semantic SimilarityWord level

fireplace

Applications• Lexical simplification (Biran et al., 2011)

Locuacious → Talkative

• Lexical substitution (McCarthy and Navigli, 2009)

heater

Semantic Similarity

Semantic Similarityfire sense #1 Sense level

fire sense #8

Applications• Coarsening sense inventories

(Snow et al., 2007)

• Semantic priming (Neely et al., 1989)

Exisiting Similarity Measures

Patwardan (2003)

Banerjee and Pederson (2003)

Hirst and St-Onge (1998)

Lin (1998)

Jiang and Conrath (1997)

Resnik (1995)

Sussna (1993, 1997)

Wu and Palmer (1994)

Leacock and Chodorow (1998)

Salton and McGill (1983)

Gabrilovich and Markovitch (2007)

Radinsky et al. (2011)

Ramage et al. (2009)

Yeh et al., (2009)

Turney (2007)

Landauer et al. (1998)

Allison and Dix (1986)Gusfield (1997)

Wise (1996)Keselj et al. (2003)

Sentence Word Sense

Exisiting Similarity Measures

Sense Word Sentence

Patwardan (2003)

Banerjee and Pederson (2003)

Hirst and St-Onge (1998)

Lin (1998)

Jiang and Conrath (1997)

Resnik (1995)

Sussna (1993, 1997)

Wu and Palmer (1994)

Leacock and Chodorow (1998)

Salton and McGill (1983)

Gabrilovich and Markovitch (2007)

Radinsky et al. (2011)

Ramage et al. (2009)

Yeh et al., (2009)

Turney (2007)

Landauer et al. (1998)

Allison and Dix (1986)Gusfield (1997)

Wise (1996)Keselj et al. (2003)

Different internal representationswhich are not comparable to each other

None directly covers all levels at the same time

Different output scales

Contribution

Sense Word sentence

State-of-the-art performance in each level Using only WordNet

A unified representationfor any lexical item

Advantage 1Unified representation

sentencetext word

All lexical itemsthis way

Advantage 2Cross-level semantic similarity

A large and imposing house Mansion Residence#3

sentence

Advantage 3Sense-level operation

set of sensessense

Outline

Introduction Methodology Experiments

How Does it work?

lexical item 1 lexical item 2

semanticsignature 1

semanticsignature 2

Semantic Signature

sense word

Unified semantic signature

sentence

Semantic Signature

sense word

Sense or set of senses

Alignment-based disambiguation

sentence

A woman is frying food

Semantic Signature

Semantic SignatureDistributional representation

over all synsets in WordNet

Importance of this synset (syn_4) for our lexical item

syn_1syn_2

syn_3syn_4

syn_5syn_6

syn_7syn_8

syn_9syn_n

Semantic Signaturea woman is frying food

syn_1syn_2

syn_3syn_4

syn_5syn_6

syn_7syn_8

syn_9syn_n

cooking#1

table#3frying_pan#1

oil#4kitchen#3

carpet#2

sugar#2

natural_gas#2

physics#1

ship#1

Personalized PageRank

vfry 2

nfood 1 nwoman 1

a woman is frying food{ , , }nwoman 1

nfood 1vfry 2

nfood 1

ncooking 1

vcook 3

nfat 1

nfrench_fries 1

ndish 2nnutriment 1

nfood 2

nbeverage1

These weights form a semantic signature

Comparing Semantic Signatures

• Parametric– Cosine

• Non-parametric– Weighted Overlap– Top-k Jaccard

Comparing Semantic SignaturesWeighted Overlap

syn_1 syn_2 syn_3 syn_4 syn_5 syn_6

Comparing Semantic SignaturesWeighted Overlap

syn_4 syn_5 syn_2 syn_6 syn_1 syn_3

Comparing Semantic SignaturesTop- Jaccard

syn_1 syn_2 syn_3 syn_4 syn_5 syn_6 syn_7 syn_8 syn_9 syn_10

𝑘=4

Comparing Semantic SignaturesTop- Jaccard

syn_1 syn_2 syn_3 syn_4 syn_5 syn_6 syn_7 syn_8 syn_9 syn_10

𝑘=4

sense word

sentence

sense word

sentencesentence

Why is disambiguation needed?

The worker was fired

He was terminated

A manager fired the worker.An employee was terminated from work by his boss.

managern

employeen

firev workern terminatev bossnworkn

manager terminate work boss

bosswork

terminate

employee

nmanager 2

fire worker

worker

managern

employeen

Sentence 1 Sentence 2

bosswork

terminate

employee

nmanager 2

fire worker

worker

managern

employeen

manager

terminate work boss

bosswork

terminate

employee

nmanager 2

0.50.3

Tversky (1977)Markman and Gentner (1993)

terminate work boss

bosswork

terminate

employee

nmanager 2n

managern1

0.50.3

bosswork

terminate

employee

nmanager 2

fire worker

worker

managern

employeen

fire v4

bosswork

terminate

employee

nmanager 2

fire worker

worker

managern

employeen

fire v4

Outline

Introduction Methodology Experiments

Experiments• Sentence level– Semantic Textual Similarity (SemEval-2012)

• Word level– Synonymy recognition (TOEFL dataset)– Correlation-based (RG-65 dataset)

Experiments• Sentence level– Semantic Textual Similarity (SemEval-2012)

• Sense level– Coarsening WordNet sense inventory

Experiment 1Similarity at Sentence level

• Semantic Textual Similarity (STS-12)– 5 datasets – Three evaluation measures

• ALL, ALLnrm, and Mean

• Top-ranking systems– UKP2 (Bär et al., 2012)– TLSim and TLSyn (Šarić et al., 2012)

Features– Main features

• Cosine• Weighted Overlap• Top-k Jaccard

– String-based features• Longest common substring• Longest common subsequence• Greedy string tiling• Character/word n-grams

STS Results

ALLnrm

0.6 0.65 0.7 0.75 0.8 0.85 0.9

+0.034

+0.007

+0.043

Experiments• Sentence level– Semantic Textual Similarity (Semeval-12)

Experiment 2Similarity at Word Level

Synonymy recognition Correlating word similarity judgments

Similarity at Word LevelSynonym Recognition

• TOEFL dataset (Landauer and Dumais, 1997)

– 80 multiple choice questions– Human test takers: 64.5% only

Similarity at Word LevelSynonym Recognition

Accuracy on TOEFL dataset

PPMIC (Bullinaria and Levy, 2007)

GLSA (Matveeva et al., 2005)

LSA (Rapp, 2003)

ADW_Jaccard

ADW_WO

ADW_Cosine

PR (Turney et al., 2003)

PCCP (Bullinaria and Levy, 2012)

75 80 85 90 95 100

Similarity at Word LevelJudgment Correlation

• Dataset: RG-65 (Rubenstein and Goodenough, 1965)

– 65 word pairs • judged by 51 human subjects

– Scale of 0 → 4

Similarity at Word LevelJudgment Correlation

Spearman correlation, RG-65 dataset

ADW_Cosine

Agirre et al. (2009)

Hughes and Ramage (2007)

Zesch et al. (2008)

ADW_Jaccard

ADW_WO

0.76 0.78 0.8 0.82 0.84 0.86 0.88 0.9

Experiments• Text level– Semantic Textual Similarity (Semeval-12)

Experiment 3Similarity at Sense Level

• Coarse-graining WordNet

Snow et al. (2007)Navigli (2006)

• Binary classification: Merged or not-merged

bass#1

bass#3

bass#2

bass#4

Senseval-2 English Lexical Sample WSD

task(Kilgarriff, 2001)

OntoNotes (Hovy et al., 2006)

about 3500 sense pairs (Noun)about 5000 sense pairs (Verb)

about 16000 sense pairs (Noun)about 31000 sense pairs (Verb)

• Binary classification: Merge or not-merged

Merged

Not-merged

Tuned using 10% of the dataset

if similarity ≥ 𝑡

h𝑜𝑡 𝑒𝑟𝑤𝑖𝑠𝑒

F-score on OntoNotes dataset

ADW_Cosine

ADW_WO

ADW_Jaccard

Snow et al. (2007)

Navigli (2006)

0 0.2 0.4 0.6

F-score on Senseval-2 dataset

ADW_Cosine

ADW_WO

ADW_Jaccard

Snow et al. (2007)

Navigli (2006)

0 0.2 0.4 0.6

0 0.1 0.2 0.3 0.4 0.5 0.6

Conclusions• A unified approach for computing semantic

similarity for any pair of lexical items

• Experiments with SOA performance– Sense level (sense coarsening)– Word level (synonymy recognition and judgment)– Sentence level (Semantic Textual Similarity)

Future Direction• Larger sense inventories (e.g., BabelNet)

• Cross-level semantic similarity

• Create datasets for cross-level similarity– Future Semeval task?

nlistening 1

nthank_you 1

ngratitude 1

nthanks 1

nexpression 3

vheed 1nsensing 2

vlisten 2

nauscultation 1near 2

nhearing 6

jaudio-lingual 1

vlisten 1

Thank you for listening!

align, disambiguate, and walk a unified approach for measuring semantic similarity

Documents

body align review | body align review business opportunity

using randomization to attack similarity digests · using...

welcome! align-connect-engage align-connect- engage...

ohio’s state tests...similarity. use congruence and...

proximty align

align assembly_600_500e

database similarity search. 2 sequences that are similar...

proving triangle similarity using sas and sss similarity

quantifying species similarity and species diversity …...

triangle similarity: aa, sss, sastriangle similarity: aa...

1 lesson 3.4.6 congruence and similarity congruence and...

align f2030

text similarity - columbia universitycompute similarity •...

similarity in triangles unit 13 notes definition of...

topic 5-unit 4: similarity module 11: similarity and

align yourself - amazon web services · align yourself...

listeners use descriptive contrast to disambiguate novel

similarity checker ‘turn it in’ guide for...

efﬁcient similarity search : arbitrary similarity ...

triangle similarity: aa, sss, sastriangle similarity: aa