Automatic Term Ambiguity Detection
Tyler Baldwin, Yunyao Li, Bogdan Alexe, Ioana R. Stanoi
IBM Research - Almaden

DESCRIPTION

Tyler Baldwin, Yunyao Li, Bogdan Alexe, Ioana Roxana Stanoi: Automatic Term Ambiguity Detection. ACL (2) 2013: 804-809. Abstract: While the resolution of term ambiguity is important for information extraction (IE) systems, the cost of resolving each instance of an entity can be prohibitively expensive on large datasets. To combat this, this work looks at ambiguity detection at the term, rather than the instance, level. By making a judgment about the general ambiguity of a term, a system is able to handle ambiguous and unambiguous cases differently, improving throughput and quality. To address the term ambiguity detection problem, we employ a model that combines data from language models, ontologies, and topic modeling. Results over a dataset of entities from four product domains show that the proposed approach achieves an F-measure of 0.96, significantly above the baseline.

TRANSCRIPT

Page 1: Automatic Term Ambiguity Detection

Automatic Term Ambiguity Detection

Tyler Baldwin, Yunyao Li, Bogdan Alexe, Ioana R. Stanoi

IBM Research - Almaden

Page 2: Automatic Term Ambiguity Detection
Page 3: Automatic Term Ambiguity Detection

What is the buzz about Brave on Twitter?

Page 4: Automatic Term Ambiguity Detection

Find tweets about the movie Brave:

Movie night watching brave with Cammie n Isla n loads munchies

This brave girl deserves endless retweets!

Watching brave with the kiddos!

watching Bregor playing Civ 5: Brave New World and thinking of getting it

Page 5: Automatic Term Ambiguity Detection

Skyfall 007 in class with @MariaWiheelste

So I was dead set on seeing skyfall 007 for like a year

NowWatching #skyFall 007!

What movie amazed u — skyfall 007

Page 6: Automatic Term Ambiguity Detection

Existing Disambiguation Methods

Word Sense Disambiguation (WSD): Which word sense does this instance refer to?

Named Entity Disambiguation (NED): Which entity type is this instance associated with?

Page 7: Automatic Term Ambiguity Detection

Existing Disambiguation Methods

Word Sense Disambiguation (WSD): Which word sense does this specific instance refer to?

Named Entity Disambiguation (NED): Which entity type is this individual instance associated with?

Limitations:
Assume the number of senses/entities is known
− Often not the case
Inefficient on very large data sets
− Attempt to disambiguate each instance

Page 8: Automatic Term Ambiguity Detection

Term Ambiguity Detection (TAD)

Perform disambiguation at the term level, not the instance level

Given a term T and its category C, do all mentions of the term reference a member of that category?

Page 9: Automatic Term Ambiguity Detection

Term Ambiguity Detection (TAD)

Perform disambiguation at the term level, not the instance level

Given a term T and its category C, do all mentions of the term reference a member of that category?

Output: the level of ambiguity of the term

Useful for hybrid information extraction (IE) systems:
− Simpler model if the term is unambiguous
− More complex model otherwise

Potentially useful for other NLP tasks

Page 10: Automatic Term Ambiguity Detection

Term Ambiguity Detection (TAD)

Input (Term / Category):
EOS 5D / Camera
A New Beginning / Video Game
Skyfall 007 / Movie
Brave / Movie

TAD output:

Ambiguous (Term / Category):
A New Beginning / Video Game
Brave / Movie

Unambiguous (Term / Category):
EOS 5D / Camera
Skyfall 007 / Movie

Page 11: Automatic Term Ambiguity Detection

TAD Framework

Step 1: N-gram

Step 2: Ontology

Step 3: Clustering

Output: Ambiguous or Unambiguous (a minimal sketch of the full pipeline follows)
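The paper does not include code; the following is a minimal sketch of how the three modules could be chained, under the assumption that each module is a boolean check. The names ngram_module, ontology_module, and clustering_module are hypothetical and are sketched alongside the module slides that follow.

```python
def is_ambiguous(term, category, documents):
    """Run the three TAD modules in order over a term and the documents
    that mention it; any module that flags the term as ambiguous
    short-circuits the pipeline. Module names are hypothetical."""
    return (ngram_module(term)                          # Step 1: common word/phrase check
            or ontology_module(term)                    # Step 2: Wiktionary/Wikipedia check
            or clustering_module(category, documents))  # Step 3: LDA topic check
```

Applied to the Page 10 examples, such a pipeline would route Brave and A New Beginning to the ambiguous set and EOS 5D and Skyfall 007 to the unambiguous set.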

Page 12: Automatic Term Ambiguity Detection

TAD Framework

Step 1: N-gram
Does the term share a name with a common word/phrase?

1. Normalize the input term t (stopword removal + lowercase)

2. Calculate its unigram probability

3. Ambiguous if the probability is above an empirically determined threshold (a minimal sketch follows)
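A minimal sketch of the steps above, assuming a precomputed unigram probability table (e.g. from a large n-gram corpus) and a standard stopword list; both lookup structures and the threshold value are placeholders, not the authors' resources:

```python
STOPWORDS = {"the", "a", "an", "of", "and"}  # placeholder stopword list
UNIGRAM_PROB = {}                            # assumed: normalized term -> corpus probability
AMBIGUITY_THRESHOLD = 1e-7                   # empirically tuned in the paper; value here is illustrative

def ngram_module(term):
    # 1. Normalize the input term: lowercase and drop stopwords
    tokens = [t for t in term.lower().split() if t not in STOPWORDS]
    normalized = " ".join(tokens)
    # 2. Look up the unigram probability of the normalized term
    prob = UNIGRAM_PROB.get(normalized, 0.0)
    # 3. Flag as ambiguous if the term is also a common word/phrase
    return prob > AMBIGUITY_THRESHOLD
```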


Page 13: Automatic Term Ambiguity Detection

TAD Framework

Step 1: N-gram

Step 2: Ontology

• Wiktionary: Ambiguous if the term has several senses in Wiktionary

• Wikipedia: Ambiguous if the term has a Wikipedia disambiguation page (a minimal sketch follows)
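A minimal sketch of the two checks, assuming the Wiktionary and Wikipedia data have already been loaded into simple lookup structures; both names below are assumptions, not a published API:

```python
WIKTIONARY_SENSE_COUNT = {}        # assumed: term -> number of Wiktionary senses
WIKI_DISAMBIGUATION_PAGES = set()  # assumed: titles that have a Wikipedia disambiguation page

def ontology_module(term):
    # Ambiguous if Wiktionary lists more than one sense for the term
    if WIKTIONARY_SENSE_COUNT.get(term, 0) > 1:
        return True
    # Ambiguous if the term has a Wikipedia disambiguation page
    return term in WIKI_DISAMBIGUATION_PAGES
```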


Page 14: Automatic Term Ambiguity Detection

TAD Framework

Step 1: N-gram

Step 2: Ontology

Step 3: Clustering
Cluster the contexts in which the term appears


1. Remove stopwords and infrequent words from all documents containing the term

2. Cluster the documents using Latent Dirichlet Allocation (LDA)

3. Ambiguous if neither the category term nor a WordNet synonym appears among the most heavily weighted terms of any cluster (a minimal sketch follows)
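A minimal sketch of the clustering step using scikit-learn's LDA implementation; the paper does not specify an implementation, and the topic count, frequency cutoff, and number of top terms below are illustrative choices:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def clustering_module(category, documents, category_synonyms=(), n_topics=5, top_n=20):
    # 1. Remove stopwords and infrequent words from the documents containing the term
    vectorizer = CountVectorizer(stop_words="english", min_df=2)
    doc_term_matrix = vectorizer.fit_transform(documents)
    vocabulary = vectorizer.get_feature_names_out()
    # 2. Cluster the documents with Latent Dirichlet Allocation
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(doc_term_matrix)
    # 3. Unambiguous if the category term or a WordNet synonym appears among the
    #    most heavily weighted terms of some cluster; ambiguous otherwise
    targets = {category.lower()} | {s.lower() for s in category_synonyms}
    for topic_weights in lda.components_:
        top_terms = {vocabulary[i] for i in topic_weights.argsort()[-top_n:]}
        if targets & top_terms:
            return False
    return True
```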

Page 15: Automatic Term Ambiguity Detection

Evaluation

Dataset: terms from 4 product domains: Movies, Video Games, Cameras, Books

− 100 terms per domain
− Extracted randomly from DBpedia and Flickr

Gold standard: ambiguity determined by examining usage in the TREC Tweets2011 corpus
− 10 tweets labeled per term
− Unambiguous only if all tweets reference the category (see the one-line rule below)
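Restated as a one-line rule with hypothetical variable names: given the ten per-tweet judgments for a term, the gold label is unambiguous only when every judged tweet refers to the target category.

```python
# tweet_refers_to_category: list of 10 booleans, one per labeled tweet (assumed representation)
gold_is_unambiguous = all(tweet_refers_to_category)
```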

Page 16: Automatic Term Ambiguity Detection

Questions to Answer

How effective is TAD?

How useful is TAD?

Page 17: Automatic Term Ambiguity Detection

Results - Effectiveness

Each module produced above baseline performance

Configuration      Precision   Recall   F-measure
Majority Class     0.675       1.000    0.806
N-gram (NG)        0.979       0.848    0.909
Ontology (ON)      0.979       0.704    0.819
Clustering (CL)    0.946       0.848    0.895
NG + ON            0.980       0.919    0.948
NG + CL            0.942       0.963    0.952
ON + CL            0.945       0.956    0.950
All                0.943       0.978    0.960
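For reference, F-measure here is the harmonic mean of precision and recall; e.g. for the combined system: F = 2PR / (P + R) = 2(0.943)(0.978) / (0.943 + 0.978) ≈ 0.960.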

Page 18: Automatic Term Ambiguity Detection

Results - Effectiveness

The ontology method is of limited use, as most of the terms cannot be found in the ontologies.

(Results table repeated; see Page 17.)

Page 19: Automatic Term Ambiguity Detection

Results - Effectiveness

Each module produced above baseline performance
Combined framework produced a high F-measure of 0.96

(Results table repeated; see Page 17.)

Page 20: Automatic Term Ambiguity Detection

Results - Usefulness

Integrated the TAD pipeline into a commercially available IE system

Extracted mentions of terms from the Camera and Video Game domains in Twitter data

Manually judged the relevance of the extracted tweets

Page 21: Automatic Term Ambiguity Detection

Results - Usefulness

Using ambiguity detection hurt recall
− Only 57% of the relevant documents returned with TAD

Ambiguity detection is necessary for high precision
− Precision with ambiguity detection: 0.96
− Precision without ambiguity detection: 0.16

Page 22: Automatic Term Ambiguity Detection

Conclusion

Term ambiguity detection is helpful for large-scale information extraction
− Able to detect ambiguity when the number of senses is unknown
− Able to be applied to large datasets where instance-level interpretation is impractical

The 3-module TAD approach results in high performance
− Detects ambiguity with an F-measure of 0.96
− Allows the IE system to produce high precision

Page 23: Automatic Term Ambiguity Detection

BACKUP

Page 24: Automatic Term Ambiguity Detection

TAD Framework (flow diagram)

N-gram: suggests non-referential instances
Ontology: suggests across-domain instances
Clustering: suggests either case

Each module (N-gram, then Ontology, then Clustering) answers Yes or No: a Yes routes the term to the Ambiguous Terms set, while a term that receives No from all three modules goes to the Unambiguous Terms set.