word sense disambiguation study on word net ontology

WORD SENSE DISAMBIGUATION STUDY ON WORD NET ONTOLOGY Akilan Velmurugan Computer Networks – CS 790G


Post on 10-Jan-2016



TRANSCRIPT

Page 1: Word sense disambiguation Study on word net ontology

WORD SENSE DISAMBIGUATION

STUDY ON WORD NET ONTOLOGY
Akilan Velmurugan
Computer Networks – CS 790G

Page 2: Word sense disambiguation Study on word net ontology

Overview

What is WSD?
How WordNet is analyzed as a complex network
What are the results
Project
Methodology
Area of study
Key findings/results
New approaches
Improvement techniques

Conclusion

Page 3: Word sense disambiguation Study on word net ontology

Project Description

Objective
Study of WSD
Effects of WSD in word sense ontology
Characteristics of WordNet
Results
How words are matched with other words
Parameters taken for the study of word sense
Improve them by making the necessary changes
Study network characteristics

Page 4: Word sense disambiguation Study on word net ontology

WordNet - overview

Machine-readable semantic dictionary whose entries are interlinked by semantic relations
Developed at Princeton University as a large lexical database for the English language
One of the most widely used linguistic resources
Free for public use (distributed under Princeton's own WordNet license rather than the GPL)
Forms a scale-free network with a small average shortest path, having words as nodes and concepts as links
Easily navigable

Page 5: Word sense disambiguation Study on word net ontology

WordNet (Structure)

Shows relations grouped by part of speech: noun, verb, adjective, adverb
Synonym
Hypernym (Is a kind of …)
Hyponym (… Is a kind of)
Troponym (particular ways to …)
Meronym (parts of …)
About 25 relations in total

Also available for online navigation

Page 6: Word sense disambiguation Study on word net ontology

WordNet online - by Princeton University

Page 7: Word sense disambiguation Study on word net ontology

WordNet Browser

Page 8: Word sense disambiguation Study on word net ontology

WordNet (working)

WSD:
Corpus-based approaches: set of samples that enables the system
Knowledge-based approaches: machine-readable dictionary with relations
WordNet research
Open source
Ranking of synsets derived from word frequencies in the British National Corpus
Top 1000
Content manipulation of text
Dataset I – controlled and calibrated study
Dataset II – collected using Mechanical Turk using pairs

Page 9: Word sense disambiguation Study on word net ontology

Word Sense Disambiguation (WSD)
Task of determining the meaning of an ambiguous word in the given context
Example: Bank
Edge of a river
or
Financial institution that accepts money
Refers to the resolution of lexical semantic ambiguity; its goal is to attribute the correct senses to words (an AI-complete problem)
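The "bank" example can be made concrete with a minimal sketch. It uses a simplified Lesk-style gloss overlap, which is one common knowledge-based baseline and is not a method described in these slides. The sense labels, glosses, and stopword list below are hand-written assumptions for illustration, not WordNet data.

```python
# Simplified Lesk-style disambiguation: pick the sense whose gloss shares
# the most content words with the context. All data here is hypothetical.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "to", "that", "and", "or",
             "i", "my", "we", "she", "he", "at", "by"}

SENSES = {
    "bank#river": "sloping land at the edge of a river or lake",
    "bank#finance": "financial institution that accepts money deposits and lends money",
}


def content_words(text):
    """Lower-case, split on whitespace, and drop stopwords."""
    return {word for word in text.lower().split() if word not in STOPWORDS}


def disambiguate(context, senses=SENSES):
    """Return the sense label with the largest gloss/context word overlap."""
    context_words = content_words(context)
    return max(senses, key=lambda s: len(context_words & content_words(senses[s])))


river_sense = disambiguate("we sat on the bank of the river")
money_sense = disambiguate("she deposited money at the bank")
```

With real WordNet glosses the overlap counts would differ; ties are resolved here by dictionary order, which is an arbitrary choice.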

Page 10: Word sense disambiguation Study on word net ontology

WSD: Area of Research

Assigning the correct sense to words, using an electronic dictionary as the source of word definitions

Open research field in Natural Language Processing (NLP)

A hard problem and a popular area of research

Used in speech synthesis, which relies on identifying the correct sense of a word

Page 11: Word sense disambiguation Study on word net ontology

JavaScript Visual WordNet

Page 12: Word sense disambiguation Study on word net ontology

Visual Thesaurus

Page 13: Word sense disambiguation Study on word net ontology

WordNet – Theoretical aspects
WordNet – word sense ontology
Symbols are words
Synset: list of words and semantic relations between them
Word sense disambiguation
WordNet structure using latent semantics
Variable lexical notation for a concept
Citibase – Thesaurus
Semantic relatedness
And a few others…

Page 14: Word sense disambiguation Study on word net ontology

WSD: using latent semantics
Measures the semantic distance of concepts
Relatedness and betweenness are calculated
Matrix form of the WordNet data structure is used
Can be integrated with other applications
Uses the Singular Value Decomposition (SVD) algorithm
Example: multiple synsets are
{car, gondola} {car, railway car} {car, automobile}
{Motor vehicle}, {Coupe}, {Sedan}, {Taxi}
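As a rough illustration of the "matrix form" mentioned above, the three synsets containing "car" can be written as a word-by-synset 0/1 matrix and compared row by row. This sketch stops before the SVD step named on the slide and uses plain cosine similarity instead; the synset identifiers S1 to S3 and the function names are assumptions made for the example.

```python
import math

# Synsets containing "car", taken from the slide. The identifiers are made up.
SYNSETS = {
    "S1": {"car", "gondola"},
    "S2": {"car", "railway car"},
    "S3": {"car", "automobile"},
}


def incidence_matrix(synsets):
    """Return (words, rows): one 0/1 row per word, one column per synset."""
    words = sorted(set().union(*synsets.values()))
    columns = sorted(synsets)
    rows = [[1 if word in synsets[col] else 0 for col in columns] for word in words]
    return words, rows


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


words, rows = incidence_matrix(SYNSETS)
vector = dict(zip(words, rows))
car_vs_gondola = cosine(vector["car"], vector["gondola"])
gondola_vs_automobile = cosine(vector["gondola"], vector["automobile"])
```

In this toy matrix "gondola" and "automobile" share no synset, so their similarity is zero, while each shares one synset with "car". A latent-semantic approach would apply SVD to a much larger matrix of this kind before measuring distances.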

Page 15: Word sense disambiguation Study on word net ontology

MDS-example

Geodesic Distance Matrix

      1  2  3  4  5  6  7  8  9 10 11 12 13
 1    0  1  1  1  2  2  3  1  1  2  4  2  2
 2    1  0  2  2  1  2  3  2  2  3  4  3  3
 3    1  2  0  2  3  3  4  2  2  3  5  3  3
 4    1  2  2  0  3  2  3  2  2  1  4  1  3
 5    2  1  3  3  0  1  2  2  2  2  3  3  3
 6    2  2  3  2  1  0  1  1  1  1  2  2  2
 7    3  3  4  3  2  1  0  2  2  2  1  3  3
 8    1  2  2  2  2  1  2  0  2  2  3  3  1
 9    1  2  2  2  2  1  2  2  0  2  3  3  1
10    2  3  3  1  2  1  2  2  2  0  3  1  3
11    4  4  5  4  3  2  1  3  3  3  0  4  4
12    2  3  3  1  3  2  3  3  3  1  4  0  4
13    2  3  3  3  3  2  3  1  1  3  4  4  0

Steps shown in the figure: Geodesic Distance Matrix, MDS, k-means

Node groups shown in the figure: {1, 2, 3, 4, 10, 12} and {5, 6, 7, 8, 9, 11, 13}
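A geodesic distance is the number of hops on a shortest path between two nodes. The sketch below is a consistency check, not the MDS or k-means step from the slide: it recovers the edges of the 13-node graph from the entries equal to 1 and recomputes every distance with breadth-first search. The function names are assumptions; the matrix is the one from this slide, with nodes renumbered from 0.

```python
from collections import deque

# Geodesic distance matrix from the slide (row/column k is node k + 1).
D = [
    [0, 1, 1, 1, 2, 2, 3, 1, 1, 2, 4, 2, 2],
    [1, 0, 2, 2, 1, 2, 3, 2, 2, 3, 4, 3, 3],
    [1, 2, 0, 2, 3, 3, 4, 2, 2, 3, 5, 3, 3],
    [1, 2, 2, 0, 3, 2, 3, 2, 2, 1, 4, 1, 3],
    [2, 1, 3, 3, 0, 1, 2, 2, 2, 2, 3, 3, 3],
    [2, 2, 3, 2, 1, 0, 1, 1, 1, 1, 2, 2, 2],
    [3, 3, 4, 3, 2, 1, 0, 2, 2, 2, 1, 3, 3],
    [1, 2, 2, 2, 2, 1, 2, 0, 2, 2, 3, 3, 1],
    [1, 2, 2, 2, 2, 1, 2, 2, 0, 2, 3, 3, 1],
    [2, 3, 3, 1, 2, 1, 2, 2, 2, 0, 3, 1, 3],
    [4, 4, 5, 4, 3, 2, 1, 3, 3, 3, 0, 4, 4],
    [2, 3, 3, 1, 3, 2, 3, 3, 3, 1, 4, 0, 4],
    [2, 3, 3, 3, 3, 2, 3, 1, 1, 3, 4, 4, 0],
]


def edges_from_distances(dist):
    """Pairs at distance 1 are exactly the edges of the underlying graph."""
    n = len(dist)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if dist[i][j] == 1]


def geodesic_matrix(n, edges):
    """All-pairs hop counts via one breadth-first search per start node."""
    neighbours = {i: [] for i in range(n)}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    result = []
    for start in range(n):
        hops = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in hops:
                    hops[nxt] = hops[node] + 1
                    queue.append(nxt)
        # -1 would mark an unreachable node; this graph is connected.
        result.append([hops.get(k, -1) for k in range(n)])
    return result


EDGES = edges_from_distances(D)
RECOMPUTED = geodesic_matrix(len(D), EDGES)
```

If the recomputed matrix equals `D`, the table is a valid geodesic distance matrix for the recovered graph. MDS would then embed the nodes in a low-dimensional space that approximately preserves these distances, and k-means would cluster the embedded points.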

Page 16: Word sense disambiguation Study on word net ontology

WSD: using latent semantics

Page 17: Word sense disambiguation Study on word net ontology

WSD: variable lexical notations for a concept

Generic concept notation:
$D = I \cup J \cup K$
$\therefore J = D - (I \cup K) = (D - I) \cap (D - K) = D \cap \overline{(I \cup K)}$
$J = D \cap (\overline{I} \cap \overline{K})$
since $B = D \cup E \cup F$,
$D = B - (E \cup F) = (B - E) \cap (B - F) = B \cap \overline{(E \cup F)}$
$D = B \cap (\overline{E} \cap \overline{F})$

Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications

Page 18: Word sense disambiguation Study on word net ontology

WSD: variable lexical notations for a concept

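$J = D \cap (\overline{I} \cap \overline{K}) = (B \cap (\overline{E} \cap \overline{F})) \cap (\overline{I} \cap \overline{K})$
$J = B \cap ((\overline{E} \cap \overline{F}) \cap (\overline{I} \cap \overline{K}))$
when J = fly, D = fish lure, I = spinner, K = troll
And introducing Boolean operators: AND for $\cap$, OR for $\cup$, NOT for the overbar (complement)

Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications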

Page 19: Word sense disambiguation Study on word net ontology

WSD: variable lexical notations for a concept

(“fly”) becomes:
(“fisherman's lure” OR “fish lure”) AND ((NOT “spinner”) AND (NOT “troll”))

then B = lure, E = ground bait, F = stool pigeon

(“fly”) becomes:
(“bait” OR “decoy” OR “lure”) AND (((NOT “ground bait”) AND (NOT “stool pigeon”)) AND ((NOT “spinner”) AND (NOT “troll”)))

Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications
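The set identities and the resulting Boolean query can be checked with a small sketch. It treats each concept as a toy set of terms in which the sibling concepts are disjoint, an assumption the derivation needs. The helper names `complement`, `not_all`, and `expand` are hypothetical and not from the cited paper.

```python
# Toy extension sets: each concept is modelled as a set of terms.
I = {"spinner"}
K = {"troll"}
J = {"fly"}
E = {"ground bait"}
F = {"stool pigeon"}
D = I | J | K          # fish lure = spinner, fly, troll
B = D | E | F          # lure = fish lure, ground bait, stool pigeon


def complement(universe, subset):
    """Set complement relative to an explicit universe."""
    return universe - subset


def not_all(terms):
    """Render one group of excluded terms: ((NOT "a") AND (NOT "b"))."""
    return "(" + " AND ".join(f'(NOT "{t}")' for t in terms) + ")"


def expand(alternatives, *excluded_groups):
    """Build the query: (alt1 OR alt2 ...) AND each excluded group."""
    positive = "(" + " OR ".join(f'"{a}"' for a in alternatives) + ")"
    return " AND ".join([positive] + [not_all(g) for g in excluded_groups])


# J = D ∩ (Ī ∩ K̄) and J = B ∩ ((Ē ∩ F̄) ∩ (Ī ∩ K̄)), complements taken in B.
j_from_d = D & (complement(B, I) & complement(B, K))
j_from_b = B & ((complement(B, E) & complement(B, F))
                & (complement(B, I) & complement(B, K)))

query_one_level = expand(["fisherman's lure", "fish lure"], ["spinner", "troll"])
query_two_levels = expand(["bait", "decoy", "lure"],
                          ["ground bait", "stool pigeon"], ["spinner", "troll"])
```

Because AND is associative, `query_two_levels` joins the two excluded groups without the extra outer parentheses used on the slide; the two forms are logically equivalent.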

Page 20: Word sense disambiguation Study on word net ontology

Thesaurus as a complex network

As a directed graph:
sink: the 73,046 terms with kout = 0
source: the 30,260 terms with at least one outgoing link (kout > 0), i.e. root words
absolute source: without incoming links (kin = 0)
normal source: kout > 0 and kin > 0
bridge source: without outgoing links to root words (kout(source) = 0)

Figure legend: 1 – Normal source, 2 – Bridge source, 3 – Absolute source, 4 – Sink

Source: arXiv:cond-mat/0312586 v1 2003
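The node classes above can be sketched as a small classifier over an adjacency list. The slide's definitions overlap (a bridge source can also satisfy kout > 0 and kin > 0), so the precedence used here, sink, then absolute source, then bridge source, then normal source, is an assumption of this sketch. The toy graph and the function name are likewise hypothetical.

```python
def classify(graph):
    """Label each term. graph maps a term to the list of terms it links to."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    k_out = {n: len(graph.get(n, [])) for n in nodes}
    k_in = {n: 0 for n in nodes}
    for targets in graph.values():
        for target in targets:
            k_in[target] += 1

    labels = {}
    for n in nodes:
        if k_out[n] == 0:
            labels[n] = "sink"                 # no outgoing links
        elif k_in[n] == 0:
            labels[n] = "absolute source"      # root word nobody points to
        elif all(k_out[t] == 0 for t in graph[n]):
            labels[n] = "bridge source"        # links only to sinks
        else:
            labels[n] = "normal source"        # links to at least one root word
    return labels


# Hypothetical four-term thesaurus fragment.
TOY = {"a": ["b", "x"], "b": ["c"], "c": ["x"]}
LABELS = classify(TOY)
```

Here `a` has no incoming links, `b` points to another root word, `c` points only to the sink `x`, and `x` has no entries of its own.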

Page 21: Word sense disambiguation Study on word net ontology

WSD: Semantic relatedness and word sense disambiguation

Source: Proceedings of the 20th International Conference on Advanced Information Networking and Applications

Concepts that occur more frequently and closer to each other are “more related” than concepts that appear less frequently and farther apart
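One way to turn this statement into a number is to count co-occurrences inside a sliding window and weight each by the inverse of the distance between the two words. This is only a sketch of the idea; the window size, the 1/distance weighting, and the function name are assumptions, and the cited paper may define relatedness differently.

```python
from collections import defaultdict


def relatedness(tokens, window=3):
    """Score word pairs: frequent and close co-occurrences score higher."""
    scores = defaultdict(float)
    for i, left in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            right = tokens[j]
            if left != right:
                pair = tuple(sorted((left, right)))
                scores[pair] += 1.0 / (j - i)   # closer pairs weigh more
    return dict(scores)


TOKENS = "car wheel road car wheel bank".split()
SCORES = relatedness(TOKENS)
```

In this toy sequence "car" and "wheel" appear together often and adjacently, so their score is higher than that of "car" and "bank", which co-occur once at distance 2.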

Page 22: Word sense disambiguation Study on word net ontology

WordNet Relationship

Semantic relatedness
Involves relationships among words
car-wheel (meronym)
hot-cold (antonym)
pencil-paper (functional)
penguin-Antarctica (association)
bank-trust company (synonym)
Probability and distance calculation
Frequency of synsets or words
Performance in NLP applications

Page 23: Word sense disambiguation Study on word net ontology

WordNet Relationship Browser

Page 24: Word sense disambiguation Study on word net ontology

WordNet Connect

Program to find all possible connections between two words in WordNet

Used in computing Semantic Opposition among word sense ontology

WordNet lexical database dictionary is used to read the semantic relations

Capabilities such as the number of paths, the shortest path, and the overall network structure are studied
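The kind of query described here, all connections between two words plus the shortest one, can be sketched on a toy graph. This is not the WordNet Connect program itself: the graph below is a hypothetical undirected fragment, and the function names are assumptions.

```python
from collections import deque

# Hypothetical undirected fragment; edges stand for any semantic relation.
LINKS = {
    "car": {"wheel", "vehicle"},
    "wheel": {"car", "truck"},
    "vehicle": {"car", "truck"},
    "truck": {"wheel", "vehicle"},
}


def all_paths(graph, start, goal, path=None):
    """Enumerate every simple path (no repeated node) by depth-first search."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in sorted(graph[start]):
        if nxt not in path:
            found.extend(all_paths(graph, nxt, goal, path))
    return found


def shortest_path(graph, start, goal):
    """Return one shortest path by breadth-first search, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


PATHS = all_paths(LINKS, "car", "truck")
SHORTEST = shortest_path(LINKS, "car", "truck")
```

Enumerating all simple paths grows exponentially with graph size, so on the full WordNet graph a length limit would be needed; breadth-first search for the shortest path stays linear in the number of edges.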

Page 25: Word sense disambiguation Study on word net ontology

WordNet Connect

Page 26: Word sense disambiguation Study on word net ontology

WordNet Connect

Page 27: Word sense disambiguation Study on word net ontology

WordNet Connect

Page 28: Word sense disambiguation Study on word net ontology

Future work

WordNet structure in terms of complex network

Key assumptions
WordNet lexical dictionary analyzed under the scope of a source node and a target node, with an additional reference node
Achieve a cost-effective path which is conditionally related to the mean reference node
Control the path traversal with a relation of focus
Include Common File Number to make it more efficient

Page 29: Word sense disambiguation Study on word net ontology

Conclusion

A single visualization cannot reveal the entire structure of WordNet

There are different ways of analyzing the effectiveness of the overall system

A new method to evaluate the usefulness of the WordNet network structure

Page 30: Word sense disambiguation Study on word net ontology

Questions and Comments