word sense disambiguation.ppt

Word Sense Disambiguation 2000. 3. 24. Special Lecture on Natural Language Processing

Upload: samit-kumar

Post on 12-Apr-2015


DESCRIPTION

This PPT covers WSD basics.

TRANSCRIPT

Page 1: Word Sense Disambiguation.ppt

Word Sense Disambiguation

2000. 3. 24. Special Lecture on Natural Language Processing

Page 2: Word Sense Disambiguation.ppt

Contents

Introduction and preliminaries
Supervised Learning
    Bayesian Classification
    Information Theoretic Approach
Dictionary Based Disambiguation
    Disambiguation based on sense definitions
    Thesaurus-based Disambiguation
    Disambiguation based on translations in a second-language corpus
    One Sense/Discourse, One Sense/Collocation
Unsupervised Learning

Page 3: Word Sense Disambiguation.ppt

Introduction

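Word sense disambiguation
    Word sense ambiguity
        'Bank': river bank, financial institution
        'Title': the meaning differs by domain
            heading, job title, legal right, fineness of gold, championship …
            In a gallery: 'This work doesn't have a title'
        'butter': the meaning differs by part of speech
    Semantic Tagging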

Page 4: Word Sense Disambiguation.ppt

Preliminaries

Supervised vs. unsupervised learning
    Supervised: classification
    Unsupervised: clustering
Pseudowords
    A way to obtain large training/test collections
    'banana-door': every occurrence of banana and door in the corpus is treated as an occurrence of one ambiguous word
Upper and lower bounds
    Upper bound: human performance
        Gale et al.'s work: judges were given pairs of occurrences and asked whether the two have the same sense (97%~99% accuracy)
    Lower bound: performance when the most frequent sense is always chosen
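The pseudoword idea and the lower bound can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function names and the four toy sentences are made up for the example, and tokenization is a plain whitespace split.

```python
from collections import Counter

def make_pseudoword_corpus(sentences, word_a, word_b):
    """Replace word_a and word_b by the pseudoword 'word_a-word_b'.

    Returns (tokens_with_pseudoword, original_word) pairs; the original
    word plays the role of the gold "sense" label.
    """
    pseudoword = f"{word_a}-{word_b}"
    labelled = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token in (word_a, word_b):
                replaced = tokens[:i] + [pseudoword] + tokens[i + 1:]
                labelled.append((replaced, token))
    return labelled

def most_frequent_sense_accuracy(labels):
    """Lower bound: accuracy of always guessing the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Hypothetical toy corpus, only for illustration.
sentences = [
    "she ate a banana",
    "he peeled the banana slowly",
    "the banana was ripe",
    "please close the door",
]
data = make_pseudoword_corpus(sentences, "banana", "door")
baseline = most_frequent_sense_accuracy([label for _, label in data])
```

Because the original word is known for every replaced token, a labelled test set of any size comes for free, and `baseline` is the score a disambiguator has to beat.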

Page 5: Word Sense Disambiguation.ppt

Supervised Learning

Two approaches
    Bayesian classification
        Uses the words in the context window as the evidence for the decision
        Does not take structure into account
    Information-theoretic approach
        Decides the sense from a single informative feature (indicator) in the context

Page 6: Word Sense Disambiguation.ppt

Bayesian Classification

Bayes' decision rule

Bayes' rule:

$$P(s_k \mid c) = \frac{P(c \mid s_k)\, P(s_k)}{P(c)}$$

Decision:

$$s' = \arg\max_{s_k} P(s_k \mid c) = \arg\max_{s_k} P(c \mid s_k)\, P(s_k) = \arg\max_{s_k} \left[\log P(c \mid s_k) + \log P(s_k)\right]$$

Page 7: Word Sense Disambiguation.ppt

Bag of words

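Naive Bayes assumption: for a context window c,

$$P(c \mid s_k) = P(\{v_j \mid v_j \text{ in } c\} \mid s_k) = \prod_{v_j \text{ in } c} P(v_j \mid s_k)$$

Use MLE

$$P(v_j \mid s_k) = \frac{C(v_j, s_k)}{C(s_k)}, \qquad P(s_k) = \frac{C(s_k)}{C(w)}$$

Decision for sense s' (p.238, Fig 7.1)

$$s' = \arg\max_{s_k} \left[\log P(c \mid s_k) + \log P(s_k)\right]$$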
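A bag-of-words Naive Bayes disambiguator along these lines might look as follows. This is a sketch under stated assumptions: it uses add-alpha smoothing and normalizes by the number of word tokens seen with each sense (pure MLE would give log 0 for unseen words), it skips context words outside the training vocabulary, and the four training contexts are made-up examples built from clue words for the two senses of 'drug'.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (sense, context_words). Returns count tables."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    for sense, context in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(context)
    return sense_counts, word_counts

def disambiguate(context, sense_counts, word_counts, alpha=1.0):
    """Return argmax over senses of log P(s_k) + sum_j log P(v_j | s_k)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(sense_counts.values())
    scores = {}
    for sense, count in sense_counts.items():
        # Assumption: add-alpha smoothing instead of the plain MLE.
        denom = sum(word_counts[sense].values()) + alpha * len(vocab)
        score = math.log(count / total)
        for word in context:
            if word in vocab:  # out-of-vocabulary words are ignored
                score += math.log((word_counts[sense][word] + alpha) / denom)
        scores[sense] = score
    return max(scores, key=scores.get)

# Hypothetical training contexts for the ambiguous word 'drug'.
examples = [
    ("medication", ["prices", "prescription", "patent"]),
    ("medication", ["consumer", "pharmaceutical", "increase"]),
    ("illegal substance", ["abuse", "cocaine", "traffickers"]),
    ("illegal substance", ["alcohol", "paraphernalia", "illicit"]),
]
sense_counts, word_counts = train_naive_bayes(examples)
guess = disambiguate(["prescription", "prices"], sense_counts, word_counts)
```

Working in log space avoids underflow when the context window contains many words.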

Page 8: Word Sense Disambiguation.ppt
Page 9: Word Sense Disambiguation.ppt

Gale, Church and Yarowsky (1992)
    Hansard corpus
        duty, drug, land, language, position, sentence
    About 90% accuracy

Sense [drug]         Clues for sense
medication           prices, prescription, patent, increase, consumer, pharmaceutical
illegal substance    abuse, paraphernalia, illicit, alcohol, cocaine, traffickers

Page 10: Word Sense Disambiguation.ppt

Information-theoretic approach

Brown et al.'s (1991) work
    Used in a French-to-English translation system
    Uses the indicator that maximizes I(P; Q)
        P: set of translations, Q: set of indicator values

Ambiguous word    Indicator           Examples: value → sense
prendre           object              measure → to take; decision → to make
vouloir           tense               present → to want; conditional → to like
cent              word to the left    per → %; number → c. [money]

Mutual information

$$I(X; Y) = \sum_{x \in X} \sum_{y \in Y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}$$

Page 11: Word Sense Disambiguation.ppt

Algorithm

Maximize I(P; Q)
    Compute it for every possible indicator
    Find the indicator, and the partition of Q, for which I(P; Q) is largest
Flip-Flop algorithm (p. 240, Fig 7.2)

    find random partition P = {P1, P2} of {t1 … tm}
    while (improving) do
        find partition Q = {Q1, Q2} of {x1 … xn} that maximizes I(P; Q)
        find partition P = {P1, P2} of {t1 … tm} that maximizes I(P; Q)
    end

    (t1 … tm: translations of the ambiguous word, x1 … xn: possible values of the indicator)
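The Flip-Flop loop can be sketched in Python as below. Several things here are assumptions made for the illustration: each "find partition" step is done by brute force over all two-block partitions (only feasible for small sets), the starting partition is passed in explicitly instead of being drawn at random so the run is reproducible, and the co-occurrence counts are invented.

```python
import math
from itertools import combinations

def mutual_information(counts, part_t, part_x):
    """I(P;Q) in bits for two-block partitions of translations and values.

    counts maps (translation, indicator_value) -> co-occurrence count;
    part_t / part_x map each translation / value to block id 0 or 1.
    """
    total = sum(counts.values())
    joint = {}
    for (t, x), n in counts.items():
        key = (part_t[t], part_x[x])
        joint[key] = joint.get(key, 0.0) + n / total
    p = {a: sum(v for (i, _), v in joint.items() if i == a) for a in (0, 1)}
    q = {b: sum(v for (_, j), v in joint.items() if j == b) for b in (0, 1)}
    return sum(v * math.log2(v / (p[a] * q[b]))
               for (a, b), v in joint.items() if v > 0)

def binary_partitions(items):
    """Yield every split of items into two non-empty blocks."""
    items = sorted(items)
    first, rest = items[0], items[1:]
    for size in range(len(rest)):  # block 0 always contains `first`
        for extra in combinations(rest, size):
            block0 = {first, *extra}
            yield {item: 0 if item in block0 else 1 for item in items}

def flip_flop(counts, initial_part_t):
    """Alternate between re-partitioning Q and P until I(P;Q) stops improving."""
    translations = {t for t, _ in counts}
    values = {x for _, x in counts}
    part_t, best = dict(initial_part_t), -1.0
    while True:
        part_x = max(binary_partitions(values),
                     key=lambda cand: mutual_information(counts, part_t, cand))
        part_t = max(binary_partitions(translations),
                     key=lambda cand: mutual_information(counts, cand, part_x))
        score = mutual_information(counts, part_t, part_x)
        if score <= best + 1e-12:
            return part_t, part_x, score
        best = score

# Hypothetical counts of (English translation, French object) pairs.
counts = {
    ("take", "measure"): 8, ("take", "note"): 6,
    ("make", "decision"): 9, ("make", "measure"): 1,
    ("speak", "word"): 7,
}
part_t, part_x, score = flip_flop(counts, {"take": 0, "make": 1, "speak": 1})
```

Each step can only keep or increase I(P; Q), so the loop terminates; like any alternating search it may stop at a local optimum that depends on the starting partition.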

Page 12: Word Sense Disambiguation.ppt

Dictionary-Based Disambiguation

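Used when no sense-category information is available for the word
Three approaches
    Using only the sense definitions in a dictionary (Lesk, 1986)
    Using thesaurus information (Yarowsky, 1992)
    Using a bilingual dictionary and a second-language corpus (Dagan and Itai, 1994)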

Page 13: Word Sense Disambiguation.ppt

Disambiguation based on sense definitions

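Uses dictionary definitions
    D1 … DK are the dictionary definitions of the senses s1 … sK
Algorithm (p.243, Fig 7.3)

    comment: Given context c
    for all senses sk of w do
        score(sk) = overlap(Dk, ∪ Evj over all vj in c)
    end
    choose s' = argmax score(sk)

    * Evj: the set of words in the dictionary definition of the context word vj

Accuracy: 50% ~ 70%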

Page 14: Word Sense Disambiguation.ppt

Example

Word: 'ash'

Dictionary definitions

Sense              Definition
s1 tree            a tree of the olive family
s2 burned stuff    the solid residue left when combustible material is burned

Scoring

Scores
s1   s2   Context
0    1    This cigar burns slowly and creates a stiff ash.
1    0    The ash is one of the last trees to come into leaf.
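The 'ash' scores can be reproduced with a simplified overlap count. This sketch is not Lesk's full procedure: it compares each sense definition with the context words themselves rather than with the dictionary definitions of the context words, and the stopword list and the crude suffix stripper (which lets 'burns' match 'burned') are assumptions chosen for this example.

```python
import re

# Assumed stopword list, just large enough for this example.
STOPWORDS = {"a", "the", "of", "is", "one", "to", "and", "this", "when", "into"}

def normalize(text):
    """Lowercase, tokenize, drop stopwords, strip a trailing 'ed' or 's'."""
    stems = set()
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in STOPWORDS:
            continue
        for suffix in ("ed", "s"):
            if token.endswith(suffix) and len(token) > len(suffix) + 2:
                token = token[: -len(suffix)]
                break
        stems.add(token)
    return stems

DEFINITIONS = {
    "s1": "a tree of the olive family",
    "s2": "the solid residue left when combustible material is burned",
}

def simplified_lesk(context, definitions=DEFINITIONS):
    """Score each sense by the word overlap of its definition with the context."""
    context_stems = normalize(context)
    scores = {sense: len(normalize(definition) & context_stems)
              for sense, definition in definitions.items()}
    return max(scores, key=scores.get), scores

sense_a, scores_a = simplified_lesk("This cigar burns slowly and creates a stiff ash.")
sense_b, scores_b = simplified_lesk("The ash is one of the last trees to come into leaf.")
```

With these definitions the first context overlaps only with s2 (through 'burn') and the second only with s1 (through 'tree').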

Page 15: Word Sense Disambiguation.ppt

Thesaurus-based Disambiguation

Uses the semantic category information of a thesaurus
Walker's algorithm (1987) (p.245, Fig. 7.4)

    comment: given context c
    for all senses sk of w do
        score(sk) = Σ over vj in c of δ(t(sk), vj)
    end
    choose s' = argmax score(sk)

    * δ(t(sk), vj) = 1 iff t(sk) is one of the subject codes of vj, and 0 otherwise

Yarowsky's algorithm
    Uses a Bayes classifier
    Finds the category of the context and uses it to find the category of the word, which determines its sense
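Walker's scoring amounts to counting the context words whose subject codes include the category of the candidate sense. A minimal sketch follows; the mini thesaurus (`subject_codes`), the sense-to-category map for 'bass', and the example context are all hypothetical.

```python
def walker(context_words, sense_codes, subject_codes):
    """Pick the sense whose thesaurus category matches the most context words.

    sense_codes: sense -> its category t(s_k)
    subject_codes: word -> set of subject codes listed for that word
    """
    scores = {
        sense: sum(1 for v in context_words
                   if code in subject_codes.get(v, set()))
        for sense, code in sense_codes.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical thesaurus entries, for illustration only.
subject_codes = {
    "guitar": {"MUSIC"},
    "band": {"MUSIC"},
    "fishing": {"ANIMAL", "SPORT"},
    "lake": {"GEOGRAPHY"},
}
sense_codes = {"musical sense": "MUSIC", "fish": "ANIMAL"}

context = ["the", "band", "needs", "a", "guitar", "and", "a", "bass"]
best_sense, walker_scores = walker(context, sense_codes, subject_codes)
```

Words missing from the thesaurus contribute nothing, so coverage of the thesaurus directly limits what this score can see.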

Page 16: Word Sense Disambiguation.ppt

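Yarowsky's algorithm

Compute the score of each context (p.246, Fig 7.5), under the Naive Bayes assumption

$$\mathrm{score}(c_i, t_l) = P(t_l \mid c_i)$$

$$P(t_l \mid c_i) = \frac{P(c_i \mid t_l)}{P(c_i)}\, P(t_l) = \frac{\prod_{v \text{ in } c_i} P(v \mid t_l)}{\prod_{v \text{ in } c_i} P(v)}\, P(t_l)$$

For sense s',

$$s' = \arg\max_{s_k} \left[\log P(c \mid t(s_k)) + \log P(t(s_k))\right]$$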

Page 17: Word Sense Disambiguation.ppt
Page 18: Word Sense Disambiguation.ppt

Some Results

Roget categories

Word        Sense                 Roget category    Accuracy
bass        musical senses        MUSIC             99%
            fish                  ANIMAL, INSECT    100%
star        space object          UNIVERSE          96%
            celebrity             ENTERTAINER       95%
            star shaped object    INSIGNIA          82%
interest    curiosity             REASONING         88%
            advantage             INJUSTICE         34%
            financial             DEBT              90%
            share                 PROPERTY          38%

Page 19: Word Sense Disambiguation.ppt

Disambiguation based on translations in a second-language corpus

Dagan and Itai (1994)
    The sense is decided from the distribution of translations
Algorithm (p.249, Fig 7.6)
    The sense is decided from how the translations of the co-occurring words are distributed in the second-language corpus

    comment: Given: a context c in which w occurs in relation R(w, v)
    for all senses sk of w do
        score(sk) = |{c ∈ S | ∃ w' ∈ T(sk), v' ∈ T(v): R(w', v') ∈ c}|
    end
    choose s' = argmax score(sk)

    * S: second-language corpus
    * T(x): set of possible translations of x

Page 20: Word Sense Disambiguation.ppt

Example

'interest'

                       sense 1                 sense 2
Definition             legal share             attention, concern
Translation            Beteiligung             Interesse
English collocation    acquire an interest     show interest
Translation            Beteiligung erwerben    Interesse zeigen

'show interest': show → zeigen
    zeigen co-occurs with Interesse, so sense 2 is chosen
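The 'interest' example can be run through a small version of the counting step. This is a sketch with assumptions: every context of the second-language corpus is reduced to one (verb, object) relation pair, the translation tables are tiny hand-written dictionaries, and the four German pairs are an invented stand-in for a real corpus.

```python
def dagan_itai(verb, sense_translations, verb_translations, corpus_relations):
    """Score each sense by counting matching relation pairs in the corpus.

    sense_translations: sense -> set of translations T(s_k) of the noun
    verb_translations: verb -> set of translations T(v)
    corpus_relations: (verb, object) pairs from the second-language corpus S
    """
    allowed_verbs = verb_translations[verb]
    scores = {
        sense: sum(1 for v2, n2 in corpus_relations
                   if v2 in allowed_verbs and n2 in nouns)
        for sense, nouns in sense_translations.items()
    }
    return max(scores, key=scores.get), scores

sense_translations = {
    "sense 1 (legal share)": {"Beteiligung"},
    "sense 2 (attention, concern)": {"Interesse"},
}
verb_translations = {"show": {"zeigen"}, "acquire": {"erwerben"}}

# Hypothetical relation pairs extracted from a German corpus.
corpus_relations = [
    ("zeigen", "Interesse"),
    ("zeigen", "Interesse"),
    ("erwerben", "Beteiligung"),
    ("zeigen", "Weg"),
]

chosen, relation_scores = dagan_itai(
    "show", sense_translations, verb_translations, corpus_relations)
```

For 'show interest' only the pair zeigen + Interesse is found, so sense 2 gets all the counts; for 'acquire an interest' the same function would select sense 1.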

Page 21: Word Sense Disambiguation.ppt

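One Sense per Discourse, One Sense per Collocation

One sense per discourse
    Within a single document, a word is very likely to be used in only one sense
One sense per collocation
    Nearby words tend to give strong hints about the sense of the target word
    Collocation information is used to decide the sense of the word (for a collocation word f):

$$\frac{P(s_{k_1} \mid f)}{P(s_{k_2} \mid f)}$$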
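One way to use the ratio P(s_k1|f) / P(s_k2|f) is to rank collocations by the absolute value of its logarithm, so that the most decisive clues come first. The sketch below makes its own assumptions: the smoothing constant `alpha` (needed to avoid division by zero), the sense labels, and the tagged examples are all illustrative.

```python
import math
from collections import Counter

def rank_collocations(examples, sense_a, sense_b, alpha=0.1):
    """Rank collocations f by |log(P(sense_a | f) / P(sense_b | f))|.

    examples: list of (collocation, sense) pairs from tagged occurrences.
    """
    counts = {sense_a: Counter(), sense_b: Counter()}
    for collocation, sense in examples:
        counts[sense][collocation] += 1
    ranked = []
    for f in set(counts[sense_a]) | set(counts[sense_b]):
        a = counts[sense_a][f] + alpha
        b = counts[sense_b][f] + alpha
        # Both probabilities share the denominator C(f), so it cancels.
        ranked.append((f, math.log(a / b)))
    return sorted(ranked, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical tagged occurrences of 'interest' with a neighbouring word.
examples = [
    ("show", "attention"), ("show", "attention"), ("show", "attention"),
    ("acquire", "share"), ("acquire", "share"),
    ("in", "attention"), ("in", "share"),
]
ranking = rank_collocations(examples, "attention", "share")
```

A positive log ratio votes for the first sense and a negative one for the second; a value near zero, as for 'in' here, means the collocation carries almost no information.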

Page 22: Word Sense Disambiguation.ppt
Page 23: Word Sense Disambiguation.ppt

Unsupervised Disambiguation

Completely unsupervised disambiguation
    Sense tagging is not possible
    Context-group discrimination
        Contexts are grouped by clustering
A probabilistic model similar to Gale et al.'s Bayes classifier
    For a fixed K, assume groups (senses) s1 … sK
    Compute P(sk|c)
    The probabilities are estimated with the EM algorithm (p.254 Fig 7.8)
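The EM estimation can be sketched as a mixture of K bag-of-words models. This is an illustrative implementation under assumptions, not the procedure of the cited figure: the random initialization, the fixed number of iterations, the small smoothing constant `alpha` in the M-step, and the toy contexts are all choices made for the example.

```python
import math
import random

def em_senses(contexts, k, iterations=20, seed=0, alpha=1e-2):
    """Cluster contexts into k sense groups with EM.

    Returns (prior, word_prob, posteriors, log_likelihoods), where
    posteriors[i][s] approximates P(s_k | c_i).
    """
    rng = random.Random(seed)
    vocab = sorted({w for c in contexts for w in c})
    # Random initialisation of P(v | s_k); uniform P(s_k).
    word_prob = []
    for _ in range(k):
        weights = [rng.random() + 0.5 for _ in vocab]
        total = sum(weights)
        word_prob.append({w: x / total for w, x in zip(vocab, weights)})
    prior = [1.0 / k] * k
    posteriors, log_likelihoods = [], []
    for _ in range(iterations):
        # E-step: posterior of each group given each context (log space).
        posteriors, log_likelihood = [], 0.0
        for c in contexts:
            logs = [math.log(prior[s]) + sum(math.log(word_prob[s][w]) for w in c)
                    for s in range(k)]
            top = max(logs)
            norm = top + math.log(sum(math.exp(x - top) for x in logs))
            posteriors.append([math.exp(x - norm) for x in logs])
            log_likelihood += norm
        log_likelihoods.append(log_likelihood)
        # M-step: re-estimate P(v | s_k) and P(s_k) from the soft counts.
        for s in range(k):
            soft = {w: alpha for w in vocab}  # assumed smoothing
            for h, c in zip(posteriors, contexts):
                for w in c:
                    soft[w] += h[s]
            total = sum(soft.values())
            word_prob[s] = {w: x / total for w, x in soft.items()}
            prior[s] = sum(h[s] for h in posteriors) / len(contexts)
    return prior, word_prob, posteriors, log_likelihoods

# Hypothetical untagged contexts of one ambiguous word.
contexts = [
    ["prices", "prescription", "patent"],
    ["prescription", "consumer", "pharmaceutical"],
    ["prices", "pharmaceutical", "patent"],
    ["abuse", "cocaine", "traffickers"],
    ["cocaine", "alcohol", "abuse"],
    ["traffickers", "alcohol", "cocaine"],
]
prior, word_prob, posteriors, log_likelihoods = em_senses(contexts, k=2)
```

The resulting groups are unlabeled: the model can tell contexts apart, but it cannot say which dictionary sense a group corresponds to, which is why sense tagging as such is not possible in this setting.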

Page 24: Word Sense Disambiguation.ppt
Page 25: Word Sense Disambiguation.ppt

Unsupervised Disambiguation (cont.)

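Choosing K
    A larger K gives finer sense distinctions
    A larger K needs a larger training corpus
    K is chosen according to the size of the corpus
Sense differences can be distinguished without consulting a dictionary or a tagged corpus.
Useful for information retrieval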

Page 26: Word Sense Disambiguation.ppt

Word Sense

What is a word sense?
    A mental representation of differences in meaning
    Criterion for defining senses: does it correctly reflect the mental representation?
Systematic Polysemy
    Co-activation (p.258, 7.9, 7.10)
    'the act of X' and 'the people doing X'
        organization, administration, formation …
    Proper nouns: Brown, Bush, Army …

Application