information retrieval using word senses: root sense tagging approach sang-bum kim, hee-cheol seo and...

Information Retrieval using Information Retrieval using Word Senses: Root Sense Word Senses: Root Sense

Tagging ApproachTagging Approach

Sang-Bum Kim, Hee-Cheol Seo and Hae-Chang RimNatural Language Processing Lab.,

Department of Computer Science and Engineering, Korea University

IntroductionIntroduction

Since natural language has its lexical ambiguity, Since natural language has its lexical ambiguity, it is predictable that text retrieval systems it is predictable that text retrieval systems benefits from resolving ambiguities from all words benefits from resolving ambiguities from all words in a given collection.in a given collection.

Previous IR experiments using word senses have Previous IR experiments using word senses have shown disappointing results.shown disappointing results.

Some reasons for previous failures: skewed sense Some reasons for previous failures: skewed sense frequencies, collocation problem, inaccurate frequencies, collocation problem, inaccurate sense disambiguation, etc.sense disambiguation, etc.


WSD for IR tasks should be performed on all WSD for IR tasks should be performed on all ambiguous words in a collection since we cannot ambiguous words in a collection since we cannot know user query in advance.know user query in advance.

Performance of WSD reach at most about 75% Performance of WSD reach at most about 75% precision and recall on all word task in SENSEVAL precision and recall on all word task in SENSEVAL competition. (about 95% in the lexical sample competition. (about 95% in the lexical sample task)task)


Some observations that sense disambiguation for Some observations that sense disambiguation for crude tasks such as IR is different from crude tasks such as IR is different from traditional word sense disambiguation.traditional word sense disambiguation.

It’s arguable that fine-grained word sense It’s arguable that fine-grained word sense disambiguation is necessary to improve retrieval disambiguation is necessary to improve retrieval performance.performance.

ex: “stock” has 17 different senses in WordNet.ex: “stock” has 17 different senses in WordNet.

For IR, For IR, consistent consistent disambiguation is more disambiguation is more important than important than accurate accurate disambiguation, and disambiguation, and flexibleflexible disambiguation is better than disambiguation is better than strictstrict disambiguation.disambiguation.

Root Sense Tagging ApproachRoot Sense Tagging Approach

This approach aims to improve the performance This approach aims to improve the performance of large-scale text retrieval by conducting of large-scale text retrieval by conducting coarse-grainedcoarse-grained, consistent, and flexible WSD., consistent, and flexible WSD.

25 25 root sensesroot senses for the nouns in WordNet 2.0 are for the nouns in WordNet 2.0 are used.used.

ex:ex:

““story” has 6 senses in WordNetstory” has 6 senses in WordNet－－ {message, fiction, history, report, fib} are from the {message, fiction, history, report, fib} are from the same root same root sense of “relation”.sense of “relation”.

－－ {floor} is from the root sense of “artifact”.{floor} is from the root sense of “artifact”.


The root sense tagger classifies each noun in the The root sense tagger classifies each noun in the documents and queries into one of the 25 root senses, documents and queries into one of the 25 root senses, so it is called so it is called coarse-grained disambiguationcoarse-grained disambiguation..


When classifying a given ambiguous word, the When classifying a given ambiguous word, the most informative neighboring clue word have the most informative neighboring clue word have the highest MI with the given word is selected.highest MI with the given word is selected.

The single most probable sense among the The single most probable sense among the candidate root senses for the given word is candidate root senses for the given word is chosen according to the MI between the selected chosen according to the MI between the selected neighboring clue word and each candidate root neighboring clue word and each candidate root sense.sense.


There are 101,778 There are 101,778 non-ambiguous unitsnon-ambiguous units in in WordNet 2.0.WordNet 2.0.

ex: “actor” = {role player, doer} → personex: “actor” = {role player, doer} → person

““computer system” → artifactcomputer system” → artifact

Co-occurrence Data Co-occurrence Data ConstructionConstruction

The steps to extract co-occurrence information The steps to extract co-occurrence information from each document:from each document:

1.1. Assign root sense to each non-ambiguous noun in Assign root sense to each non-ambiguous noun in the document.the document.

2.2. Assign a root sense to each second noun of non-Assign a root sense to each second noun of non-ambiguous compound nouns in the document.ambiguous compound nouns in the document.

3.3. Even if any noun tagged in step 2 occurs alone in Even if any noun tagged in step 2 occurs alone in other position, assign the same root sense in step other position, assign the same root sense in step 2.2.

4.4. For each sense-assigned noun in the document, For each sense-assigned noun in the document, extract all extract all ((context wordcontext word, , sensesense) pairs within a ) pairs within a predefined window.predefined window.

5.5. Extract all (Extract all (wordword,, word word)) pairspairs..

Co-occurrence Data Co-occurrence Data ConstructionConstruction

MI-based Root Sense TaggingMI-based Root Sense Tagging

““system” has 9 fine-grained senses in WordNet, and 5 system” has 9 fine-grained senses in WordNet, and 5 root senses: root senses: artifactartifact, , cognitioncognition, , bodybody, , substancesubstance and and attributeattribute..

Indexing and RetrievalIndexing and Retrieval

26-26-bit sense field is added to each term posting bit sense field is added to each term posting element in index.element in index. 1 bit is used for 1 bit is used for unkunk assigned to unknown words. assigned to unknown words.

If s(If s(ww) is set to null or ) is set to null or ww is not a noun, all the bits is not a noun, all the bits are 0.are 0.

Two situations must be considered:Two situations must be considered: Several different root senses may be assigned to Several different root senses may be assigned to

the same word within a document.the same word within a document. Only nouns are sense tagged, but a verb with the Only nouns are sense tagged, but a verb with the

same indexing keyword form may exist in the same indexing keyword form may exist in the document.document.


A A sense-oriented term weighting methodsense-oriented term weighting method is proposed. is proposed. Traditional term-based index.Traditional term-based index. Term weight is transformed by using sense weight Term weight is transformed by using sense weight swsw

calculated by referring to the sense field.calculated by referring to the sense field.

Sense weight Sense weight swswijij for term for term ttii in document in document ddjj is defined as: is defined as:

swswijij = 1 += 1 + α α ．． qq((dsfdsfijij, , qsfqsfii))

where where dsfdsfijij and and qsfqsfii indicate the sense field of term indicate the sense field of term ttii in in document document ddii and query respectively. Sense-matching and query respectively. Sense-matching function function qq is defined as: is defined as:

Data and Evaluation Data and Evaluation MethodologiesMethodologies

Two document collections and two query sets are Two document collections and two query sets are used.used. DocumentsDocuments

210,157 documents of Financial Times collection in 210,157 documents of Financial Times collection in TREC CD vol.4TREC CD vol.4

127,742 documents of LA Times collection in vol.5127,742 documents of LA Times collection in vol.5

QueriesQueries TREC 7(351-400) and TREC 8(401-450) queriesTREC 7(351-400) and TREC 8(401-450) queries

Three baseline term weighting methodThree baseline term weighting method W1: simple W1: simple idfidf weighting weighting W2: W2: tftf ．． idfidf weighting weighting W3: (1 + W3: (1 + loglog((tftf)))) ．． idfidf weighting weighting

Experiment ResultsExperiment Results

Sense weight parameter α is set to 0.5.Sense weight parameter α is set to 0.5. More improvements is obtained in the long queries More improvements is obtained in the long queries

experiments.experiments. Overgrown weight in W3(W2)+senseOvergrown weight in W3(W2)+sense

Pseudo Relevance FeedbackPseudo Relevance Feedback

Five terms from the top ten documents are Five terms from the top ten documents are selected by the probabilistic term selection selected by the probabilistic term selection method.method.

For the sense fields of the new query terms in For the sense fields of the new query terms in +sense +sense experiments, a voting method is used, experiments, a voting method is used, which the most frequent root sense in the top 10 which the most frequent root sense in the top 10 documents is assigned to the new terms.documents is assigned to the new terms.

Result of Pseudo Relevance Result of Pseudo Relevance FeedbackFeedback

BM25 using Root SensesBM25 using Root Senses

ConclusionsConclusions

A coarse-grained, consistent, and flexible sense A coarse-grained, consistent, and flexible sense tagging method is proposed to improve large-tagging method is proposed to improve large-scale text retrieval performance.scale text retrieval performance.

This approach can be applied to retrieval This approach can be applied to retrieval systems in other languages in cases where there systems in other languages in cases where there are lexical resources much more roughly are lexical resources much more roughly constructed than expensive resources like constructed than expensive resources like WordNet.WordNet.

The proposed sense-field based indexing and The proposed sense-field based indexing and sense-weight oriented ranking do not seriously sense-weight oriented ranking do not seriously increase system overhead.increase system overhead.

ConclusionsConclusions

Experiment results show good performance even Experiment results show good performance even with the relevance feedback method or state-of-with the relevance feedback method or state-of-the-art BM25 retrieval model.the-art BM25 retrieval model.

Verbs should be assigned with senses in future Verbs should be assigned with senses in future work.work.

It is essential to develop an elaborate retrieval It is essential to develop an elaborate retrieval model, i.e., a term weighting model considering model, i.e., a term weighting model considering word senses.word senses.

information retrieval using word senses: root sense tagging approach sang-bum kim, hee-cheol seo and...

Documents

assign root sense

root sense of artifact

root sense of relation

sense pairs

probable sense

sense field

word senses

candidate root senses