![Page 1: USING WORDNET TO RETRIEVE WORDS FROM THEIR MEANINGS İlknur Durgar El-Kahlout and Kemal Oflazer Sabancı University İstanbul, Turkey](https://reader035.vdocuments.net/reader035/viewer/2022062621/551c1902550346b24f8b57ef/html5/thumbnails/1.jpg)
USING WORDNET TO RETRIEVE WORDS FROM THEIR MEANINGS
İlknur Durgar El-Kahlout and Kemal Oflazer
Sabancı University, İstanbul, Turkey
Problem
For a given definition, find the appropriate word (or words)
A traditional dictionary is of no use here, because it is looked up by word rather than by meaning
Instead, search the dictionary for a word whose definition is “similar” to the given one
Example
User definition:
  Akımı ölçmek için kullanılan alet (A device that is used to measure the current)
In the dictionary:
  akımölçer: elektrik akımının şiddetini ölçmeye yarayan araç, ampermetre (ammeter: a device that measures the intensity of electrical current, amperemeter)
The question is how to get from the user definition to the dictionary entry
Applications
Computer-assisted language learning
Solving crossword puzzles
Reverse dictionary
Outline
Problem statement
Meaning-to-Word System (MTW)
Our Approach
Methods
Results
Result Summary
Conclusion
Problem Statement
Find the “similarity” between two definitions
  Akımı ölçmek için kullanılan alet (A device that is used to measure the current)
  Elektrik akımının şiddetini ölçmeye yarayan araç, ampermetre (a device that measures the intensity of electrical current, amperemeter)
Meaning-to-Word (MTW)
Addresses the problem of finding the appropriate word (or words) whose meaning “matches” the given definition
Two subproblems:
  finding words whose definitions are “similar” to the query in some sense
  ranking the candidate words in a variety of ways
Information Flow in MTW
[Figure: User Definition → (query) → Search in Dictionary → (candidates) → Rank Candidates → List of words]
Available Resources
Turkish Monolingual Dictionary: about 50,000 entries
Turkish WordNet: about 11,000 synsets
[Figure: MTW information flow diagram, with the Normalization step marked]
Normalization
Tokenization
Stemming
Stop Word Elimination
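The three normalization steps can be illustrated with a minimal Python sketch. The stop-word list and the stem lookup table below are hypothetical stand-ins, chosen only so that the query used later in these slides reduces to its three content stems; a real system for Turkish would presumably rely on a morphological analyzer rather than a lookup table.

```python
import re

# Hypothetical resources, chosen only to reproduce the slides' query example.
STOP_WORDS = {"daha", "hiç"}
STEM_TABLE = {"evlenmemiş": "evlen"}


def tokenize(text):
    """Split text into lowercase word tokens."""
    # Note: str.lower() does not apply Turkish I/ı casing rules;
    # that is harmless for this all-lowercase example.
    return re.findall(r"\w+", text.lower())


def stem(token):
    """Look up a stem; unknown tokens are returned unchanged."""
    return STEM_TABLE.get(token, token)


def normalize(text):
    """Tokenize, stem, then drop stop words (the order listed on the slide)."""
    stems = [stem(t) for t in tokenize(text)]
    return [s for s in stems if s not in STOP_WORDS]


print(normalize("daha önce hiç evlenmemiş kişi"))  # ['önce', 'evlen', 'kişi']
```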
[Figure: MTW information flow diagram, with the Query Processing step marked]
Query Processing: Subset Generation
Search with different subsets of the query words
Select the informative words from the user’s query
Query: daha önce hiç evlenmemiş kişi (a person who has never been married)
  {önce, evlen, kişi} (before, marry, person)
  {evlen, kişi} (marry, person), {önce, kişi} (before, person), {önce, evlen} (before, marry)
  {evlen} (marry), {önce} (before), {kişi} (person)
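The subset generation shown above can be sketched in Python with `itertools.combinations`. The function name `generate_subsets` is an assumption made for illustration; the slides do not say how the subsets are actually enumerated.

```python
from itertools import combinations


def generate_subsets(stems):
    """Return every non-empty subset of the stems, largest subsets first."""
    subsets = []
    for size in range(len(stems), 0, -1):
        subsets.extend(frozenset(c) for c in combinations(stems, size))
    return subsets


subsets = generate_subsets(["önce", "evlen", "kişi"])
for subset in subsets:
    print(sorted(subset))
print(len(subsets))  # 7
```

A query with n informative words yields 2^n - 1 non-empty subsets, so the count grows quickly with query length; this is one reason to keep only the informative words before generating subsets.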
Query Processing: Subset Sorting
An unordered list of subsets is insufficient
Rank the generated subsets
1) By the number of words
  {önce, evlen, kişi} (before, marry, person)
  {evlen, kişi} (marry, person)
2) By the sum of the logarithms of the word frequencies
  {evlen, kişi} (marry, person)
  {önce, kişi} (before, person)
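One way to read the two criteria is as a primary and a secondary sort key, sketched below in Python. This reading is an assumption, as is the direction of the second key (subsets of rarer words first, which agrees with the example order on the slide). The frequency values are invented for illustration.

```python
import math
from itertools import combinations

# Hypothetical corpus frequencies of the stems; illustrative values only.
FREQ = {"önce": 5000, "evlen": 300, "kişi": 2000}


def log_freq_sum(subset):
    """Sum of the logarithms of the word frequencies in the subset."""
    return sum(math.log(FREQ[word]) for word in subset)


def sort_subsets(subsets):
    """Assumed ordering: more words first, then smaller log-frequency sum."""
    return sorted(subsets, key=lambda s: (-len(s), log_freq_sum(s)))


words = ["önce", "evlen", "kişi"]
subsets = [frozenset(c)
           for size in range(1, len(words) + 1)
           for c in combinations(words, size)]

for subset in sort_subsets(subsets):
    print(sorted(subset), round(log_freq_sum(subset), 2))
```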
[Figure: MTW information flow diagram, with the Searching for Meanings step marked]
Searching for Meanings
Two methods:
  Stem Matching
  Query Expansion (using WordNet)
Stem Matching
Morphological normalization of words
Find meanings that contain morphological variants of the words in the original definition
Stem Matching (Ex.)
{ akımı ölçmek için kullanılan alet } (A device that is used to measure the current)
Candidate stems for each query word:
  akımı: ak (white), akım (current), akı (flux)
  ölçmek: ölç (measure)
  için: için (to), iç (drink)
  kullanılan: kullan (use), kul (slave)
  alet: alet (device)
On the slide, the matching stems are shown in color
Stem Matching
User definition: akımı ölçmek için kullanılan alet (A device that is used to measure the current)
Dictionary definition: elektrik akımının şiddetini ölçmeye yarayan araç, ampermetre (a device that measures the intensity of electrical current, amperemeter)
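A minimal Python sketch of stem matching on this pair of definitions follows. The candidate stems for the query words are the ones listed in the example; the stems for the dictionary definition are assumed analyses added for illustration. A query word counts as matched when any of its candidate stems also occurs among the stems of the definition.

```python
# Candidate stems per query word, as listed in the example.
QUERY = {
    "akımı": {"ak", "akım", "akı"},
    "ölçmek": {"ölç"},
    "için": {"için", "iç"},
    "kullanılan": {"kullan", "kul"},
    "alet": {"alet"},
}

# Assumed analyses of the dictionary definition (hypothetical).
DEFINITION = {
    "elektrik": {"elektrik"},
    "akımının": {"ak", "akım", "akı"},
    "şiddetini": {"şiddet"},
    "ölçmeye": {"ölç"},
    "yarayan": {"yara"},
    "araç": {"araç"},
    "ampermetre": {"ampermetre"},
}


def matched_query_words(query, definition):
    """Return the query words sharing at least one stem with the definition."""
    definition_stems = set().union(*definition.values())
    return [word for word, stems in query.items() if stems & definition_stems]


print(matched_query_words(QUERY, DEFINITION))  # ['akımı', 'ölçmek']
```

Under these assumptions only two of the five query words match: `alet` (device) and `araç` (device) share no stem even though they mean nearly the same thing.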
Stem Matching: Drawbacks
Generates noisy stems
  ilim (science, my city) → ilim (science), il (city)
Conflates two words with very different meanings to the same stem
  ilim (science, my city), ilde (in the city) → il (city)
Cannot find relations between similar words
  kimse (someone) and kişi (person)
  bölüm (part) and kısım (portion)
Using Query Expansion
Two different approaches:
  Expand the query with related words (synonyms, specializations, generalizations)
  Expand the query using the relevant answers returned for the unexpanded query
WordNet synonyms are used in MTW
  {besin, gıda} (food, nourishment)
  {iyileş, düzel} (to get better) / {iyileş, geliş} (to improve)
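Query Expansion (Ex.)
{ akımı ölçmek için kullanılan alet } (A device that is used to measure the current)
Candidate stems and added synonyms for each query word:
  akımı: ak (white), akım (current), akı (flux); synonyms: beyaz (white), debi (flow rate), akış (flow)
  ölçmek: ölç (measure)
  için: için (to), iç (drink)
  kullanılan: kullan (use), kul (slave); synonyms: faydalan (benefit from), yararlan (make use of), köle (slave)
  alet: alet (device); synonyms: araç (tool), gereç (equipment)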
Query Expansion (Ex.)
User definition: akımı ölçmek için kullanılan alet (A device that is used to measure the current)
Dictionary definition: elektrik akımının şiddetini ölçmeye yarayan araç, ampermetre (a device that measures the intensity of electrical current, amperemeter)
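The effect of synonym expansion on this pair can be sketched in Python. The `SYNONYMS` table is a hypothetical stand-in for WordNet synset lookups; which stem each synonym belongs to is an assumption, and so are the stems of the dictionary definition.

```python
# Candidate stems per query word, as in the stem matching example.
QUERY = {
    "akımı": {"ak", "akım", "akı"},
    "ölçmek": {"ölç"},
    "için": {"için", "iç"},
    "kullanılan": {"kullan", "kul"},
    "alet": {"alet"},
}

# Hypothetical stand-in for WordNet synonym lookups.
SYNONYMS = {
    "ak": {"beyaz"},
    "akım": {"debi", "akış"},
    "kullan": {"faydalan", "yararlan"},
    "kul": {"köle"},
    "alet": {"araç", "gereç"},
}

# Assumed stems of the dictionary definition (hypothetical analyses).
DEFINITION_STEMS = {"elektrik", "ak", "akım", "akı", "şiddet",
                    "ölç", "yara", "araç", "ampermetre"}


def expand(stems):
    """Add the synonyms of every stem to the stem set."""
    expanded = set(stems)
    for s in stems:
        expanded |= SYNONYMS.get(s, set())
    return expanded


def matched_query_words(query, definition_stems, use_expansion):
    """Return the query words that share a stem (or synonym) with the definition."""
    matched = []
    for word, stems in query.items():
        candidates = expand(stems) if use_expansion else stems
        if candidates & definition_stems:
            matched.append(word)
    return matched


print(matched_query_words(QUERY, DEFINITION_STEMS, use_expansion=False))
print(matched_query_words(QUERY, DEFINITION_STEMS, use_expansion=True))
```

With expansion, `alet` additionally matches through its synonym `araç`, so three query words match instead of two.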
[Figure: MTW information flow diagram, with the Ranking step marked]
Ranking
A very important part of MTW
Having the right answer in the retrieved set is not enough
The aim is to have the right answer near the top of the retrieved set (e.g., within the first 50 answers)
Ranking
Simple but effective methods:
  Number of matched words
  Subset informativeness: frequency of the words in the subset
  Ratio of the number of matched words to the number of words in the candidate dictionary definition
  Longest Common Subsequence: order of the matched words
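Three of these signals can be sketched in Python as follows (subset informativeness is left out to keep the example short). Combining them as a tuple compared element by element is an assumption, since the slides do not say how the signals are weighted. The candidate entries and their stem lists, including the `cetvel` (ruler) entry, are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]


def score(query, definition):
    """Assumed scoring: (matched words, match ratio, LCS length)."""
    matched = len(set(query) & set(definition))
    ratio = matched / len(definition) if definition else 0.0
    return (matched, ratio, lcs_length(query, definition))


query = ["akım", "ölç", "için", "kullan", "alet"]

# Hypothetical candidate entries with already-stemmed definitions.
candidates = {
    "akımölçer": ["elektrik", "akım", "şiddet", "ölç", "yara", "araç", "ampermetre"],
    "cetvel": ["doğru", "çizgi", "çiz", "ölç", "yara", "araç"],
}

ranked = sorted(candidates, key=lambda w: score(query, candidates[w]), reverse=True)
print(ranked)  # ['akımölçer', 'cetvel']
```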
Some Statistics
Training sets:
  50 queries from users
  50 queries from a dictionary
Test sets:
  50 queries from users
  50 queries from a separate dictionary

| | Test set 1 (user) | Training set 1 | Test set 2 (dict.) | Training set 2 |
|---|---|---|---|---|
| # of queries | 50 | 50 | 50 | 50 |
| Avg. # of query words | 5.66 | 4.64 | 9.24 | 13.98 |
| Max. # of query words | 17 | 12 | 23 | 45 |
| Min. # of query words | 2 | 1 | 1 | 6 |
Stem Matching (all stems included)

| Rank | Test set 1 | Training set 1 | Test set 2 | Training set 2 |
|---|---|---|---|---|
| 1-10 | 13 (26%) | 18 (36%) | 45 (90%) | 41 (82%) |
| 11-50 | 7 (14%) | 12 (24%) | 2 (4%) | 5 (10%) |
| >50 | 19 (38%) | 10 (20%) | 3 (6%) | 4 (8%) |
| Not found | 11 (22%) | 10 (20%) | 0 (0%) | 0 (0%) |

Low percentage in the top 10 for user queries, but very high results for dictionary queries
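The rank ranges used in the results tables can be tallied with a small Python helper like the one below. The function names and the sample ranks are hypothetical; `None` stands for a query whose correct word was not retrieved at all.

```python
from collections import Counter

BUCKETS = ("1-10", "11-50", ">50", "Not found")


def rank_bucket(rank):
    """Map the rank of the correct word to one of the table's rank ranges."""
    if rank is None:
        return "Not found"
    if rank <= 10:
        return "1-10"
    if rank <= 50:
        return "11-50"
    return ">50"


def tally(ranks):
    """Return {bucket: (count, percentage)} over all queries."""
    counts = Counter(rank_bucket(r) for r in ranks)
    total = len(ranks)
    return {b: (counts[b], 100 * counts[b] / total) for b in BUCKETS}


# Hypothetical ranks of the correct word for eight queries.
sample_ranks = [1, 3, 12, 75, None, 10, 50, 51]
for bucket, (count, percent) in tally(sample_ranks).items():
    print(f"{bucket}: {count} ({percent:.1f}%)")
```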
Stem Matching (longest stem included, heuristics)

| Rank | Test set 1 | Training set 1 | Test set 2 | Training set 2 |
|---|---|---|---|---|
| 1-10 | 14 (28%) | 21 (42%) | 46 (92%) | 43 (86%) |
| 11-50 | 5 (10%) | 9 (18%) | 1 (2%) | 5 (10%) |
| >50 | 18 (36%) | 9 (18%) | 3 (6%) | 2 (4%) |
| Not found | 13 (26%) | 11 (22%) | 0 (0%) | 0 (0%) |

Improvement in user queries, slightly better performance in dictionary queries
Query Expansion (WordNet), all stems included

| Rank | Test set 1 | Training set 1 | Test set 2 | Training set 2 |
|---|---|---|---|---|
| 1-10 | 14 (28%) | 24 (48%) | 45 (90%) | 41 (82%) |
| 11-50 | 9 (18%) | 9 (18%) | 2 (4%) | 5 (10%) |
| >50 | 18 (36%) | 12 (24%) | 3 (6%) | 4 (8%) |
| Not found | 9 (18%) | 5 (10%) | 0 (0%) | 0 (0%) |

Better results in user queries, no change in dictionary queries
Query Expansion (WordNet), longest stem included (heuristics)

| Rank | Test set 1 | Training set 1 | Test set 2 | Training set 2 |
|---|---|---|---|---|
| 1-10 | 14 (28%) | 24 (48%) | 41 (82%) | 39 (78%) |
| 11-50 | 6 (12%) | 8 (16%) | 5 (10%) | 6 (12%) |
| >50 | 21 (42%) | 13 (26%) | 1 (2%) | 5 (10%) |
| Not found | 9 (18%) | 5 (10%) | 0 (0%) | 0 (0%) |

Better performance than “longest stem” matching in user queries, but worse performance in dictionary queries
Result Summary
Stem Matching (longest stem included)
  60% success in real user queries
  96% success in dictionary queries
Query Expansion (all stems included)
  68% success in real user queries
  92% success in dictionary queries
Conclusion
We have implemented a Meaning-to-Word system for Turkish
Results on unseen data are rather satisfactory
Query expansion performs better, although it cannot find the words for all queries
68% of real user queries and 90% of dictionary queries are found in the first 50 results
THANK YOU !