Use of WordNet and on-line dictionaries to build EN-SK synsets (experimental tool)



Use of WordNet and on-line dictionaries to build EN-SK synsets (experimental tool)

Ján GENČI

Technical University of Košice, Slovakia

[email protected]


Plan

• WordNet, EuroWordNet + Slovak language

• Motivation

• Solution

• Results

• Future plans


WordNet, EuroWordNet

• Well-known projects

• WordNet defines the meanings of English words and their relationships (it defines synsets)

• EuroWordNet (EWN) is a very similar multilingual project

• EWN doesn’t contain the Slovak language (there is no Slovak WN)


Motivation

• Text classification tasks require reduction of dimensionality and intelligent search
– Morphological database
– Something like WordNet


Our approach

• We decided to try using on-line dictionaries to map Slovak meanings to WordNet synset entries

• Two approaches:
– Intersection of the translations of each member of an EN synset
– Intersection of the translations of related words


Architecture

[Diagram components: input word; Synset Builder; WordNet DB; local DB; Internet on-line dictionaries]


Synset “members” translation

• According to WordNet, the word “computer” has 2 meanings, specified by 2 synsets:
– {computer, computing machine, computing device, data processor, electronic computer, information processing system}
– {calculator, reckoner, figurer, estimator, computer}

• The result is formed as the intersection of the translations of the synset members
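The intersection step can be illustrated with a minimal Python sketch. It is not the tool's actual code: the names `translate_synset`, `toy_lookup` and `TOY_DICTIONARY` are hypothetical, the dictionary entries are invented for illustration rather than taken from a real on-line dictionary, and skipping members that have no translation is an assumption the slides do not confirm.

```python
# Toy EN->SK entries, invented for illustration only (not real dictionary output).
TOY_DICTIONARY = {
    "computer": {"počítač", "kalkulačka", "počtár"},
    "computing machine": {"počítač", "počítací stroj"},
    "data processor": {"počítač", "procesor"},
}


def toy_lookup(word):
    """Stand-in for a query to an on-line dictionary (hypothetical)."""
    return set(TOY_DICTIONARY.get(word, ()))


def translate_synset(members, lookup):
    """Intersect the translations of all synset members.

    Assumption: members with no translation are skipped, otherwise a single
    missing dictionary entry would empty the whole result.
    """
    found = [t for t in (lookup(m) for m in members) if t]
    if not found:
        return set()
    return set.intersection(*found)


synset = ["computer", "computing machine", "data processor"]
candidates = translate_synset(synset, toy_lookup)
```

With a single-member synset the intersection cannot narrow anything down, and every translation of that one word is returned.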


Translation of related words

• Based on the hyponym/hyperonym relationship between words:
– Related words are translated
– The result is formed as the intersection of the partial translations
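One possible reading of this approach, sketched in Python under stated assumptions: the translations of the input word are narrowed by the translations of its related words. The relation table, the translations and the names `translate_via_related` and `toy_lookup` are all hypothetical and invented for illustration; the slides do not say exactly which related words are used or how the partial results are combined.

```python
# Toy data, invented for illustration only.
RELATED = {"table": ["desk", "dining table"]}  # hyponyms of the input word
TRANSLATIONS = {
    "table": {"stôl", "tabuľka"},
    "desk": {"písací stôl", "stôl"},
    "dining table": {"jedálenský stôl", "stôl"},
}


def toy_lookup(word):
    """Stand-in for a query to an on-line dictionary (hypothetical)."""
    return set(TRANSLATIONS.get(word, ()))


def translate_via_related(word, related, lookup):
    """Keep only translations of `word` shared with its related words.

    Assumption: related words without any translation are ignored.
    """
    candidates = lookup(word)
    for other in related.get(word, []):
        partial = lookup(other)
        if partial:
            candidates = candidates & partial
    return candidates


result = translate_via_related("table", RELATED, toy_lookup)
```

In this toy data the related words share only the furniture sense, so the data-table sense is filtered out.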


Results

• We use 4 Slovak and 2 Czech on-line dictionaries (the Slovak dictionaries seem to come from one source)

• The result depends on:
– Number of members in the synset (a single member is a problem)
– Related words
– Quality(?) of the dictionary


Results (cont.)

• Parts of speech are sometimes mixed (nouns and adjectives)

• We implemented “multilingual view”

• The approach is time-consuming (quite slow), so results are stored in the database
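Storing results so that a slow on-line lookup is not repeated could look roughly like the following Python sketch. The table layout and the names `open_cache` and `cached_translate` are assumptions made for illustration; the slides do not describe the schema or the database actually used.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a hypothetical translation cache."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS translations ("
        "word TEXT, lang TEXT, translation TEXT, "
        "PRIMARY KEY (word, lang, translation))"
    )
    return conn


def cached_translate(conn, word, lang, slow_lookup):
    """Return cached translations, calling slow_lookup only on a cache miss.

    Note: empty results are not cached in this sketch, so words without
    any translation are looked up again on every call.
    """
    rows = conn.execute(
        "SELECT translation FROM translations WHERE word = ? AND lang = ?",
        (word, lang),
    ).fetchall()
    if rows:
        return {row[0] for row in rows}
    result = set(slow_lookup(word, lang))
    conn.executemany(
        "INSERT OR IGNORE INTO translations VALUES (?, ?, ?)",
        [(word, lang, t) for t in result],
    )
    conn.commit()
    return result
```

Passing the lookup function as a parameter keeps the cache independent of any particular dictionary site.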


Examples

word computer


Example

word table


Future work (plans)

• To deal with “dictionary problem”

• To eliminate mixed parts of speech in the results (at least for the Slovak language, using a morphological database)

• To connect other languages


• Local copy of new webpage

• Addresses
– http://ruzin.fei.tuke.sk/~laposp
– http://ruzin.fei.tuke.sk/~sudynova (new one)


Thank you!