
Second Language Learning from News Websites
Word Sense Disambiguation using Word Embeddings

Demo

Workflow
1. Identify words on the page for the learner to learn
2. Select a contextually appropriate translation for those words
3. Replace those words with their translations in the article
4. The user can click on a word to learn more about it

Motivation
• Conducted a pilot study from May to August 2015
• The biggest issue found was the poor quality of translations

Workflow (recap)
1. Identify words on the page for the learner to learn
2. Select a contextually appropriate translation for those words
3. Replace those words with their translations in the article
4. The user can click on a word to learn more about it
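A minimal sketch of this workflow in Python (not from the slides; annotate_article, learner_words, and translate are hypothetical names, and a real system would operate on the page's DOM rather than raw text):

```python
def annotate_article(article_text, learner_words, translate):
    """Steps 1-4 above: find target words, translate them in
    context, and swap in clickable replacements."""
    tokens = article_text.split()
    out = []
    for i, token in enumerate(tokens):
        word = token.strip('.,;:!?').lower()
        if word in learner_words:                     # step 1
            # Step 2: pick a contextually appropriate translation
            # (this is where word sense disambiguation happens).
            translation = translate(word, tokens, i)
            # Steps 3-4: replace the word with a clickable element.
            out.append(f'<a class="learn" data-word="{word}">{translation}</a>')
        else:
            out.append(token)
    return ' '.join(out)
```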

Word Sense Disambiguation

WordNews: identifying the correct translation of an English word given the context
WSD: identifying the correct sense of an English word given the context

More specifically, our task is Cross-Lingual WSD

Word Sense Disambiguation
Navigli (2009): computational identification of meaning for words in context
• Evaluation using Senseval/SemEval tasks
• Open problem
• Variations:
  • Lexical sample vs. all words
  • Fine-grained vs. coarse-grained

Existing Approaches
• Supervised vs. unsupervised
• Knowledge-rich vs. knowledge-poor
  • Knowledge can be in the form of WordNet or dictionaries
• IMS is a supervised, knowledge-poor system

Features used in IMS
• Local collocations
• POS tags
• Surrounding words

Word Embeddings
• Representation of a word as a vector in a low-dimensional space
• Vector similarity correlates with semantic similarity
• For example, in Word2Vec, vector('king') - vector('man') + vector('woman') is close to vector('queen')
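The analogy can be checked with gensim and a pretrained model; a quick illustration (the vector file is a placeholder, not something used in this work):

```python
from gensim.models import KeyedVectors

# Load pretrained word2vec vectors (placeholder path).
vectors = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True)

# vector('king') - vector('man') + vector('woman') ~ vector('queen')
print(vectors.most_similar(positive=['king', 'woman'],
                           negative=['man'], topn=1))
# -> [('queen', ...)]
```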

[Figure: word2vec illustration, taken from http://deeplearning4j.org/word2vec.html]

Word Embeddings for WSD
Turian et al. (2010) presented a method of using word embeddings as an unsupervised feature in supervised NLP systems.
• Taghipour and Ng (2015) used Collobert and Weston's embeddings as a feature type in IMS

Turian, Joseph, Lev Ratinov, and Yoshua Bengio. "Word representations: a simple and general method for semi-supervised learning." Proceedings of ACL 2010.

Progress Made
• Use word embeddings in IMS
• Evaluate using the Senseval-2 and Senseval-3 lexical sample tasks
• Integrate IMS with WordNews

Implementation of feature type
Tried to replicate Taghipour and Ng's (2015) work, but was unable to completely replicate their results, so a different approach was used.

Taghipour and Ng's (2015) approach: concatenate the vectors of the surrounding words to form d * (w-1) dimensions

My approach: sum the vectors of the surrounding words to form d dimensions

Each dimension is used as a feature

Implementation of feature type
Example sentence: "Taking zinc syrup, tablets or lozenges can lessen the severity and duration of the common cold, experts believe."
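A sketch of this feature type on the example sentence, assuming a word-to-vector lookup table; the window size and tokenisation here are illustrative choices, not the exact IMS settings:

```python
import numpy as np

def surrounding_sum(tokens, target_index, embeddings, dim, window=5):
    """Sum the embeddings of words near the target to get one
    d-dimensional vector; each dimension becomes one IMS feature."""
    feature = np.zeros(dim)
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    for i in range(lo, hi):
        if i == target_index:
            continue
        vec = embeddings.get(tokens[i].lower())
        if vec is not None:          # skip out-of-vocabulary words
            feature += vec
    return feature

sentence = ("Taking zinc syrup , tablets or lozenges can lessen the "
            "severity and duration of the common cold , experts believe .")
tokens = sentence.split()
dim = 3                                               # toy dimensionality
embeddings = {'common': np.array([0.1, 0.2, 0.3]),    # toy vectors
              'duration': np.array([0.0, 0.5, 0.1])}
# Disambiguating "cold": sum the vectors in a window around it.
print(surrounding_sum(tokens, tokens.index('cold'), embeddings, dim))
```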

Implementation of feature type
• Turian et al. (2010) suggested scaling the standard deviation down to a target standard deviation
  • This prevents the embedding features from gaining a much higher influence than the binary features
• Implemented a variant of this done by Taghipour and Ng (2015)
  • Target standard deviation for each dimension
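A sketch of the per-dimension scaling, assuming the embedding features are collected in an (n_samples, d) matrix; the exact variant used by Taghipour and Ng (2015) may differ in details:

```python
import numpy as np

def scale_to_target_sigma(train_features, target_sigma=0.1):
    """Rescale each dimension so its standard deviation over the
    training data equals target_sigma, keeping the real-valued
    embedding features from dominating the binary features."""
    stds = train_features.std(axis=0)
    stds[stds == 0] = 1.0             # leave constant dimensions unscaled
    scale = target_sigma / stds       # one scaling factor per dimension
    return train_features * scale, scale   # reuse `scale` on test data
```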

Features used in IMS
• Local collocations
• POS tags
• Surrounding words
• Word embeddings

Evaluation: Comparison of word embeddings


Method                               Senseval-2   Senseval-3
Collobert and Weston, sigma = 0.1    0.672        0.739
Collobert and Weston, sigma = 0.05   0.664        0.735
Word2Vec, sigma = 0.1                0.663        0.733
Word2Vec, sigma = 0.05               0.676        0.744
GloVe, sigma = 0.1                   0.678        0.741
GloVe, sigma = 0.05                  0.674        0.738

Evaluation: Word Embeddings
This validates our use of word embeddings for the task: even the worst-performing configuration with word embeddings outperforms the baselines in the table below.

Method                        Senseval-2   Senseval-3
IMS + Word2Vec, sigma = 0.1   0.663        0.733
IMS + GloVe, sigma = 0.1      0.678        0.741
IMS                           0.653        0.726
Rank 1 system                 0.642        0.729
MFS (most frequent sense)     0.476        0.552

Integration of IMS with WordNews


Future Work
• Adapt word embeddings for WSD
• Evaluate our system on a gold-standard, human-annotated dataset
• Perform a longitudinal study
  • Extrinsic evaluation of WSD with real users on our system
  • Usability of our system
• Improve the selection of words

Summary
• WSD using word embeddings
• Used word embeddings as a feature type in IMS: sum up the word vectors of the surrounding words
• Evaluated on the Senseval-2 and Senseval-3 lexical sample tasks
• Future work

End
