multilingual sentiment analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... ·...

58
Multilingual Sentiment Analysis Dr. Furu Wei Natural Language Computing Group, Microsoft Research Asia [email protected]

Upload: ngobao

Post on 10-Jul-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Multilingual Sentiment Analysis

Dr. Furu Wei Natural Language Computing Group,

Microsoft Research Asia

[email protected]

Page 2: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Sentiment Analysis Tutorials

• Sentiment Analysis in Practice Tutorial by Yongzheng Zhang et al. (ebay Research Lab | ICDM | 2011)

• Sentiment Symposium Tutorial by Christopher Potts (Stanford Linguistics | 2011)

• Sentiment Analysis Tutorial by Bing Liu (University of Illinois at

Chicago | AAAI | 2011) • Opinion Mining and Sentiment Analysis: NLP Meets

Social Sciences by Bing Liu (University of Illinois at Chicago | 2010) • Opinion Mining and Summarization by Bing Liu (University

of Illinois at Chicago | WWW | 2008) • Sentiment analysis and opinion mining (survey) by Bo

Pang and Lillian Lee (Cornell University | 2008)

Page 3: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Bo Pang, Lillian Lee. Opinion Mining and Sentiment Analysis. Foundations and Trends in Information Retrieval 2(1-2): 1–135. 2008.

Bing Liu. Sentiment Analysis and Opinion Mining (Introduction and Survey), Morgan & Claypool Publishers, May 2012.

Page 4: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Thumb up or down

Positive or negative

Love or hate

Page 5: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Sentiment

• Sentiment := <Holder, Target, Polarity, Auxiliary>

– Holder: who expresses the sentiment

– Target: what/whom the sentiment is expressed to

– Polarity: the nature of the sentiment (positive, negative, or neutral)

– Auxiliary: strength, summary, confidence, time

Page 6: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Sentiment

• Sentiment := <Holder, Target, Polarity, Auxiliary>

– Holder: who expresses the sentiment

– Target: what/whom the sentiment is expressed to

– Polarity: the nature of the sentiment (e.g., positive/negative)

• In his recent State of the Union address, US President Bush quite unexpectedly labeled Iran, Iraq, and the DPRK as an “axis of evil”.

Negative

Page 7: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Sentiment

• The games in iPhone 4s are pretty funny!

Feature/Aspect

Target

Positive

Polarity

Holder = the user/reviewer

Page 8: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Sentiment Analysis

• Computational study of opinions, sentiments, appraisal, and emotions expressed in text

– Reviews, blogs, discussions, microblogs, social networks

• Also known as Opinion Mining

Page 11: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Daily Live Stock Market Prediction and Tracking using Twitter Sentiment

Page 12: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Basic Tasks for Sentiment Analysis

• Holder detection – Find who express the sentiment

• Target recognition – Find whom/what the sentiment is expressed

towards

• Sentiment (Polarity) classification – Positive, negative, neutral

• Opinion summarization

• Opinion spam detection

Page 13: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Lexicon Based Sentiment Classification

• Basic idea

– Use the dominant polarity of the opinion words in the sentence to determine its polarity

– If positive/negative opinion prevails, the opinion sentence is regarded as positive/negative

• Lexicon + Counting

• Lexicon + Grammar Rule + Inference Method

Minqing Hu and Bing Liu. Mining and summarizing customer reviews. KDD: 168-177, 2004. Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll, and Manfred Stede. Lexicon-Based Methods for Sentiment Analysis. Computational Linguistics: 37(2), 267-307. 2011.

Page 14: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

The General Inquirer Lexicon

• Homepage: http://www.wjh.harvard.edu/~inquirer

• Categories

– Positive (1,915 words) and Negative (2,291 words)

– Strong vs Weak, Active vs Passive, Overstated versus Understated

– Pleasure, Pain, Virtue, Vice, Motivation, Cognitive Orientation, etc

• Free for research use

Philip J. Stone, Dexter C Dunphy, Marshall S. Smith, Daniel M. Ogilvie. 1966. The General Inquirer: A Computer Approach to Content Analysis. MIT Press.

Page 15: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

LIWC (Linguistic Inquiry and Word Count)

• Homepage: http://www.liwc.net/

• 2,300 words, > 70 classes – Affective Processes

• negative emotion (bad, weird, hate, problem, tough) • positive emotion (love, nice, sweet)

– Cognitive Processes • Tentative (maybe, perhaps, guess), Inhibition (block,

constraint)

– Pronouns, Negation (no, never), Quantifiers (few, many)

• $30 or $90 fee

Pennebaker, J.W., Booth, R.J., & Francis, M.E. (2007). Linguistic Inquiry and Word Count: LIWC 2007.

Page 16: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

MPQA Subjectivity Cues Lexicon

• Homepage: http://www.cs.pitt.edu/mpqa/subj_lexicon.html

• 6,885 words from 8,221 lemmas

– 2,718 positive

– 4,912 negative

• Each word annotated for intensity (strong, weak)

• GNU GPL

Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2005). Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. HLT-EMNLP-2005. Riloff and Wiebe (2003). Learning extraction patterns for subjective expressions. EMNLP-2003.

Page 18: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

SentiWordNet

• Homepage: http://sentiwordnet.isti.cnr.it/

• All WordNet synsets automatically annotated for degrees of positivity, negativity, and neutrality/objectiveness

– [estimable(J,3)] “may be computed or estimated”

• Pos 0 Neg 0 Obj 1

– [estimable(J,1)] “deserving of respect or high regard”

• Pos .75 Neg 0 Obj .25

Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. 2010 SENTIWORDNET 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining. LREC-2010.

Page 19: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Disagreements between polarity lexicons

Opinion Lexicon

General Inquirer

SentiWordNet LIWC

MPQA 33/5402 (0.6%)

49/2867 (2%)

1127/4214 (27%)

12/363 (3%)

Opinion Lexicon 32/2411 (1%)

1004/3994 (25%)

9/403 (2%)

General Inquirer 520/2306 (23%)

1/204 (0.5%)

SentiWordNet 174/694 (25%)

LIWC

Christopher Potts, Sentiment Tutorial, 2011

Page 20: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Experiment

Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll, and Manfred Stede. Lexicon-Based Methods for Sentiment Analysis. Computational Linguistics: 37(2), 267-307. 2011.

Page 21: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

More on Lexicon Based Methods

• Negation

– Not good vs. bad

• Intensification

– Pretty good, very good, vs. good

• Composition

Page 22: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Machine Learning Based Sentiment Classification

• Basic idea

– Treat sentiment classification simply as a special case of topic-based categorization

• With the two “topics” being positive sentiment and negative sentiment

• Data + Feature + Model

Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP: 79-86, 2002.

Page 23: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Method

• Feature engineering

– Each document d is represented by a feature vector ~d:= (n1(d), n2(d), . . . , nm(d))

– ni(d) could indicate presence, term frequency

• Classification models

– Naive Bayes, Maximum Entropy, SVM

Page 24: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Data

• Movie reviews – Internet Movie Database (IMDb)

• http://www.cs.cornell.edu/people/pabo/movie-review-data/

• http://reviews.imdb.com/Reviews/

– 700 positive / 700 negative

• Experiment setting for ML classifiers – 3-fold cross validation

– Treating punctuation as separate lexical items

– No stemming or stoplists were used

Page 25: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Experimental Results

• Baseline: use a few words written by human to classify

• ML-based methods

Page 26: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

More on Learning Based Methods

• Negation features

– NOT_good, NOT_bad

• N-gram features

• POS tags features

• POS based pattern features

Page 27: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Which method is better

• No conclusion yet – Both are answers in different research papers

– Depend on details, data, domains, scenarios, etc.

• Observations – Lexicon: simple, intuitive, interpretable

– Learning: scale, data-driven, uninterruptable

• Maybe (?) – Sentence: lexicon is better

– Document: learning is better

Page 28: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Multilingual Sentiment Analysis

Page 29: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Task

• Train a sentiment classifier for a foreign language with labeled data on English and unlabeled parallel data on both languages

• Input – Labeled English data 𝐿𝑒

– Unlabeled parallel data 𝑈𝑒𝑓

– Labeled foreign language data 𝐿𝑓 (optional)

• Output – Sentiment classifier on the foreign language 𝐶𝑓

Page 30: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Labeled Corpus in English

(Translated) Labeled Corpus in Chinese

Chinese Sentiment Classifier

Machine Translation Engine

Method: MT-SVM

Page 31: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Labeled Corpus in English

Chinese Sentiment Classifier

English Sentiment Classifier

Machine Translation Engine

(Translated) Labeled Corpus in Chinese

Unlabeled Corpus in English

High Confident Samples in

English

Unlabeled Corpus in Chinese

High Confident Samples in

Chinese

Translated High Confident Samples in

English/Chinese

Machine Translation Engine

Method: MT-CoTraining

Page 32: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Labeled Corpus in English

(Pseudo) Labeled Corpus in Chinese

Chinese Sentiment Classifier

English Sentiment Classifier

Parallel Corpus

Method: Para-SVM

Page 33: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Labeled Corpus in English

Chinese Sentiment Classifier

English Sentiment Classifier

Parallel Corpus

Machine Translation Engine

(Translated) Labeled Corpus in Chinese

Translated High Confident Samples in

English/Chinese

Method: Para-CoTraining

Page 34: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Cross-Lingual Mixture Model for Sentiment Classification (ACL 2012)

Page 35: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Existing Work

• (Wan 2009) uses machine translated text as training data

• (Prettenhofer 2010) projects both languages into the same space with bilingual words mapping and learn classifiers on this space

• (Lu 2011) improves bilingual sentiment classification by enforcing label consistency on parallel corpus

Page 36: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Challenges

• Problems and challenges

– Machine translation engine

• Feature/vocabulary coverage

• Polysemy

– Language projection with bilingual words mapping

• Polysemy

– Label consistency on parallel corpus

• Label consistency is only determined by sentence alignment probability

Page 37: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Answers to Challenges

• Improve feature/vocabulary coverage with parallel corpus

• Use word alignment in parallel corpus to choose mapping for polysemy

• Determine label consistency in parallel corpus with word alignment

Page 38: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Cross-Language Mixture Model

• Word level assumption in parallel sentences – The aligned words between English and Chinese have the

same functions for determining the sentiment polarity for sentences

• Generative model – The process of sentence generation:

• Select a polarity label wrt. prior distribution • Select words

– Generate a Chinese word according to the polarity label – Generate a Chinese word by projecting an English word with the same

polarity

– Train the parameters by maximizing the likelihood of the large unlabeled parallel corpus and the labeled monolingual data (English with/without Chinese)

Page 39: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Cross-Language Mixture Model

c

weu wfu

wel

Le

Uef

c c

I

×

𝑃 𝑈𝑒𝑓 𝜃1, 𝜃2) 𝑃 𝐿𝑒𝑓 𝐶𝑒, 𝜃1, 𝜃2) × Labeled English data Unlabeled Parallel data

Language Confidence

Page 40: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Parameter Estimation 𝑎𝑟𝑔𝑚𝑎𝑥𝜃

Generating a Chinese word according to a polarity

Generating a Chinese word by projectingan English word with same polarity

Page 41: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Parameter Estimation

Projection probability estimated by word

alignment probability

𝑎𝑟𝑔𝑚𝑎𝑥𝜃

Probability of generating a word w for polarity c

estimated by EM

Page 42: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

EM

E-Step

M-Step

Page 43: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Experiments

Methods NTCIR-EN NTCIR-CH

MPQA-EN NTCIR-CH

MT-SVM 62.34 54.33

Para-SVM N/A N/A

MT-CoTrain 65.13 59.11

Para-CoTrain 67.21 60.71

CLMM 70.96 71.52

Classification Result using Only English Labeled Data

Note: • # of parallel sentences: 20,000 • MT engine: Microsoft translator • NTCIR-EN (English labeled corpus): 4,294 • NTCIT-CH (Chinese labeled corpus): 1,739 • MPQA-EN (English labeled corpus): 4,598

Page 44: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Experiments

Methods NTCIR-EN NTCIR-CH

MPQA-EN NTCIR-CH

SVM 80.58 80.58

MT-CoTrain 82.28 80.93

Para-CoTrain 82.35 82.18

CLMM 82.73 83.02

Classification Result using English and Chinese Labeled Data

Page 45: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Lost in Translations? Building Sentiment Lexicons Using Context Based Machine

Translation (COLING 2012)

Page 46: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Motivation

• Sentiment lexicons are very important for sentiment analysis

• Many lexicons (Ref. P14-P18) in English, but few or even unavailable for other languages

Page 47: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Task

• Use sentiment lexicons in English to automatically generate sentiment lexicons in other languages

Page 48: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Existing Work

• Straightforward translation

– Suffer from low sentiment word coverage in the bilingual dictionaries

– Two or more English sentiment words often are translated to the same foreign word

– Smaller translated sentiment lexicons than the original ones

• English -> Romanian (Mihalcea et al., 2007)

– 6,856 -> 4,983

Page 49: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Generate Sentiment Lexicon using Context-Aware Translation

• We put the English words into different contexts to effectively prompt the machine translation engine to query the large scale parallel corpora that it is trained on

• We can take advantage of the polysemy of words; one word can mean different things and it usually has various target language translations

Page 50: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Pipeline

Page 51: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Context Generation

Page 52: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations
Page 53: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Experimental Results

Page 54: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Experimental Results

Page 55: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Summary

• We present a very simple but effective approach to automatically generating sentiment lexicons for other languages with English sentiment lexicons

• The approach is language independent

Page 56: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Future Work

• Involve more languages

• Benchmark data set for multilingual sentiment lexicons and classifiers

Page 57: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Position at MSRA

• Researcher and RSDE

• Intern

• Visit professor (铸星计划)

• Ming Zhou ([email protected]); Furu Wei ([email protected])

Page 58: Multilingual Sentiment Analysisbcmi.sjtu.edu.cn/yccl2012/wp-content/uploads/2012/11/... · 2017-12-05 · Bo Pang, Lillian Lee. OpinionBing Mining and SentimentMining Analysis. Foundations

Thanks