Adapting Sentiment Lexicons using Contextual Semantics for Sentiment Analysis of Twitter

Hassan Saif, Yulan He, Miriam Fernandez and Harith Alani

Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom

1st Workshop on Semantic Sentiment Analysis, Crete, Greece, 2014

Post on 26-Jan-2015

DESCRIPTION

Sentiment lexicons for sentiment analysis offer a simple, yet effective way to obtain the prior sentiment information of opinionated words in texts. However, words' sentiment orientations and strengths often change throughout various contexts in which the words appear. In this paper, we propose a lexicon adaptation approach that uses the contextual semantics of words to capture their contexts in tweet messages and update their prior sentiment orientations and/or strengths accordingly. We evaluate our approach on one state-of-the-art sentiment lexicon using three different Twitter datasets. Results show that the sentiment lexicons adapted by our approach outperform the original lexicon in accuracy and F-measure in two datasets, but give similar accuracy and slightly lower F-measure in one dataset.

TRANSCRIPT

Page 1: Adapting Sentiment Lexicons using Contextual Semantics for Sentiment Analysis of Twitter

Hassan Saif, Yulan He, Miriam Fernandez and Harith Alani

Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom

1st Workshop on Semantic Sentiment Analysis, Crete, Greece, 2014

Page 2: Outline

• Sentiment Analysis
• Sentiment Analysis Approaches
• Sentiment Lexicons on Twitter
• Sentiment Lexicon Adaptation Approach
• Evaluation
• Conclusion

Page 3: Sentiment Analysis

“Sentiment analysis is the task of identifying positive and negative opinions, emotions and evaluations in text”

Example tweets:

- yes, It is sunny, but also very humid :( (Opinion)
- The weather is great today :) (Opinion)
- I think its almost 30 degrees today (Fact)

Page 4: Conventional Text

o Rich
o Formal Language {Well-Structured Sentences}
o Domain Specific

Page 5: Twitter Data

o Short (140-Chars)
o Noisy {gr8, lol, :), :P}
o Open Environment

Page 6: Sentiment Analysis: The Lexicon-based Approach

[Diagram: a tweet is processed by a Text Processing Algorithm that uses a Sentiment Lexicon to output a sentiment label.]

Input tweet: I had nightmares all night long last night :(

Sentiment Lexicon (sample words): great, success, sad, pretty, down, wrong, horrible, beautiful, mistake, love, good

Output: Negative
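
The lexicon-based approach on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' system: the polarity values assigned to the sample words, the two emoticon entries, the tokenizer and the function names are all assumptions made for the example.

```python
import re

# Polarities for the sample words shown on the slide. The +1/-1 values and the
# two emoticon entries are illustrative assumptions, not real lexicon scores.
SAMPLE_LEXICON = {
    "great": 1, "success": 1, "pretty": 1, "beautiful": 1, "love": 1, "good": 1,
    "sad": -1, "down": -1, "wrong": -1, "horrible": -1, "mistake": -1,
    ":)": 1, ":(": -1,
}

# Very small tokenizer: two-character emoticons or runs of letters/digits.
TOKEN_PATTERN = re.compile(r"[:;][()]|[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into words and simple emoticons."""
    return TOKEN_PATTERN.findall(text.lower())


def classify(text, lexicon=SAMPLE_LEXICON):
    """Sum the prior scores of known tokens and map the total to a label."""
    score = sum(lexicon.get(token, 0) for token in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(classify("I had nightmares all night long last night :("))
```

Because every word carries one fixed score, the result depends only on which lexicon entries happen to appear in the tweet.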

Page 7: Sentiment Lexicons

- Lists of Opinionated:
  - Words and Phrases (MPQA, SentiWordNet, etc.)
  - Common Sense Concepts (SenticNet)
- Built:
  - Manually
  - Dictionary-based Approach
  - Corpus-based Approach
- Applied to Conventional Text
  - Movie Reviews, News, Blogs, Open Forums, etc.

Page 8: Sentiment Lexicons on Twitter

Twitter Data
- Language Variations
- New Words
- Noisy Nature - lol, gr8, :), :P

Traditional Lexicons
- Not tailored to Twitter noisy data
- Fixed number of words

Page 9: Twitter-specific Sentiment Lexicons

- Such as: Thelwall-Lexicon
- Built specifically to work on social data
  - Contain lists of emoticons, slang, abbreviations, etc.
- Coupled with a rule-based method, SentiStrength
  - Applies a text pre-processing routine to tweets

Page 10: Twitter-specific Sentiment Lexicons ... and Traditional Lexicons

Offer Context-Insensitive Prior Sentiment Orientations and Strengths of words

[Diagram: "Great", shown with the context words "Problem" and "Smile", is looked up in the Sentiment Lexicon and labeled Positive.]

Sentiment Lexicon (sample words): great, success, sad, pretty, down, wrong, horrible, beautiful, mistake, love, good

Page 11: Lexicon Adaptation Approaches

Supervised
- Require Training from Labeled Corpora

Unsupervised
- Use General Textual Corpora (e.g., the Web) or Static Lexical Knowledge Sources (e.g., WordNet)

Page 12: Contextual Semantic Adaptation Approach

Unsupervised Approach

Captures the Contextual Semantics of words

To assign Contextual Sentiment

Page 13: Contextual Semantics of Words

“Words that occur in similar context tend to have similar meaning” (Wittgenstein, 1953)

[Diagram: "Great" shown with context words: Problem, Look, Smile, Concert, Song, Weather, Loss, Game, Taylor Swift. "Amazing" is shown alongside "Great".]

Page 14: Capturing Contextual Semantics

(1) Context-Term Vector: Term (m) with its context terms C1, C2, …, Cn (e.g., "Great" with "Smile", "Look")

(2) For each context term: Degree of Correlation with m, and Prior Sentiment from the Sentiment Lexicon

(3) SentiCircles Model, which gives:
- Contextual Sentiment Strength: [-1 (very negative), +1 (very positive)]
- Contextual Sentiment Orientation: Positive, Negative, Neutral
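
As a rough illustration of the context-term vector, the sketch below counts which words appear in the same tweet as a target term. Raw co-occurrence counts stand in for the degree of correlation here, which is an assumption: the slides name a TDOC measure but do not define it. The function name and the example tweets are hypothetical.

```python
from collections import Counter


def context_term_vector(term, tweets):
    """Count, per context word, the number of tweets it shares with `term`.

    Whitespace tokenization and raw counts are simplifying assumptions;
    they are a stand-in for the correlation measure used in the slides.
    """
    counts = Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()
        if term in tokens:
            # Use a set so each context word counts once per tweet.
            counts.update(t for t in set(tokens) if t != term)
    return counts


if __name__ == "__main__":
    # Made-up tweets, for illustration only.
    tweets = ["great smile", "great look today", "great smile again", "bad day"]
    print(context_term_vector("great", tweets).most_common())
```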

Page 15: Capturing Contextual Semantics

SentiCircles Model: each context term Ci of a term m (e.g., "Smile" for "Great") is placed in the SentiCircle of m as a point (xi, yi):

ri = TDOC(Ci)
θi = Prior_Sentiment(Ci) * π
xi = ri * cos(θi)
yi = ri * sin(θi)

[Figure: SentiCircle of "Great" with the context term "Smile" at (xi, yi), radius ri and angle θi. Both axes run from -1 to +1. Regions are labeled Very Positive, Positive, Negative, Very Negative and Neutral Region.]
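
The polar-to-Cartesian mapping above can be written directly in Python. This sketch assumes the prior sentiment has already been scaled to [-1, +1] and that the TDOC value is supplied by the caller; the function name is hypothetical.

```python
import math


def senticircle_point(tdoc, prior_sentiment):
    """Map one context term to (x, y) using r = TDOC and theta = prior * pi.

    Assumes `prior_sentiment` is in [-1, +1]; `tdoc` is taken as given
    because the slides do not define how it is computed.
    """
    theta = prior_sentiment * math.pi
    return tdoc * math.cos(theta), tdoc * math.sin(theta)


if __name__ == "__main__":
    # A context term with a positive prior lands in the upper half (y > 0),
    # one with a negative prior lands in the lower half (y < 0).
    print(senticircle_point(1.0, 0.5))
    print(senticircle_point(0.5, -0.5))
```

With this mapping the sign of y follows the sign of the prior sentiment, and the radius controls how far the point sits from the origin.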

Page 16: SentiCircles (Example)

Page 17: Overall Contextual Sentiment

Senti-Median of SentiCircle

Sentiment Function

[Figure: SentiCircle of term m with a context term Ci at (xi, yi), radius ri and angle θi. Both axes run from -1 to +1. Regions are labeled Very Positive, Positive, Negative, Very Negative and Neutral Region.]
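
The slide names a Senti-Median and a sentiment function but does not spell them out. The sketch below assumes the Senti-Median is the geometric median of the SentiCircle points, approximated here with Weiszfeld's iteration, and that the label is read from the median's y coordinate. The `neutral_band` threshold and both function names are hypothetical.

```python
import math


def senti_median(points, iterations=200, tol=1e-9):
    """Approximate the geometric median of 2-D points (Weiszfeld's iteration)."""
    if not points:
        raise ValueError("at least one point is required")
    # Start from the centroid.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < tol:
                continue  # simplification: skip points on the current estimate
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        if denom == 0.0:
            break  # every point coincides with the estimate
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break
    return x, y


def sentiment_label(median, neutral_band=0.05):
    """Assumed sentiment function: the sign of y, with a small neutral band."""
    y = median[1]
    if abs(y) <= neutral_band:
        return "neutral"
    return "positive" if y > 0 else "negative"
```

A median is less sensitive to a few outlying context terms than a mean, which is one plausible reason to summarize the circle this way.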

Page 18: Lexicon Adaptation Method

• A set of Antecedent-Consequent Rules
• Decides on the new sentiment of a term based on:
  – How Weak/Strong its Prior Sentiment
  – How Weak/Strong its Contextual Sentiment
    • Based on the Position of the term’s SentiMedian

Page 19: Thelwall-Lexicon Case Study

Sample entries: vex* -3, witch -1, inspir* 3, fiery* -2, trite* -3, cunt* -4, intelligent* 2, joll* 3, suffers -4, loved 4, insidious* -3, despis* -4, hehe* 2

[Chart: number of terms per class. Positive: 398, Negative: 1919, Neutral: 229]

• Consists of 2546 terms
• Coupled with prior sentiment strength between |1| and |5|
  – [-2, -5] negative term
  – [2, 5] positive term
  – [-1, 1] neutral term
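
The three strength ranges above translate into a small lookup. A minimal sketch, with a hypothetical function name; the sample entries are the ones shown on the slide.

```python
def prior_orientation(strength):
    """Map a Thelwall-Lexicon strength to an orientation using the slide's ranges."""
    if not -5 <= strength <= 5:
        raise ValueError("strength must be between -5 and 5")
    if strength <= -2:
        return "negative"
    if strength >= 2:
        return "positive"
    return "neutral"  # covers the [-1, 1] range


if __name__ == "__main__":
    samples = {"vex*": -3, "witch": -1, "inspir*": 3, "loved": 4}
    for term, strength in samples.items():
        print(term, strength, prior_orientation(strength))
```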

Page 20: Adaptation Rules on Thelwall-Lexicon

Rule 10 (example term: Revolution)

- Prior Sentiment < -3 (weak negative)
- Contextual Sentiment = Neutral
- Change to Neutral
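
An antecedent-consequent rule of this kind can be sketched as a simple check. The slide does not make the weak-negative threshold fully clear, so the range is left as a caller-supplied parameter; the range and the prior strength used in the usage example are hypothetical, as are the function names.

```python
def rule_10_applies(prior_strength, contextual_orientation, weak_negative_range):
    """Antecedent: weak negative prior AND neutral contextual sentiment."""
    low, high = weak_negative_range
    return low <= prior_strength <= high and contextual_orientation == "neutral"


def adapt_orientation(prior_orientation, prior_strength,
                      contextual_orientation, weak_negative_range):
    """Consequent: change the term to neutral when the rule fires."""
    if rule_10_applies(prior_strength, contextual_orientation, weak_negative_range):
        return "neutral"
    return prior_orientation  # other rules are not modeled in this sketch


if __name__ == "__main__":
    # Hypothetical values: a prior strength of -2 and a weak range of [-3, -2].
    print(adapt_orientation("negative", -2, "neutral", (-3, -2)))
```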

Page 21: Experiments

• Sentiment Lexicon
  – Thelwall-Lexicon
• Settings:
  – Update Setting
  – Expand Setting
  – Update + Expand Setting
• Datasets
• Binary Sentiment Classification
  – SentiStrength
    • Lexicon-based Method
    • Works on Thelwall-Lexicon
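
The evaluation reports accuracy and F-measure for binary classification. The sketch below shows how these metrics are commonly computed for one target class; it is a generic illustration, not the authors' evaluation script, and the function name and toy labels are assumptions.

```python
def binary_metrics(gold, predicted, positive_label="positive"):
    """Accuracy, precision, recall and F1 for one class of a binary task."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = fn = tn = 0
    for g, p in zip(gold, predicted):
        if p == positive_label and g == positive_label:
            tp += 1
        elif p == positive_label:
            fp += 1
        elif g == positive_label:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels, for illustration only.
    gold = ["positive", "positive", "negative", "negative"]
    predicted = ["positive", "negative", "negative", "positive"]
    print(binary_metrics(gold, predicted))
```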

Page 22: Results

Adaptation Impact on Thelwall-Lexicon

Page 23: Results

Cross comparison results of the original and the adapted lexicons

Page 24: Adapted Lexicons on HCR

[Chart: Performance, Positive Sentiment Detection. Precision, Recall and F1 for the Original, Updated and Updated+Expanded lexicons.]

[Chart: Sentiment Class Distribution. Positive to Negative Ratio for OMD, HCR and STS-Gold.]

[Chart: Impact on Thelwall-Lexicon. New Words Added To Thelwall-Lexicon for OMD, HCR and STS-Gold.]

Page 25: Conclusion

• We proposed an unsupervised approach for sentiment lexicon adaptation from Twitter data.

• It updates the words’ prior sentiment orientations and/or strengths based on their contextual semantics in tweets.

• The evaluation was done on Thelwall-Lexicon using three Twitter datasets.

• Results showed that lexicons adapted by our approach improved the sentiment classification performance in both accuracy and F1 in two out of three datasets.

Page 26: Thank You

Email: [email protected]
Twitter: hrsaif
Website: tweenator.com