sentiwordnet [iit-bombay]

Paper Presentation: SentiWordNet by Andrea Esuli and Fabrizio Sebastiani, presented by Sagar Ahire [133050073]

Upload: sagar-ahire

Posted on 15-Jan-2015


DESCRIPTION

A presentation describing Sentiwordnet - a dictionary of synsets annotated with their sentiment.

TRANSCRIPT

Page 1: Sentiwordnet [IIT-Bombay]

Paper Presentation
SentiWordNet by Andrea Esuli and Fabrizio Sebastiani

Sagar Ahire [133050073]

Page 2: Sentiwordnet [IIT-Bombay]

Roadmap
● Introduction to Sentiment Analysis
● Introduction to Sentiwordnet
● Building of Sentiwordnet
● Enhancements in 3.0

Page 3: Sentiwordnet [IIT-Bombay]

Roadmap: We Are Here
● Introduction to Sentiment Analysis (we are here)
● Introduction to Sentiwordnet
● Building of Sentiwordnet
● Enhancements in 3.0

Page 4: Sentiwordnet [IIT-Bombay]

Introduction to Sentiment Analysis
● The task of identifying the opinion expressed by a document.
● Can be carried out at various levels:
  ○ Word level
  ○ Sentence level
  ○ Document level
  ○ Aspect level, etc.

Page 5: Sentiwordnet [IIT-Bombay]

Tasks in Sentiment Analysis
● Determining Text SO-Polarity
  ○ Subjective vs. Objective
● Determining Text PN-Polarity
  ○ Positive vs. Negative
● Determining Strength of Text PN-Polarity
  ○ Weakly Positive vs. Strongly Positive
  ○ Weakly Negative vs. Strongly Negative
  ○ Star Rating

Page 8: Sentiwordnet [IIT-Bombay]

Roadmap: We Are Here
● Introduction to Sentiment Analysis
● Introduction to Sentiwordnet (we are here)
● Building of Sentiwordnet
● Enhancements in 3.0

Page 9: Sentiwordnet [IIT-Bombay]

Introduction to Sentiwordnet
● Sentiwordnet is a sentiment lexicon that associates sentiment information with each Wordnet synset.
● Sentiwordnet = Wordnet + Sentiment Information

Page 10: Sentiwordnet [IIT-Bombay]

Sentiment Information
For each Wordnet synset s, the following information is available in Sentiwordnet:
● Positive Score Pos(s)
● Negative Score Neg(s)
● Objective Score Obj(s)

Pos(s) + Neg(s) + Obj(s) = 1
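
The constraint above means only two of the three scores are independent. A minimal sketch of how one synset's scores could be held in code, assuming a hypothetical `SynsetScores` class (not part of Sentiwordnet itself) that stores Pos(s) and Neg(s) and derives Obj(s):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynsetScores:
    """Scores for one synset; Obj(s) is derived so the three always sum to 1."""

    pos: float
    neg: float

    def __post_init__(self):
        # Reject score pairs that would leave no valid objective score.
        if self.pos < 0 or self.neg < 0 or self.pos + self.neg > 1:
            raise ValueError("need pos >= 0, neg >= 0 and pos + neg <= 1")

    @property
    def obj(self) -> float:
        return 1.0 - self.pos - self.neg


# Hypothetical values, chosen only for illustration.
example = SynsetScores(pos=0.625, neg=0.125)
print(example.pos, example.neg, example.obj)
```

Deriving the objective score rather than storing it keeps the three values from drifting out of sync.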

Page 11: Sentiwordnet [IIT-Bombay]

Roadmap: We Are Here
● Introduction to Sentiment Analysis
● Introduction to Sentiwordnet
● Building of Sentiwordnet (we are here)
● Enhancements in 3.0

Page 12: Sentiwordnet [IIT-Bombay]

Building Sentiwordnet
● Trained a set of 8 ternary (P vs. N vs. O) classifiers, differing in
  ○ Training Set
  ○ Learning Algorithm
● Scored each synset based on the number of classifiers that assign it each label:
  ○ P score = No. of classifiers stating Positive / 8
  ○ N score = No. of classifiers stating Negative / 8
  ○ O score = No. of classifiers stating Objective / 8
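
The vote-counting rule above can be sketched as follows. The function name `committee_scores` and the vote list are assumptions made for illustration; the only thing taken from the slide is that each score is the fraction of the 8 classifiers choosing that label.

```python
from collections import Counter

LABELS = ("P", "N", "O")


def committee_scores(votes):
    """Convert one label per classifier into P/N/O scores for a synset."""
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    # Each score is the share of classifiers that chose the label.
    return {label: counts[label] / len(votes) for label in LABELS}


# Hypothetical outputs of the 8 classifiers for a single synset.
votes = ["P", "P", "O", "P", "O", "O", "P", "P"]
scores = committee_scores(votes)
print(scores)
```

Because every classifier casts exactly one vote, the three scores sum to 1 by construction.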

Page 13: Sentiwordnet [IIT-Bombay]

Classifiers: Training Sets
● Used a semi-supervised approach starting with a seed set of paradigmatic synsets (such as nice, nasty, etc.)
● Performed ‘k’ iterations of expansion using Wordnet lexical relations
  ○ Direct antonymy
  ○ Similarity
  ○ Derived from
  ○ Pertains to
  ○ Attribute
  ○ Also see
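
A minimal sketch of the expansion step, under stated assumptions: plain words stand in for synsets, the relation triples are a hand-made toy graph rather than real Wordnet data, direct antonymy moves a term to the opposite set while every other relation keeps it in the same set, and conflicts (a term landing in both sets) are not resolved. The name `expand_seeds` is hypothetical.

```python
def expand_seeds(positive, negative, relations, k):
    """Grow the positive and negative seed sets for k iterations.

    relations is a list of (source, relation, target) triples.
    """
    pos, neg = set(positive), set(negative)
    for _ in range(k):
        new_pos, new_neg = set(pos), set(neg)
        for source, relation, target in relations:
            flips = relation == "antonym"  # antonyms get the opposite label
            if source in pos:
                (new_neg if flips else new_pos).add(target)
            if source in neg:
                (new_pos if flips else new_neg).add(target)
        pos, neg = new_pos, new_neg
    return pos, neg


# Toy relation graph, invented for illustration only.
relations = [
    ("nice", "similar_to", "pleasant"),
    ("nice", "antonym", "nasty"),
    ("pleasant", "antonym", "unpleasant"),
    ("nasty", "similar_to", "awful"),
    ("unpleasant", "also_see", "offensive"),
]

pos, neg = expand_seeds({"nice"}, {"nasty"}, relations, k=2)
print(sorted(pos), sorted(neg))
```

With `k=0` the seed sets are returned unchanged, and each further iteration reaches terms one relation hop further from the seeds.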

Page 14: Sentiwordnet [IIT-Bombay]

Classifiers: Training Sets
● Obtained 4 training sets for the following values of ‘k’:
  ○ 0
  ○ 2
  ○ 4
  ○ 6

Page 15: Sentiwordnet [IIT-Bombay]

Classifiers: Learning Algorithms
● The learning algorithms used were:
  ○ SVM
  ○ Rocchio
● Thus all combinations of 4 training sets and 2 learners yield 8 classifiers
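
The count of 8 is the Cartesian product of the two choices. A short sketch that enumerates the committee, where the `(k, learner)` tuple layout is an assumption made for illustration:

```python
from itertools import product

K_VALUES = (0, 2, 4, 6)        # expansion iterations behind each training set
LEARNERS = ("SVM", "Rocchio")  # learning algorithms named on the slide

# One (k, learner) pair per classifier in the committee.
committee = list(product(K_VALUES, LEARNERS))
for k, learner in committee:
    print(f"k={k}, learner={learner}")
```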

Page 16: Sentiwordnet [IIT-Bombay]

Classifiers: Assigning Categories
● Each ternary classifier is a combination of 2 binary classifiers:
  ○ Positive (P) vs. Not Positive (NP)
  ○ Negative (N) vs. Not Negative (NN)
● Categories are assigned as:

        P           NP
  N     Objective   Negative
  NN    Positive    Objective
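
The decision table above can be written as a small function. This is a sketch: the name `ternary_label` and the boolean inputs are assumptions, standing in for the outputs of the two binary classifiers.

```python
def ternary_label(is_positive: bool, is_negative: bool) -> str:
    """Combine the two binary decisions into one ternary category."""
    if is_positive and not is_negative:
        return "Positive"
    if is_negative and not is_positive:
        return "Negative"
    # Both classifiers fire, or neither does: treat the synset as Objective.
    return "Objective"


for p in (True, False):
    for n in (True, False):
        print(p, n, ternary_label(p, n))
```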

Page 17: Sentiwordnet [IIT-Bombay]

Classifiers: Observations
● Effect of ‘k’:
  ○ Low ‘k’ -> Low Recall, High Precision
  ○ High ‘k’ -> High Recall, Low Precision
● Effect of learning algorithm:
  ○ SVM -> Favours the set with higher cardinality
  ○ Rocchio -> Equal prior probabilities

Page 18: Sentiwordnet [IIT-Bombay]

Statistical Results: Average Scores

Part of Speech   Positive   Negative   Objective
Adjectives       0.106      0.151      0.743
Nouns            0.022      0.034      0.944
Verbs            0.026      0.034      0.940
Adverbs          0.235      0.067      0.698
All              0.043      0.054      0.903

Page 19: Sentiwordnet [IIT-Bombay]

Roadmap: We Are Here
● Introduction to Sentiment Analysis
● Introduction to Sentiwordnet
● Building of Sentiwordnet
● Enhancements in 3.0 (we are here)

Page 20: Sentiwordnet [IIT-Bombay]

Random Walk
● Views Wordnet as a graph and performs a random walk on it
● Updates P, N and O values till the process converges
● Edge from s1 to s2 if s1 occurs in the gloss of s2

Page 21: Sentiwordnet [IIT-Bombay]

Random Walk
● Two random walks are performed:
  ○ P Score
  ○ N Score
● O Score is assigned so that P + N + O = 1
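
The two walks and the final O assignment can be illustrated with a simplified, PageRank-style propagation. This is a sketch under stated assumptions, not the procedure from the SentiWordNet 3.0 paper: the damping factor, the convergence tolerance, the toy gloss graph, the rescaling when P + N exceeds 1, and the names `random_walk` and `combine` are all invented for illustration.

```python
def random_walk(edges, initial, damping=0.85, tol=1e-10, max_iter=1000):
    """Propagate scores along (s1, s2) edges until they stop changing.

    An edge (s1, s2) means s1 occurs in the gloss of s2, so score flows
    from s1 to s2. Mass reaching a node with no out-edges is dropped.
    """
    total = sum(initial.values())
    if total <= 0:
        raise ValueError("initial scores must have a positive sum")
    nodes = sorted(set(initial) | {n for edge in edges for n in edge})
    out_degree = {n: 0 for n in nodes}
    for source, _ in edges:
        out_degree[source] += 1
    prior = {n: initial.get(n, 0.0) / total for n in nodes}
    scores = dict(prior)
    for _ in range(max_iter):
        updated = {n: (1 - damping) * prior[n] for n in nodes}
        for source, target in edges:
            updated[target] += damping * scores[source] / out_degree[source]
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


def combine(pos_scores, neg_scores):
    """Assign O so that P + N + O = 1 for every node."""
    result = {}
    for node in set(pos_scores) | set(neg_scores):
        p = pos_scores.get(node, 0.0)
        n = neg_scores.get(node, 0.0)
        if p + n > 1.0:  # assumption: rescale so O cannot go negative
            p, n = p / (p + n), n / (p + n)
        result[node] = {"P": p, "N": n, "O": max(0.0, 1.0 - p - n)}
    return result


# Toy gloss graph: "good" appears in the glosses of "nice" and "fine".
edges = [("good", "nice"), ("good", "fine"), ("bad", "nasty")]
scores = combine(
    random_walk(edges, {"good": 1.0}),  # walk for the P score
    random_walk(edges, {"bad": 1.0}),   # walk for the N score
)
for node in sorted(scores):
    print(node, scores[node])
```

In this toy graph, "nice" ends up with a positive score and no negative score because only "good" occurs in its gloss.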

Page 22: Sentiwordnet [IIT-Bombay]

Website
Sentiwordnet is available at: http://sentiwordnet.isti.cnr.it

Page 23: Sentiwordnet [IIT-Bombay]

Major References
● SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining by Andrea Esuli and Fabrizio Sebastiani, 2006
● SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining by Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani, 2010

Page 24: Sentiwordnet [IIT-Bombay]

Other References
● Sentiment Analysis and Opinion Mining by Bing Liu, 2012

Page 25: Sentiwordnet [IIT-Bombay]

Further Plan
● Wordnet-Affect (2004) by Carlo Strapparava and Alessandro Valitutti, in proceedings of the 4th International Conference of Language Resources and Evaluation (LREC), Lisbon - IN PROGRESS
● Lexicon-based Methods in Sentiment Analysis (2011) by Maite Taboada, Julian Brooke, Milan Tofiloski, Kimberly Voll, and Manfred Stede, in the Journal of Computational Linguistics