Sentiment Detection
Rik Sarkar (03305048)
Kedar Godbole (03305805)
Outline
Sentiment detection: the problem statement
Difficulties in sentiment detection
Approaches to sentiment detection
Conclusion
Project proposal
Problem Statement
Detect the polarity of the sentiment expressed about a particular topic in a document
Polarity:
- Positive
- Negative
- Mixed
- Neutral
Motivation
Reviews on the Web
Opinions about a product
Opinions about the individual aspects of a product
Movie/book reviews
Feedback/evaluation forms
Issues
Reference to multiple objects in the same document
- The NR70 is trendy. T-Series is fast becoming obsolete.
Dependence on the context of the document
- “Unpredictable” plot; “Unpredictable” performance
Negations have to be captured
- Monochrome display is not what the user wants
Issues (contd.)
Metaphors/Similes
- The metallic body is solid as a rock
Part-of and Attribute-of relationships
- The small keypad is inconvenient
Absence of a polar word
- How can someone sit through this seminar?
Approaches to Sentiment Detection
Based on pre-selected sets of words
Naive Bayes
Support Vector Machines
Unsupervised learning
Enhancement by NLP
An Unsupervised Learning Technique
Extract two-word phrases from the review based on patterns of POS tags
JJ: Adjective; RB: Adverb; NN: Noun
First word    Second word
JJ            NN
RB            JJ
JJ            JJ
NN            JJ
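The pattern table above can be sketched as a scan over adjacent tagged words. This is a minimal illustration, assuming the review has already been POS-tagged and that tags are reduced to the simplified set used on the slide; the function name `extract_phrases` and the sample sentence are hypothetical.

```python
from typing import List, Tuple

# Two-word tag patterns from the table: (first word tag, second word tag).
PATTERNS = {("JJ", "NN"), ("RB", "JJ"), ("JJ", "JJ"), ("NN", "JJ")}


def extract_phrases(tagged: List[Tuple[str, str]]) -> List[str]:
    """Return every adjacent word pair whose POS tags match one of PATTERNS."""
    phrases = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if (t1, t2) in PATTERNS:
            phrases.append(f"{w1} {w2}")
    return phrases


# Hand-tagged input (hypothetical); a real system would run a POS tagger first.
tagged_review = [
    ("the", "DT"), ("bank", "NN"), ("has", "VB"),
    ("low", "JJ"), ("fees", "NN"), ("and", "CC"),
    ("very", "RB"), ("friendly", "JJ"), ("staff", "NN"),
]

print(extract_phrases(tagged_review))
# ['low fees', 'very friendly', 'friendly staff']
```

Overlapping matches are kept here ("very friendly" and "friendly staff"); whether to keep or merge them is a design choice the slide does not specify.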
Unsupervised Learning
Pointwise Mutual Information (PMI) and Semantic Orientation (SO)
$$\mathrm{PMI}(word_1, word_2) = \log \frac{p(word_1 \,\&\, word_2)}{p(word_1)\, p(word_2)}$$
$$\mathrm{SO}(phrase) = \mathrm{PMI}(phrase, \text{“excellent”}) - \mathrm{PMI}(phrase, \text{“poor”})$$
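As a rough illustration of the two definitions, PMI can be estimated from co-occurrence counts in a corpus. The sketch below assumes probabilities are estimated as simple relative frequencies and that the logarithm is base 2 (the slide does not state the base); the function names and all counts are hypothetical.

```python
import math


def pmi(count_both: int, count_w1: int, count_w2: int, total: int) -> float:
    """PMI(word1, word2) = log2( p(word1 & word2) / (p(word1) * p(word2)) )."""
    p_both = count_both / total
    p_w1 = count_w1 / total
    p_w2 = count_w2 / total
    return math.log2(p_both / (p_w1 * p_w2))


def semantic_orientation(pmi_excellent: float, pmi_poor: float) -> float:
    """SO(phrase) = PMI(phrase, "excellent") - PMI(phrase, "poor")."""
    return pmi_excellent - pmi_poor


# Hypothetical counts: the phrase co-occurs with "excellent" four times more
# often than independence would predict, and with "poor" exactly as often.
pmi_exc = pmi(count_both=40, count_w1=100, count_w2=100, total=1000)
pmi_poor = pmi(count_both=10, count_w1=100, count_w2=100, total=1000)
print(semantic_orientation(pmi_exc, pmi_poor))
```

A PMI of zero means the two words co-occur exactly as often as chance predicts, so a positive SO indicates a stronger association with "excellent" than with "poor".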
Unsupervised Learning
Determine the Semantic Orientation (SO) of the phrases
Search on AltaVista
$$\mathrm{SO}(phrase) = \log \frac{hits(phrase\ \mathrm{NEAR}\ \text{“excellent”})\; hits(\text{“poor”})}{hits(phrase\ \mathrm{NEAR}\ \text{“poor”})\; hits(\text{“excellent”})}$$
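The hit-count form of SO can be computed directly once the four search-engine counts are known. The sketch below makes several assumptions: the hit counts are made-up numbers rather than real AltaVista results, the logarithm is base 2, and a small smoothing constant is added to the NEAR counts so that a phrase with zero co-occurrences does not cause a division by zero. The function name `so_from_hits` is hypothetical.

```python
import math


def so_from_hits(near_excellent: float, near_poor: float,
                 hits_excellent: float, hits_poor: float,
                 smoothing: float = 0.01) -> float:
    """SO(phrase) from search hit counts.

    near_excellent: hits(phrase NEAR "excellent")
    near_poor:      hits(phrase NEAR "poor")
    hits_excellent: hits("excellent")
    hits_poor:      hits("poor")
    smoothing:      assumed constant added to the NEAR counts to avoid log(0)
    """
    numerator = (near_excellent + smoothing) * hits_poor
    denominator = (near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)


# Hypothetical counts for one phrase.
print(so_from_hits(near_excellent=200, near_poor=50,
                   hits_excellent=1000, hits_poor=1000))
```

The phrase's own hit count cancels out when the two PMI terms are subtracted, which is why only four counts are needed.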
Unsupervised Learning
Calculate average semantic orientation of document:
Extracted phrase          POS tags   Semantic Orientation
Low fees                  JJ NN       0.333
Online service            JJ NN       2.780
Inconveniently located    RB VB      -1.541
Average Semantic Orientation = 0.524
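The average in the table can be reproduced with a few lines of arithmetic. The sketch below uses the three SO values from the table; labelling the document by the sign of the average is an assumption made here for illustration, since the slide only reports the average.

```python
# (phrase, semantic orientation) pairs taken from the table above.
phrase_so = [
    ("low fees", 0.333),
    ("online service", 2.780),
    ("inconveniently located", -1.541),
]

# Average SO over all extracted phrases: (0.333 + 2.780 - 1.541) / 3.
average_so = sum(so for _, so in phrase_so) / len(phrase_so)

# Assumed decision rule: a positive average means a positive document.
label = "positive" if average_so > 0 else "negative"

print(f"{average_so:.3f} {label}")  # 0.524 positive
```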
Need for NLP
Identifying phrases is not enough – need to identify subject/object
- The NR70 is trendy. T-Series is fast becoming obsolete.
Need to identify part-of and attribute-of relationship
- The battery is long-lasting
Focus of the sentiment
Feature/attribute terms:
- BNP (Base Noun Phrases): battery, display, keypad
- dBNP (Definite Base Noun Phrases): “the display”
- bBNP (Beginning Definite Base Noun Phrases): “The battery is long-lasting”
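One way to picture the bBNP heuristic is as a check on the start of a tagged sentence. The sketch below is a simplification under stated assumptions: a bBNP is taken to be "the" followed by a run of adjectives and nouns ending in a noun, with a verb immediately after it; tags are the simplified set JJ/NN/VB, and the function name `beginning_definite_bnp` is hypothetical.

```python
from typing import List, Optional, Tuple

Tagged = List[Tuple[str, str]]


def beginning_definite_bnp(sentence: Tagged) -> Optional[str]:
    """Return the noun phrase if the sentence starts with 'the' + NP + verb."""
    if not sentence or sentence[0][0].lower() != "the":
        return None
    # Consume the run of adjectives and nouns that follows "the".
    i = 1
    while i < len(sentence) and sentence[i][1] in ("JJ", "NN"):
        i += 1
    # The run must be non-empty and must end in a noun.
    if i == 1 or sentence[i - 1][1] != "NN":
        return None
    # A verb must follow the noun phrase.
    if i >= len(sentence) or sentence[i][1] != "VB":
        return None
    return " ".join(word for word, _ in sentence[1:i])


example = [("The", "DT"), ("battery", "NN"), ("is", "VB"), ("long-lasting", "JJ")]
print(beginning_definite_bnp(example))  # battery
```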
Sentiment Analyzer
Sentiment lexicon database
- <lexical_entry> <POS> <sent_category>
- “excellent” JJ +
Sentiment pattern database
- <predicate> <sent_category> <target>
- “I am impressed with the flash capabilities”
- impress + PP(by;with) target
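The two databases can be pictured as simple lookup tables. The sketch below is one possible in-memory layout, not the format used by the original system: the field names follow the slide, the "excellent" and "impress" entries come from the slide, and the "poor" entry, the class name `SentimentPattern`, and the helper `target_prepositions` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Sentiment lexicon: (lexical_entry, POS) -> sent_category ("+" or "-").
LEXICON: Dict[Tuple[str, str], str] = {
    ("excellent", "JJ"): "+",   # from the slide
    ("poor", "JJ"): "-",        # hypothetical extra entry
}


@dataclass(frozen=True)
class SentimentPattern:
    predicate: str      # verb lemma, e.g. "impress"
    sent_category: str  # "+" or "-"
    target: str         # where the target is found, e.g. "PP(by;with)"


# Sentiment pattern database, keyed by predicate.
PATTERN_DB: Dict[str, SentimentPattern] = {
    "impress": SentimentPattern("impress", "+", "PP(by;with)"),
}


def target_prepositions(pattern: SentimentPattern) -> List[str]:
    """Parse a target spec such as 'PP(by;with)' into ['by', 'with']."""
    spec = pattern.target
    inside = spec[spec.index("(") + 1 : spec.rindex(")")]
    return inside.split(";")


print(target_prepositions(PATTERN_DB["impress"]))  # ['by', 'with']
```

With this layout, "I am impressed with the flash capabilities" would match the "impress" pattern, and the target is the noun phrase inside the prepositional phrase headed by "with".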
SA (contd.)
Identify sentences containing feature terms
Ternary expressions (T-expressions)
- +ve/-ve sentiment verbs: <target, verb, “”>
- trans verbs: <target, verb, source>
Binary expressions (B-expressions)
- <adjective, target>
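A T-expression can be represented as a three-field record, with the sentiment resolved differently for the two verb types. The sketch below is an illustration under assumptions: sentiment verbs carry their own polarity, while trans verbs pass on the polarity of a sentiment word found in the source. The verb lists, the adjective list, and the function name are all hypothetical.

```python
from typing import NamedTuple, Optional


class TExpression(NamedTuple):
    target: str
    verb: str
    source: str  # empty string for sentiment verbs


# Hypothetical word lists for illustration only.
SENTIMENT_VERBS = {"love": "+", "hate": "-"}
TRANS_VERBS = {"take", "offer"}
SENTIMENT_ADJECTIVES = {"excellent": "+", "poor": "-"}


def t_expression_sentiment(expr: TExpression) -> Optional[str]:
    """Return '+', '-', or None for the target of a T-expression."""
    # Sentiment verb: the verb itself decides the polarity.
    if expr.verb in SENTIMENT_VERBS:
        return SENTIMENT_VERBS[expr.verb]
    # Trans verb: the polarity of the source is transferred to the target.
    if expr.verb in TRANS_VERBS:
        for word in expr.source.split():
            if word in SENTIMENT_ADJECTIVES:
                return SENTIMENT_ADJECTIVES[word]
    return None


# "This camera takes excellent pictures" as <target, verb, source>.
print(t_expression_sentiment(TExpression("camera", "take", "excellent pictures")))  # +
```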
SA (contd.)
Identify sentiment phrases within subject, object phrases
Associating sentiment with the target
- Based on sentiment patterns
“I was impressed by the flash capabilities”
“This camera takes excellent pictures”
- Based on B-expressions
“Poor performance in a dark room”
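The B-expression case can be sketched as a search for an adjective directly in front of a feature term. This is a minimal illustration, assuming POS-tagged input with simplified tags and a known set of feature terms; the lexicon entries and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Tagged = List[Tuple[str, str]]


def b_expressions(tagged: Tagged, feature_terms: Set[str]) -> List[Tuple[str, str]]:
    """Collect <adjective, target> pairs where the noun is a feature term."""
    pairs = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if t1 == "JJ" and t2 == "NN" and w2.lower() in feature_terms:
            pairs.append((w1.lower(), w2.lower()))
    return pairs


def assign_sentiment(pairs: List[Tuple[str, str]],
                     lexicon: Dict[str, str]) -> Dict[str, str]:
    """Map each target to the polarity of its adjective, if the lexicon has it."""
    return {target: lexicon[adj] for adj, target in pairs if adj in lexicon}


# "Poor performance in a dark room", hand-tagged.
sentence = [("Poor", "JJ"), ("performance", "NN"), ("in", "IN"),
            ("a", "DT"), ("dark", "JJ"), ("room", "NN")]
lexicon = {"poor": "-", "excellent": "+"}  # hypothetical entries

pairs = b_expressions(sentence, feature_terms={"performance"})
print(assign_sentiment(pairs, lexicon))  # {'performance': '-'}
```

Restricting matches to known feature terms is what keeps "dark room" from being treated as a sentiment about the product.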
Other issues
Position of the sentiment words
- Words at the beginning and end of a review
Sentiment about the characters in the movie versus sentiment about the actors in the movie (abstraction)
“He played the role of a very corrupt politician”
“He played the role brilliantly”
Conclusion
Sentiment detection can be used in areas ranging from marketing research to movie reviews.
Sentiment Detection is a “hard” problem due to context-sensitivity, complex sentences, etc.
Statistical methods should be augmented with NLP techniques.
References
Yi, Nasukawa, et al. Sentiment Analyzer: Extracting Sentiments about a Given Topic using NLP techniques. Proceedings of the Third IEEE International Conference on Data Mining, p. 427, Nov 19-22, 2003
Peter D. Turney. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. Proceedings of the 40th Annual Meeting of ACL, p. 417-424, 2002
Matthew Hurst and Kamal Nigam. Retrieving Topical Sentiments from Online Document Collections. Document Recognition and Retrieval XI, p. 27-34, 2004
References (contd.)
B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment classification using Machine Learning techniques. Proceedings of the 2002 ACL EMNLP Conference, p. 79-86, 2002
Project
Sentiment analyzer for a specific domain
Given set of features, initial list of polar words
Learns new polar words from documents analyzed