A Review of Sentiment Analysis Approaches in Big Data Era. Nurfadhlina Mohd Sharef, Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Selangor, Malaysia

Post on 18-Dec-2014

DESCRIPTION

The big data phenomenon has confirmed the achievement of data access transformation. Sentiment analysis (SA) is one of the most exploited areas and is used for profit-making purposes through business intelligence applications. This paper reviews the trends in SA and relates the growth of the area to the big data era.

TRANSCRIPT

Page 1: A review of sentiment analysis approaches in big

A Review of Sentiment Analysis Approaches in Big Data Era

Nurfadhlina Mohd Sharef

Department of Computer Science

Faculty of Computer Science and Information Technology, Universiti Putra Malaysia

Serdang, Selangor, Malaysia

Sentiment Analysis

Sentiment analysis analyzes people’s sentiments, opinions, appraisals, attitudes, evaluations, and emotions towards entities such as organizations, products, services, individuals, topics, issues, events, and their attributes, as presented online via text, video and other means of communication.

Sentiment Analysis

These communications can fall into three broad categories: positive, neutral or negative.

There are also many names and slightly different tasks, e.g., sentiment analysis, opinion mining, opinion extraction, sentiment mining, subjectivity analysis, customer complaint, affect analysis, emotion analysis, review mining, review analysis, etc.

Tools | Description

The Hadoop Distributed File System (HDFS) | HDFS divides the data into smaller parts and distributes it across the various servers/nodes.

SQL Server Integration Services, Apache Flume | These tools allow posts to be downloaded and loaded into Hadoop.

MapReduce | MapReduce is a process that transforms data loaded into Hadoop into a format that can be used for analysis.

Hive | A runtime Hadoop support architecture that leverages Structured Query Language (SQL) with the Hadoop platform.

Jaql | Jaql converts high-level queries into low-level queries and ...

ZooKeeper | ZooKeeper coordinates parallel processing across big clusters.

HBase | HBase is a column-oriented database management system that sits on top of HDFS by using a non-SQL approach.
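The MapReduce row above describes transforming loaded posts into a form ready for analysis. As a rough illustration only, the following single-machine Python sketch imitates the map, shuffle and reduce steps to count posts per polarity. It does not use Hadoop; the word sets and the function names (`map_post`, `shuffle`, `reduce_counts`, `run_job`) are hypothetical and not taken from the slides.

```python
from collections import defaultdict

# Hypothetical toy word lists (an assumption for illustration);
# a real pipeline would rely on a proper sentiment lexicon.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def map_post(post):
    """Map step: emit one (polarity_label, 1) pair for a post."""
    words = post.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    yield label, 1


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def run_job(posts):
    """Run map, shuffle and reduce in sequence on a list of posts."""
    pairs = (pair for post in posts for pair in map_post(post))
    return reduce_counts(shuffle(pairs))


posts = [
    "I love this phone",
    "bad battery and poor screen",
    "it arrived on Monday",
]
counts = run_job(posts)
print(counts)
```

On a real cluster the map and reduce functions would run in parallel on the nodes that hold the HDFS blocks; here they run sequentially to keep the data flow visible.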

Problem

Which features to use?

Words (unigrams)

Phrases/n-grams

Sentences

How to interpret features for sentiment detection?

Bag of words (IR)

Annotated lexicons (WordNet, SentiWordNet)

Syntactic patterns

Paragraph structure
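The feature choices listed above (unigrams, n-grams, bag of words) can be sketched in a few lines of Python. This is a minimal sketch assuming whitespace tokenization; the helper names `tokenize`, `ngrams` and `bag_of_words` are illustrative and do not come from the slides.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace (a simplifying assumption)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(text, n=1):
    """Count how often each n-gram occurs; word order beyond n tokens is discarded."""
    return Counter(ngrams(tokenize(text), n))


sentence = "the screen is not good"
unigrams = bag_of_words(sentence, 1)
bigrams = bag_of_words(sentence, 2)
print(sorted(unigrams))
print(sorted(bigrams))
```

Here the unigram bag contains "good" without its negation, while the bigram bag keeps "not good" as one feature, which is one reason phrase-level features are considered for sentiment detection.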

Challenges

Harder than topical classification, for which bag-of-words features perform well

Must consider other features due to…

Subtlety of sentiment expression

irony

expression of sentiment using neutral words

Domain/context dependence

words/phrases can mean different things in different contexts and domains

Effect of syntax on semantics
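To illustrate the effect of syntax on semantics, the small Python sketch below compares two sentences with opposite sentiment. The example sentences and the helper `bag` are assumptions made for illustration. Both sentences contain exactly the same words, so a unigram bag of words cannot tell them apart, while order-aware bigrams can.

```python
from collections import Counter


def bag(text, n=1):
    """Count contiguous n-token tuples after whitespace tokenization."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


favourable = "the film was good not bad"
unfavourable = "the film was bad not good"

# Same words, different order: the unigram bags are identical.
same_unigrams = bag(favourable) == bag(unfavourable)

# Bigrams keep local word order, so the two sentences now differ.
same_bigrams = bag(favourable, 2) == bag(unfavourable, 2)

print(same_unigrams, same_bigrams)
```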

Sentiment Analysis Trends

Year | Quantity | Highlighted Topics

2004 | 4 | Affective computing, sentiment classification, polarity

2005 | 10 | Contextual polarity, phrase level SA, sentiment classification, scores, subject classification

2006 | 10 | Lexicon, feature, summarization, mining, understanding, temporal SA, weighted polarity, user profiling based on SA

2007 | 43 | Lexicon, feature mining, emotion detection, clustering, conjuncts presence

2008 | 72 | Multi-lingual SA, ratings inference, feature mining, word orientation, SentiWordNet, rating weighting, radicalization detection, affective computing, compositional semantics analysis, sentiment-based prediction, concept hierarchy, classification

2009 | 131 | ML approaches for SA, user profiling based on SA, feature association, semantic association, visual SA, cross-linguistic SA, ontology-based SA, polarity lexicon, multi-entity scoring, affective computing

2010 | 216 | Orientation analysis, affective computing, linguistic models, applied visual for SA, semantic role labeling, clustering-based SA, cross-lingual SA, SA-based prediction, Twitter-based SA, global SentiWordNet, intensity classification, cross-domain SA, opinion question-answering, sentiment topic detection, language specific SA

2011 | 297 | Opinion leader identification, social network-based surveillance, product recommendation, terrorism informatics, affective computing, features clustering, political orientation detection, wish identification, sentiment lexicon, influence detection, personality mining, polarity analysis, graph based sentiment representation, semantic based SA, learning models for SA, emotion clustering, ontology based SA, sentence level SA, language specific SA

2012 | 454 | Linguistic features analysis, business and financial forecasting, attitude prediction, sentiment topic detection, verbs polarity disambiguation, SenticNet, semantic orientation, language specific SA, cross lingual SA, emotion recognition, social values and group identification

2013 | 562 | Multilingual, ML-based polarity detection, sentiment evolution modeling, aspect-based sentiment classification, social intelligence, SA-based prediction, computational analysis of public voice, emotion mining, SA-based customer care, security-related intelligence, graph extraction, social network-based SA, linguistic features, statistical approaches for SA, concept-level SA, correlational study between financial sentiment and prices in financial markets, subjectivity detection, cross-domain SA, opinion leaders identification

2014 | 216 | Feature-based SA through ontologies, concept-level SA based on dependency rules, word polarity disambiguation, aspect-oriented SA, sentence-level SA, graph clustering for SA, subjectivity analysis, word sentiment in WordNet 3.0, computational analysis of public voice

Slide annotations: text mining techniques; multilingual linguistics; applied linguistics

Approaches

Sentiment Analysis

Content-based

Polarity Detection Positive, Negative, Neutral

Strength Detection

Typically [-1,1]

SentiWordNet

Feature Mining

Unigram, Bigram

Syntactic, Lexical, Structural

Link-based

Stylistic

Affective Computation

Emotion Classification

Social Network

Influencer

Multilingual

Machine Learning

Naïve Bayes

Support Vector Machine
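The polarity and strength detection branches above can be illustrated with a small lexicon-based scorer in Python. This is only a sketch under stated assumptions: the `LEXICON` values are made up for illustration and do not follow SentiWordNet's actual format, and the neutral band `threshold` is an arbitrary choice. Averaging word scores that each lie in [-1, 1] keeps the strength within [-1, 1].

```python
# Hypothetical word scores in [-1, 1]; these values are invented for
# illustration and are not taken from SentiWordNet or the slides.
LEXICON = {
    "excellent": 0.9,
    "good": 0.6,
    "slow": -0.4,
    "terrible": -0.9,
}


def strength(text):
    """Average the lexicon scores of the known words; 0.0 if none are known."""
    tokens = text.lower().split()
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def polarity(score, threshold=0.1):
    """Map a strength in [-1, 1] to positive, negative or neutral."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


for review in [
    "excellent camera",
    "good camera but terrible battery",
    "it arrived today",
]:
    s = strength(review)
    print(review, round(s, 2), polarity(s))
```

A machine learning approach such as Naïve Bayes would instead learn the word weights from labeled examples rather than reading them from a fixed lexicon.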

Conclusion

This paper has discussed the trends in SA.

The area has gained even more focus at the climax of the big data era, even though it started before 2004. Advancements in big data technologies have also enabled this area to flourish.

Nevertheless, there is much room for improvement, such as maturing the big data technologies and increasing the alternatives for SA solutions that use the platform.

More infrastructure is also needed to let SA be exploited for many more applications besides the existing community-centric, product review-based and influence assessment applications.

Techniques for SA on cross-domain and multilingual datasets should also be explored.

Improvements for deeper semantic computation, such as the SenticNet approach, should also be expanded, besides enriching SentiWordNet for multilingual, more precise and multi-granular representation.