A Systematic Framework for Sentiment Identification by Modeling User Social Effects Kunpeng Zhang Assistant Professor Department of Information and Decision Sciences University of Illinois at Chicago


Post on 18-Jan-2016


A Systematic Framework for Sentiment Identification by Modeling User Social Effects

Kunpeng Zhang, Assistant Professor

Department of Information and Decision Sciences

University of Illinois at Chicago


A World-Class Education, A World-Class City

Agenda
• Introduction
• Problem statement
• Methodology
• Experiments and results
• Conclusion and future work


Co-authors
• Yi Yang, Ph.D. student at Northwestern University
• Aaron Sun, Research Scientist, Samsung Research America
• Hengchang Liu, Assistant Professor at University of Science and Technology of China


Introduction
• User-generated content on social media platforms
• Data analysis for intelligent marketing decisions
• Voice of consumers
– Positive / negative aspects


Problem Statement
• Given a sentence (usually user-generated content on social media platforms, such as comments on Facebook, tweets on Twitter, or reviews on Amazon.com), we classify it into one of three categories:
– Positive: directly or indirectly praises something, e.g. “I love it! (^_^)”
– Negative: directly or indirectly criticizes something, e.g. “We don’t like it at all.”
– Objective: expresses no sentiment or states a fact, e.g. “Apple will release a new iPhone in next two months.”


Previous Work
• Bag-of-words approaches
– Collecting keywords [5, 7, 21, 26]
• Rule-based methods
– From the perspective of language characteristics [6, 22]
• Machine-learning-based methods
– Sentence-level and document-level [7, 8, 10, 29]
• However,
– None of them considers user social effects…


Methodology
• Systematic framework
• Classification problem
• 4 major features:
– Peer influence
– User preference
– User profile
– Textual sentiment


Methodology 1 – User Preference (UserPref)
• User preference can partly reflect user sentiment.
• Item-based collaborative filtering on the user-item matrix
– Row: user (millions)
– Column: brand (thousands)
– The element $m_{ij}$ is 1 if user i “likes” brand j, otherwise 0

$$
M = \begin{pmatrix}
m_{11} & m_{12} & \cdots & m_{1n} \\
m_{21} & m_{22} & \cdots & m_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
m_{m1} & m_{m2} & \cdots & m_{mn}
\end{pmatrix}
$$

Note: “like” means liking a brand on Facebook, following a brand on Twitter, giving a high rating to a product on Amazon, etc.
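As an illustration of the matrix described on this slide, the following minimal sketch builds a binary user-item matrix from a list of (user, brand) "like" pairs. The function name `build_like_matrix` and the sample users are assumptions made for this example, not part of the talk. At the scale mentioned on the slide (millions of users), a sparse representation would be needed in place of nested lists.

```python
def build_like_matrix(likes, users, brands):
    """Return M as nested lists, with M[i][j] = 1 if user i "likes" brand j, else 0."""
    user_index = {user: i for i, user in enumerate(users)}
    brand_index = {brand: j for j, brand in enumerate(brands)}
    matrix = [[0] * len(brands) for _ in users]
    for user, brand in likes:
        matrix[user_index[user]][brand_index[brand]] = 1
    return matrix


# Hypothetical sample data: rows are users, columns are brands.
users = ["alice", "bob", "carol"]
brands = ["Mac", "iPhone", "DKNY"]
likes = [("alice", "Mac"), ("alice", "iPhone"), ("bob", "iPhone"), ("carol", "DKNY")]

M = build_like_matrix(likes, users, brands)
for user, row in zip(users, M):
    print(user, row)
```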


Methodology 1 – User Preference (UserPref)
• Two important issues when using collaborative filtering
– Data sparsity
  • Integrate multiple low-level items into fewer high-level items
    – “Mac” and “iPhone” → “Computer and Electronics”
– Similarity calculation and preference prediction
  • Which similarity measure is better?
    – Cosine, Pearson correlation, Tanimoto coefficient, log-likelihood-based, Euclidean distance-based
  • Weighted sum strategy to approximate user preference
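The three steps on this slide (aggregating items into categories, measuring item similarity, and predicting by a weighted sum) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the toy matrix, and the choice to normalize the weighted sum by the total absolute similarity are all assumptions, and only two of the listed similarity measures (cosine and Tanimoto) are shown.

```python
import math


def aggregate(matrix, item_to_category, categories):
    """Merge low-level item columns into high-level category columns (logical OR)."""
    cat_index = {cat: k for k, cat in enumerate(categories)}
    merged = [[0] * len(categories) for _ in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value:
                merged[i][cat_index[item_to_category[j]]] = 1
    return merged


def column(matrix, j):
    return [row[j] for row in matrix]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tanimoto(a, b):
    """Users who like both items divided by users who like at least one."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def predict_preference(matrix, user, item, similarity=cosine):
    """Weighted sum: similarity-weighted average of the user's values on other items."""
    target = column(matrix, item)
    numerator = denominator = 0.0
    for j in range(len(matrix[0])):
        if j == item:
            continue
        sim = similarity(target, column(matrix, j))
        numerator += sim * matrix[user][j]
        denominator += abs(sim)
    return numerator / denominator if denominator else 0.0


# Hypothetical item-level matrix: columns are Mac, iPhone, DKNY.
items = [[1, 0, 0], [0, 0, 1]]
merged = aggregate(
    items,
    ["Computer and Electronics", "Computer and Electronics", "Fashion"],
    ["Computer and Electronics", "Fashion"],
)

# Hypothetical category-level matrix: 4 users, 3 categories.
M = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [1, 0, 1]]
print(predict_preference(M, user=2, item=1, similarity=tanimoto))
```

Swapping the `similarity` argument is one way to run the "which measure is better?" comparison: predict held-out entries with each measure and compare the errors.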


Methodology 2 – Peer Influence (PeerInf)
• Herding behavior in social psychology
– We assume that if most of the previous comments in a discussion are positive, the next comment is likely to be positive, and similarly for the negative case.
– We randomly pick 1,000 posts from 5 different Facebook pages and 1,000 discussion threads from 5 different airlines on the Flyertalk.com forum. The average number of comments per post and per thread is 794 and 32, respectively.
– The sentiments are identified by the state-of-the-art textual algorithm.
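The herding assumption above can be turned into a simple feature: the share of positive and negative labels among the earlier comments in the same discussion. The sketch below is a hypothetical illustration only. The function names and the "more than half" reading of "most" are assumptions, and the model actually used in the talk is not recoverable from this transcript.

```python
from collections import Counter


def peer_ratios(previous_labels):
    """Return (positive share, negative share) among earlier comments in a discussion."""
    if not previous_labels:
        return 0.0, 0.0
    counts = Counter(previous_labels)
    total = len(previous_labels)
    return counts["positive"] / total, counts["negative"] / total


def herding_prior(previous_labels):
    """Return the sentiment suggested by herding, or None if there is no clear majority."""
    positive, negative = peer_ratios(previous_labels)
    if positive > 0.5:  # assumption: "most" means more than half
        return "positive"
    if negative > 0.5:
        return "negative"
    return None


# Hypothetical thread: labels of the comments posted before the one being classified.
thread = ["positive", "positive", "negative", "objective"]
print(peer_ratios(thread))
print(herding_prior(thread[:3]))
```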


Methodology 2 – Peer Influence


Methodology 2 – Peer Influence Modeling


Methodology 3 – User Profile (GenCat)
• Female users are more positive than male users, and the fashion page has a higher percentage of positive sentiments than the politician page on Facebook and Twitter.

Name (Topic)               Gender  Positive ratio  Number of comments + tweets
Barack Obama (Politician)  M       0.61            6,837,096
                           F       0.69
Chicago Bulls (Sports)     M       0.68            462,092
                           F       0.79
DKNY (Fashion)             M       0.94            14,284
                           F       0.96


Methodology 4 – Textual Sentiment (TextSent)
• State-of-the-art textual sentiment identification algorithm
• Ensemble method integrating three individual algorithms
– Semantic rules based on language characteristics
– Numeric strength computing
– Bag-of-words
• Accuracy: ~86%
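To make the ensemble idea concrete, here is a toy sketch that combines three stand-in classifiers by majority vote. Everything in it is an assumption for illustration: the tiny `STRENGTH` lexicon, the negation rule, the three component functions, and the majority-vote combination itself. The slide does not say how the three algorithms are combined, and these toy components are not the ones behind the reported ~86% accuracy.

```python
from collections import Counter

# Hypothetical lexicon: word -> sentiment strength (sign gives polarity).
STRENGTH = {"love": 2.0, "like": 1.0, "good": 1.0, "bad": -1.0, "hate": -2.0}
NEGATIONS = {"not", "don't", "never"}


def to_label(score):
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "objective"


def tokenize(text):
    return [word.strip(".,!?\"'") for word in text.lower().split()]


def rule_based(tokens):
    """Toy semantic rule: a negation flips the polarity of the next sentiment word."""
    score, negate = 0, False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
        elif token in STRENGTH:
            polarity = 1 if STRENGTH[token] > 0 else -1
            score += -polarity if negate else polarity
            negate = False
    return to_label(score)


def numeric_strength(tokens):
    """Toy strength score: sum of word strengths, with negation flipping the sign."""
    total, sign = 0.0, 1.0
    for token in tokens:
        if token in NEGATIONS:
            sign = -1.0
        elif token in STRENGTH:
            total += sign * STRENGTH[token]
            sign = 1.0
    return to_label(total)


def bag_of_words(tokens):
    """Toy keyword count: positive words minus negative words, ignoring word order."""
    positives = sum(1 for t in tokens if STRENGTH.get(t, 0) > 0)
    negatives = sum(1 for t in tokens if STRENGTH.get(t, 0) < 0)
    return to_label(positives - negatives)


def ensemble(text):
    """Majority vote over the three toy classifiers; no majority means "objective"."""
    tokens = tokenize(text)
    votes = Counter(f(tokens) for f in (rule_based, numeric_strength, bag_of_words))
    label, count = votes.most_common(1)[0]
    return label if count >= 2 else "objective"


for sentence in ["I love it!", "We don't like it at all.", "Apple will release a new iPhone."]:
    print(sentence, "->", ensemble(sentence))
```

The second sentence shows why an ensemble can help: the keyword counter alone votes positive because it ignores the negation, but the other two components outvote it.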


Experiments and Results
• Data collection
– Facebook: posts, comments, likes, user profiles
– Twitter: tweets, followers, user profiles
– Amazon: products and reviews
– Flyertalk (airline discussion forum): discussions
• Data cleaning
– Remove spam users


Experiments and Results
• The features of the learning model for the 4 datasets and their differences. Topic is modified based on the raw Facebook category. “×”: missing; “√”: available.

Data source  TextSent             UserPref                     PeerInf  GenCat (Gender)  GenCat (Topic)
Facebook     Comments             User-post likes on category  √        √                Predefined category
Twitter      Tweets               User-category following      √        √                Predefined category
Amazon       Product reviews      User-product rating          √        ×                Product category
Flyertalk    Airline discussions  ×                            √        ×                Airline types


Experiments and Results
• Similarity measure check
– MAE and RMSE are used to compare the average estimation error between the real preference and the predicted preference
• Hadoop-based collaborative filtering implemented with Mahout
– Takes 34 and 21 minutes to approximate user preferences for Facebook and Twitter, respectively
– Cannot complete within 10 hours on a single CPU
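For reference, the two error measures named above are the standard mean absolute error and root mean squared error. The sketch below computes both for a list of real and predicted preferences. The sample values are made up for illustration and are not results from the talk.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between real and predicted preferences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large errors more heavily than MAE."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical held-out preferences (0 or 1) and predictions from one similarity measure.
actual = [1, 0, 1, 1]
predicted = [0.8, 0.3, 0.6, 1.0]
print(mae(actual, predicted), rmse(actual, predicted))
```

Running this once per similarity measure on the same held-out entries gives a like-for-like comparison; lower values on both measures indicate the better measure.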


Experiments and Results
• Facebook data
• Twitter data
• Amazon.com data


Experiments and Results
• Classification accuracy (SS: semantic + syntactic features used in [28])


Conclusion and Future Work
• We propose a systematic framework to identify social media sentiments by modeling user social effects: user preference, peer influence, user profile, and textual sentiment itself.
• However,
– More networked data could be incorporated.
– More efficient algorithms are needed to calculate user preference.


Thank you