![Page 1: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/1.jpg)
“Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics”
Authors: Anindya Ghose, Panagiotis G. Ipeirotis, Member, IEEE
Course: Topics in Data Mining
Presenter: Nobal Niraula
December 8, 2010 @ UOM
![Page 2: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/2.jpg)
Outline
Introduction
Gathering variables (attributes)
Explanatory study using econometric regression
◦ Hypothesis for sales
◦ Hypothesis for perceived usefulness
Prediction
◦ Helpfulness
◦ Impact on sales
Conclusion
![Page 3: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/3.jpg)
Introduction (1)
Product-related word-of-mouth conversations in online markets
Reviewers contribute time and energy
The volume of reviews can be high
Benefits
◦ Customers: usefulness / helpfulness
    ◦ Average star rating: bimodal
    ◦ Peer review: biased
    ◦ Helpfulness = helpful votes / total votes
    ◦ “Spotlight Review” on Amazon.com
◦ Manufacturers: influence on sales
    ◦ Helpful reviews are not necessarily the ones that lead to increases in sales!
    ◦ Reviews that affect sales the most should be presented first to manufacturers
![Page 4: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/4.jpg)
Introduction (2)
The paper is unique in looking at how subjectivity level, readability and spelling errors in the text of reviews affect product sales and the perceived helpfulness of these reviews.
![Page 5: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/5.jpg)
Predicting Helpfulness and Importance
Two-Level Study
◦ Explanatory econometric analysis
    ◦ Identify which aspects of a review and of its reviewer matter
◦ Prediction model using “Random Forests”
    ◦ How peer consumers are going to rate a review
    ◦ How sales will be affected by the posted review
![Page 6: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/6.jpg)
Product Reviews
![Page 7: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/7.jpg)
Sample Review
![Page 8: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/8.jpg)
Reviewer’s Profile
![Page 9: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/9.jpg)
Product Rank
![Page 10: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/10.jpg)
Variables Collection
Products
◦ Audio and video players (144 products)
◦ Digital cameras (109 products)
◦ DVDs (158 products)
Product and Sales Data: Retail Price, Sales Rank, Average Rating, Number of Reviews, Elapsed Date
Reviewer History: Number of Past Reviews, Reviewer History Macro, Reviewer History Micro, Past Helpful Votes, Past Total Votes
Reviewer Characteristics: Reviewer Rank, Top-10 Reviewer, Top-50 Reviewer, Top-100 Reviewer, Top-500 Reviewer, Real Name, Nick Name, Hobbies, Birthday, Location, Web Page, Interests, Snippet, Any Disclosure
Individual Review: Moderate Review, Helpful Votes, Total Votes, Helpfulness
Review Readability: Length (Chars), Length (Words), Length (Sentences), Spelling Errors, ARI, Gunning Index, Coleman–Liau Index, Flesch Reading Ease, Flesch–Kincaid Grade Level, SMOG
Review Subjectivity: AvgProb, DevProb
![Page 11: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/11.jpg)
Text of a Review Matters!
Readability Analysis
◦ Automated Readability Index
◦ Coleman–Liau Index
◦ Flesch–Kincaid Grade Level
◦ Gunning fog index
◦ SMOG
Subjectivity Analysis
◦ Stylistic choices: “subjective” vs. “objective”
◦ Each document gets a “subjectivity score”
    ◦ AvgProb(r): a high value means many subjective sentences
    ◦ DevProb(r): a high value means a mix of subjective and objective sentences
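As a rough illustration of the two feature families on this slide, the sketch below computes the Automated Readability Index with its standard formula, and derives AvgProb and DevProb as the mean and standard deviation of per-sentence subjectivity probabilities. The function names, the naive sentence splitter, and the choice of population standard deviation are assumptions of this sketch. The per-sentence probabilities are taken as given inputs, since the slides do not say which classifier produces them.

```python
import re
from statistics import mean, pstdev


def automated_readability_index(text: str) -> float:
    """ARI = 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    # Naive sentence split on terminal punctuation (an assumption of this sketch).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    chars = sum(len(w) for w in words)  # letters and digits only
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def subjectivity_features(sentence_probs):
    """Return (AvgProb, DevProb) from per-sentence subjectivity probabilities.

    AvgProb is the mean probability; DevProb is the (population) standard
    deviation, so a review mixing subjective and objective sentences scores high.
    """
    return mean(sentence_probs), pstdev(sentence_probs)


if __name__ == "__main__":
    print(automated_readability_index("The cat sat. The dog ran."))
    print(subjectivity_features([0.9, 0.9, 0.9]))  # uniformly subjective
    print(subjectivity_features([0.1, 0.9]))       # mixed review
```

A uniformly subjective review gives a high AvgProb with DevProb near zero, while a review that alternates between objective and subjective sentences gives a high DevProb.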
![Page 12: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/12.jpg)
Hypothesis for Sales
Hypothesis 1a:
◦ All else equal, a change in the subjectivity level and mixture of objective and subjective statements in reviews will be associated with a change in sales.
Hypothesis 1b:
◦ All else equal, a change in the readability score of reviews will be associated with a change in sales.
Hypothesis 1c:
◦ All else equal, a decrease in the proportion of spelling errors in reviews will be positively related to sales.
![Page 13: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/13.jpg)
Effect on Product Sales
ln(D) = a + b * ln(S)
◦ D is the unobserved product demand
◦ S is its observed sales rank
◦ Pareto distribution
◦ High sales rank → low demand
Key Observation:
◦ “Sales rank” on Amazon.com can be taken as a PROXY for demand!
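To make the rank-to-demand relation concrete, the sketch below inverts ln(D) = a + b * ln(S) to D = exp(a) * S^b. The coefficients `A` and `B` are hypothetical values chosen only for illustration and are not estimates from the paper. A negative `b` is assumed because the slide states that a high sales rank corresponds to low demand.

```python
import math


def demand_from_rank(sales_rank: float, a: float, b: float) -> float:
    """Invert ln(D) = a + b * ln(S) to obtain D = exp(a) * S ** b."""
    if sales_rank <= 0:
        raise ValueError("sales rank must be positive")
    return math.exp(a + b * math.log(sales_rank))


# Hypothetical coefficients, for illustration only (not from the paper).
A, B = 10.0, -0.8

if __name__ == "__main__":
    for rank in (1, 100, 10_000):
        # Demand shrinks as the rank number grows because B is negative.
        print(rank, round(demand_from_rank(rank, A, B), 2))
```

Because the relation is log-linear, a change in ln(sales rank) maps to a proportional change in ln(demand), which is why sales rank can stand in for demand in the regressions.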
![Page 14: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/14.jpg)
Descriptive Statistics for Econometric Analysis
![Page 15: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/15.jpg)
Model to Test Hypothesis 1
μ_k is a product fixed effect that accounts for unobserved heterogeneity across products, and ε_kt is the error term.
Control variables: Retail Price, Avg. Numeric Rating, Elapsed Date, Number of Reviews
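The regression equation itself was shown as an image on this slide and is not reproduced here. As a generic illustration of what a product fixed effect does, the sketch below implements a within (demeaning) estimator for a single regressor: subtracting each product's mean from both variables removes μ_k before the slope is estimated. This is a textbook technique under simplifying assumptions, not the paper's full specification, and all names and the toy data are hypothetical.

```python
from collections import defaultdict


def within_estimator(product_ids, x, y):
    """Slope of y on x after removing per-product means (product fixed effects)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # per product: sum_x, sum_y, count
    for k, xi, yi in zip(product_ids, x, y):
        s = sums[k]
        s[0] += xi
        s[1] += yi
        s[2] += 1
    num = den = 0.0
    for k, xi, yi in zip(product_ids, x, y):
        sx, sy, n = sums[k]
        dx, dy = xi - sx / n, yi - sy / n  # deviations from the product mean
        num += dx * dy
        den += dx * dx
    return num / den


# Toy data (hypothetical): y = mu_k + 2.0 * x, with a different mu_k per product.
products = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
mu = {"A": 5.0, "B": -1.0}
y = [mu[k] + 2.0 * xi for k, xi in zip(products, x)]
beta = within_estimator(products, x, y)

if __name__ == "__main__":
    print(beta)
```

The recovered slope does not depend on the product-specific intercepts, which is the sense in which the fixed effect absorbs unobserved differences across products.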
![Page 16: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/16.jpg)
Empirical Results for Product Sales
Note:
1. A negative (-ve) coefficient means a decrease in sales rank, i.e., an increase in sales
2. Variables that increase sales: AvgProb, Readability, Spelling Errors
3. Variables that decrease sales: Retail Price, DevProb
Also: reviews with a rating <= 2 are associated with increased sales
![Page 17: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/17.jpg)
Conclusion (1)
Hypothesis 1a:
◦ A high level of subjective sentences increases sales
◦ A mixture of subjective and objective sentences is negatively associated with product sales, compared to highly subjective or highly objective sentences.
Hypothesis 1b:
◦ Higher readability scores are associated with higher sales
Hypothesis 1c:
◦ An increase in the proportion of spelling mistakes decreases product sales for some “experience products” like DVDs; however, the proportion of spelling errors does not have a significant impact on sales for “search products”
Reviews that rate products negatively can be associated with increased product sales when the review text is informative and detailed!
![Page 18: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/18.jpg)
Hypothesis for Helpfulness
Hypothesis 2a:
◦ All else equal, a change in the subjectivity level and mixture of objective and subjective statements in a review will be associated with a change in the perceived helpfulness of that review.
Hypothesis 2b:
◦ All else equal, a change in the readability of a review will be associated with a change in the perceived helpfulness of that review.
Hypothesis 2c:
◦ All else equal, a decrease in the proportion of spelling errors in a review will be positively related to the perceived helpfulness of that review.
Hypothesis 2d:
◦ All else equal, an increase in the average helpfulness of a reviewer’s historical reviews will be positively related to the perceived helpfulness of a review posted by that reviewer.
![Page 19: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/19.jpg)
Effect on Helpfulness
μ_k is a product fixed effect that controls for differences in the average helpfulness of reviews across products, and ε_kt is the error term.
![Page 20: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/20.jpg)
Empirical Results for Helpfulness
Note:
(-ve) → lower helpfulness
Negative relations: AvgProb, Spelling Error, Moderate
Positive relations: DevProb **, Disclosure, Readability, Reviewer History Macro, Number of Reviews
![Page 21: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/21.jpg)
Conclusion (2)
Hypothesis 2a:
◦ In general, a mixture of subjective and objective elements is considered more informative (helpful) by users.
◦ For feature-based goods, users prefer reviews with more objective information and fewer subjective sentences
◦ For experience goods, e.g., DVDs, users expect few objective sentences but more subjective sentences
Hypotheses 2b–2d:
◦ An increase in the readability of reviews has a positive and statistically significant impact on review helpfulness
◦ An increase in the proportion of spelling errors has a negative and statistically significant impact on review helpfulness for audio-video products and DVDs.
◦ Past historical information about reviewers has a statistically significant effect on the perceived helpfulness of reviews
![Page 22: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/22.jpg)
Predictive Modeling
Main goal
◦ Is the review informative or not?
◦ Does the review have an impact on sales or not?
Question: given the helpfulness value of a review, decide whether it is useful or not
◦ Helpfulness = (helpful votes / total votes)
◦ Continuous-to-binary conversion
◦ The threshold found is 60%
Classification
◦ A regression model can be used
◦ Binary classification
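The continuous-to-binary conversion on this slide can be sketched as follows. The helpfulness ratio and the 60% threshold come from the slide. The function names, the treatment of reviews with no votes (returning `None`), and the choice of `>=` rather than `>` at the threshold are assumptions of this sketch.

```python
from typing import Optional


def helpfulness(helpful_votes: int, total_votes: int) -> Optional[float]:
    """Helpfulness = helpful votes / total votes; None when there are no votes."""
    if total_votes == 0:
        return None  # undefined ratio; how to handle this case is an assumption
    return helpful_votes / total_votes


def is_helpful(helpful_votes: int, total_votes: int,
               threshold: float = 0.6) -> Optional[bool]:
    """Binary label: True when the helpfulness ratio reaches the threshold."""
    score = helpfulness(helpful_votes, total_votes)
    if score is None:
        return None
    return score >= threshold


if __name__ == "__main__":
    print(is_helpful(9, 10))  # 0.90 is above the 60% threshold
    print(is_helpful(1, 4))   # 0.25 is below the 60% threshold
    print(is_helpful(0, 0))   # no votes, so no label
```

Turning the ratio into a label is what lets the helpfulness task be treated as binary classification instead of regression.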
![Page 23: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/23.jpg)
Predicting Helpfulness (1)
Classifiers
◦ SVM vs. Random Forest
SVM consistently performed worse, unlike what has been reported in other reports
Training time for SVM was significantly higher than that of Random Forest
![Page 24: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/24.jpg)
Predicting Helpfulness (2)
![Page 25: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/25.jpg)
Predicting Impact on Sales
Examining whether the difference SalesRank_{t(r)+T} − SalesRank_{t(r)}, where t(r) is the time the review is posted, is positive or negative.
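The target for the sales-impact prediction can be sketched as the sign of the sales-rank difference described above. The function names and label strings are hypothetical. The reading that a lower rank number means higher sales follows the earlier slides, where sales rank serves as a proxy for demand.

```python
def sales_rank_change(rank_at_post: int, rank_after: int) -> int:
    """SalesRank_{t(r)+T} - SalesRank_{t(r)} for one review."""
    return rank_after - rank_at_post


def sales_impact_label(rank_at_post: int, rank_after: int) -> str:
    """Classify the sign of the rank difference; a lower rank means higher sales."""
    diff = sales_rank_change(rank_at_post, rank_after)
    if diff < 0:
        return "sales up"      # rank number fell, so sales improved
    if diff > 0:
        return "sales down"    # rank number rose, so sales worsened
    return "no change"


if __name__ == "__main__":
    print(sales_impact_label(500, 320))
    print(sales_impact_label(500, 900))
```
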
![Page 26: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/26.jpg)
Conclusion (3)
Random Forest based prediction
◦ For experience goods such as DVDs, the classifier has lower performance
◦ Observed high correlation of “classification error” with “distribution of review ratings”
◦ Products that have received widely fluctuating ratings also have reviews with widely fluctuating helpfulness votes.
◦ Highly detailed and readable reviews can have low helpfulness votes
◦ The “reviewer-related”, “review subjectivity” and “review readability” feature sets are interchangeable!
![Page 27: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/27.jpg)
Overall Conclusion
Subjectivity level, readability and spelling errors in the text of reviews affect product sales and the perceived helpfulness of those reviews.
![Page 28: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/28.jpg)
References
Anindya Ghose, Panagiotis G. Ipeirotis, "Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics," IEEE Transactions on Knowledge and Data Engineering, vol. 99, no. PrePrints, 2010
![Page 29: Estimating the Helpfulness and Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics](https://reader036.vdocuments.net/reader036/viewer/2022081519/5562cca8d8b42a63498b4687/html5/thumbnails/29.jpg)
Thank You !