TRANSCRIPT
AN EFFECTIVE STATISTICAL APPROACH TO BLOG POST OPINION RETRIEVAL
Ben He
Craig Macdonald
Iadh Ounis
University of Glasgow
Jiyin He
University of Amsterdam
CIKM 2008
Introduction
Finding opinionated blog posts is still an open problem.
A popular solution is to rely on external resources and manual effort to identify subjective features.
The authors propose a dictionary-based statistical approach that automatically derives evidence of subjectivity from the blog collection itself, without requiring any manual effort.
TREC Opinion Finding Task (1/2)
TREC: Text REtrieval Conference. Goal: to identify sentiment at the document level. The dataset is composed of:
Feed documents: XML format, usually a short summary of the blog post.
Permalink documents: HTML format, the complete blog post and its comments.
Homepage documents: HTML format, main entry to the blog.
TREC Opinion Finding Task (2/2)
Sample query format:
<top>
<num> 863
<title> netflix
<desc> Identify documents that show customer opinions of Netflix.
<narr> A relevant document will indicate subscriber satisfaction with Netflix. Opinions about the Netflix DVD allocation system, promptness or delay in mailings are relevant. Indications of having been or intent to become a Netflix subscriber that do not state an opinion are not relevant.
</top>
Statistical Dictionary-based Approach
Dictionary Generation
The Skewed Query Model:
Rank all terms in the collection by term frequency in descending order.
The terms whose ranks fall within the range (S·#terms, U·#terms) are selected into the dictionary.
#terms: the number of unique terms in the collection.
S, U: model parameters. S = 0.00007 and U = 0.001 in this paper.
Dictionary Generation
Ex: #terms = 200,000
#terms × 0.00007 = 14
#terms × 0.001 = 200
Only those terms ranked 14 to 200 will be preserved.
The dictionary is not necessarily opinionated.
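The selection step above can be sketched in a few lines of Python. The function name and the naive whitespace tokenization are illustrative, not from the paper:

```python
from collections import Counter

def build_dictionary(docs, S=0.00007, U=0.001):
    # Skewed Query Model: rank all terms by collection frequency
    # (descending) and keep those ranked between S*#terms and U*#terms.
    freq = Counter()
    for doc in docs:
        freq.update(doc.split())        # naive whitespace tokenization
    ranked = [t for t, _ in freq.most_common()]
    n_terms = len(ranked)               # number of unique terms
    lo = int(S * n_terms)               # e.g. 200,000 * 0.00007 = 14
    hi = int(U * n_terms)               # e.g. 200,000 * 0.001   = 200
    return ranked[lo:hi]
```

Skipping the very top-ranked terms discards stopwords, while the upper cut-off U·#terms discards rare terms with unreliable statistics.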
Term Weighting (1/2)
KL divergence method
D(Rel): the collection of relevant documents.
D(opRel): the collection of opinionated and relevant documents.
c(D(opRel)): the number of tokens in the opinionated documents.
c(D(Rel)): the number of tokens in the relevant documents.
tf_x: the frequency of the term t in the opinionated documents.
tf_rel: the frequency of the term t in the relevant documents.
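The equation itself did not survive the transcript; a common form of the KL divergence weight consistent with these definitions is w(t) = (tf_x / c(D(opRel))) · log2((tf_x / c(D(opRel))) / (tf_rel / c(D(Rel)))). A sketch under that assumption (function name illustrative):

```python
import math

def kl_weight(tf_x, tf_rel, c_oprel, c_rel):
    # Contribution of term t to the KL divergence of the term
    # distribution of D(opRel) from that of D(Rel).
    p_op = tf_x / c_oprel      # probability of t in opinionated relevant docs
    p_rel = tf_rel / c_rel     # probability of t in relevant docs
    if p_op == 0 or p_rel == 0:
        return 0.0
    return p_op * math.log2(p_op / p_rel)
```

Terms whose relative frequency in D(opRel) exceeds their relative frequency in D(Rel) receive positive weight.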
Term Weighting (2/2)
Bose-Einstein statistics method
Measures how informative a term is in the set D(opRel) against D(Rel).
F_rel: the frequency of the term t in D(Rel).
N_rel: the number of documents in D(Rel).
tf_x: the frequency of the term t in D(opRel).
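The formula is likewise missing from the transcript. A Bo1-style Bose-Einstein weight from the Divergence From Randomness framework matches these quantities; treat the exact form below as an assumption:

```python
import math

def bose_einstein_weight(tf_x, f_rel, n_rel):
    # lam: mean frequency of term t in D(Rel), i.e. its expected
    # frequency under the geometric (Bose-Einstein) distribution.
    lam = f_rel / n_rel
    # Informativeness of observing t with frequency tf_x in D(opRel).
    return tf_x * math.log2((1 + lam) / lam) + math.log2(1 + lam)
```

The weight grows with tf_x, so terms that are frequent in the opinionated set but rare on average in the relevant set score highest.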
Generating the Opinion Score
Take the X top-weighted terms from the opinion dictionary. X will be tuned in the training step.
Submit them to the retrieval system as a query Qopn.
Score(d,Qopn): the opinion score of document d.
Score(d,Q): the initial ranking score.
Score Combination
Linear combination:
Log combination:
a, k will be tuned in the training step.
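The two combination formulas are not legible in the transcript. The sketch below shows one plausible reading: linear interpolation with a ∈ [0, 1], and a log-damped boost scaled by k. Treat both forms as assumptions rather than the paper's exact equations:

```python
import math

def linear_combine(score_rel, score_opn, a):
    # Linear combination of the initial ranking score Score(d,Q) and
    # the opinion score Score(d,Qopn); assumes comparable score ranges.
    return a * score_rel + (1 - a) * score_opn

def log_combine(score_rel, score_opn, k):
    # Log combination: relevance score boosted by a damped opinion score.
    return score_rel + k * math.log(1 + score_opn)
```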
Experiment Settings (1/3)
TREC06: 50 topics for training. TREC07: 50 topics for testing.
Only the “title” field is used (1.74 words/topic).
Baseline 1: apply the InLB model, a variation of the BM25 ranking function, to retrieve as many relevant documents as possible.
Experiment Settings (2/3)
Baseline 2: favor documents where the query terms appear in close proximity.
Q2: the set of all query term pairs in query Q.
N: the number of documents in the collection.
T: the number of tokens in the collection.
pfn: the normalized frequency of the tuple p.
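As a concrete illustration of the raw quantity behind pfn, the sketch below counts how often one query-term pair co-occurs within a small window. The window size and the counting scheme are assumptions; the actual baseline further normalizes this frequency and aggregates over all pairs in Q2:

```python
def pair_frequency(tokens, term_pair, window=5):
    # Count co-occurrences of an unordered term pair within `window`
    # positions of each other in a tokenized document.
    a, b = term_pair
    positions_a = [i for i, t in enumerate(tokens) if t == a]
    positions_b = [i for i, t in enumerate(tokens) if t == b]
    return sum(1 for i in positions_a
                 for j in positions_b
                 if 0 < abs(i - j) < window)
```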
Experiment Settings (3/3)
For comparison, an external dictionary is manually collected from OpinionFinder and several other resources.
It contains approximately 12,000 English words, mostly adjectives, adverbs, and nouns.
Experiment: Term Weighting (1/2)
Hypothesis: the most opinionated terms for one query set are also good indicators of opinion for other queries.
Sampling: from the training set of 50 topics, draw 10 sample sets (Set1, Set2, …, Set10), each with 25 topics; the maximum overlap between any two samples is 65%.
For each sample set, calculate the weight of each term.
Experiment: Term Weighting (2/2)
Compute the cosine similarity between the weight vectors of the top 100 weighted terms from each pair of samples.
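The similarity computation can be sketched as follows, treating each sample's top-100 terms as a sparse weight vector (the helper name is illustrative):

```python
import math

def cosine_similarity(w1, w2):
    # w1, w2: dicts mapping terms to weights, e.g. the top-100
    # weighted terms of two topic samples.
    dot = sum(w1[t] * w2[t] for t in set(w1) & set(w2))
    n1 = math.sqrt(sum(v * v for v in w1.values()))
    n2 = math.sqrt(sum(v * v for v in w2.values()))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)
```

A similarity near 1 across sample pairs supports the hypothesis that the weighted terms generalize across query sets.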
Experiment: Validation (1/3)
Tuning the parameters X, a and k mentioned before.
Select X by maximizing the mean MAP over the 10 samples.
Experiment: Validation (2/3)
From the 50-topic training set, Set1 is used for assigning term weights, and Set1' is used for validation.
Experiment: Validation (3/3)
Fix X = 100, then tune a and k: a within [0, 1], step = 0.05; k within (0, 1000], step = 50.
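The tuning step reduces to an exhaustive grid search. A minimal sketch, assuming a hypothetical evaluate_map(a, k) that returns the mean MAP over the validation samples:

```python
def grid_search(evaluate_map):
    # Exhaustive search with X fixed at 100:
    # a in [0, 1] with step 0.05, k in (0, 1000] with step 50.
    best_a, best_k, best_map = None, None, float("-inf")
    for ai in range(21):                  # a = 0.00, 0.05, ..., 1.00
        a = ai * 0.05
        for k in range(50, 1001, 50):     # k = 50, 100, ..., 1000
            score = evaluate_map(a, k)
            if score > best_map:
                best_a, best_k, best_map = a, k, score
    return best_a, best_k, best_map
```

The grid is small (21 × 20 = 420 evaluations), so exhaustive search is cheap relative to retrieval itself.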
Experiment: Evaluation (1/3)
Experiment: Evaluation (2/3)
Experiment: Evaluation (3/3)
Comparison with OpinionFinder: all else being equal, replace the opinion score Score(d,Qopn) with the score derived from the external dictionary.
Conclusion
An effective and practical approach to retrieving opinionated blog posts without manual effort.
Opinion scores are computed during indexing, so the computational cost is negligible.
The automatically generated internal dictionary performs as well as the external dictionary.
Different random samples from the collection reach a high consensus on the opinionated terms if the Bose-Einstein statistics given by the geometric distribution are applied.
Thank you for listening!