Tweet Recommendation with
Graph Co-Ranking
Rui Yan, Mirella Lapata, Xiaoming Li
ACL 2012
Reader:
Aizawa Laboratory, The University of Tokyo
Yoshinari Fujinuma
Motivation
• Three problems related to tweet recommendation
– Linkage between following and retweeting
– Whether a tweet interests the user
– Personalization and diversity
Related Work
• Collaborative Filtering [Hannon et al. 2010]
• Selecting tweets including URLs [Chen et
al. 2010]
– And so on…
• Co-Ranking Framework: measures scientific impact by modeling the relationship between authors and their publications [Zhou et al., 2007].
What is Proposed in this Paper
• Adapting Co-Ranking framework to Tweet
recommendation
• Including personalization
Graphs
• Tweet Graph
• Author Graph
• Tweet-Author Graph
Co-Ranking Algorithm
• Simultaneously rank tweets and their
authors
– A tweet is important if it is associated with other important tweets
– A user is important if they are associated with other important users and they write important tweets
Components of Co-Ranking
• Popularity (PageRank [Brin and Page 1998])
• Personalization (PersRank)
– Modifying PageRank
• Diversity (DivRank [Mei et al. 2010])
– Avoid assigning only high scores to closely
connected nodes
– Popular nodes become more popular (rich-get-richer)
Popularity: PageRank
• (1-μ): probability of continuing the random walk
• μ: probability of jumping to a vertex chosen uniformly at random
• m: ranking scores for the vertices in the Tweet graph
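The bullets above describe the standard damped random walk. A minimal Python sketch, assuming the usual form m = (1-μ) Mᵀm + μ/n with a row-stochastic transition matrix M. The function name `pagerank`, the toy graph, and the value μ = 0.15 are illustrative assumptions, not taken from the paper.

```python
def pagerank(M, mu=0.15, iters=100):
    """Power iteration for m = (1 - mu) * M^T m + mu / n.

    M is assumed row-stochastic: M[i][j] is the probability of
    walking from vertex i to vertex j.
    """
    n = len(M)
    m = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        m = [
            # follow the walk with prob. (1 - mu), jump uniformly with prob. mu
            (1 - mu) * sum(M[i][j] * m[i] for i in range(n)) + mu / n
            for j in range(n)
        ]
    return m


# Toy graph (assumption): vertices 0 and 1 both point to 2; 2 points back to both.
M = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
]
scores = pagerank(M)
```

In this toy graph vertex 2 receives all incoming links from the other two vertices, so it ends up with the highest score.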
Personalization (1/2)
• Used Latent Dirichlet Allocation to construct
the matrix D
• D_ij: probability that tweet m_i belongs to topic t_j
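• Image of D (axes labeled Topics and Tweets)
$$D = \begin{pmatrix} D_{11} & \cdots & D_{1n} \\ \vdots & \ddots & \vdots \\ D_{m1} & \cdots & D_{mn} \end{pmatrix}$$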
Personalization (2/2)
• r: ri = the probability for a user to respond
to tweet mi
• Estimate t: topic interest vector by
maximum likelihood
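One way to picture how r could enter the ranking is sketched below. It assumes that r is obtained by combining D with the topic interest vector t (r proportional to D t) and that r replaces the uniform jump of PageRank. Both are assumptions made for illustration; the function names, the toy values of D and t, and μ = 0.15 are hypothetical.

```python
def response_probabilities(D, t):
    """r_i proportional to sum_j D[i][j] * t[j], normalized to sum to 1 (assumed form)."""
    raw = [sum(d_ij * t_j for d_ij, t_j in zip(row, t)) for row in D]
    total = sum(raw)
    return [x / total for x in raw]


def personalized_rank(M, r, mu=0.15, iters=100):
    """PageRank-style walk whose random jump lands on vertex j with probability r[j]."""
    n = len(M)
    m = [1.0 / n] * n
    for _ in range(iters):
        m = [
            (1 - mu) * sum(M[i][j] * m[i] for i in range(n)) + mu * r[j]
            for j in range(n)
        ]
    return m


# Toy data (assumption): 3 tweets, 2 topics; the user mostly cares about topic 0.
D = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
t = [0.8, 0.2]
r = response_probabilities(D, t)

# Fully connected tweet graph with equal transition probabilities.
M = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
scores = personalized_rank(M, r)
```

Because the toy tweet graph is symmetric, any difference between the final scores comes only from the personalized jump: tweet 0, which is mostly about the preferred topic, ranks first.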
Diversity: DivRank
• Transition probabilities change over time
• Favors popular nodes as time goes by
• After z iterations, M is
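The idea of transition probabilities that change over time can be sketched with the pointwise variant of DivRank, in which the current score of a vertex stands in for its visit count. This is a sketch under that assumption, not the exact formula from the slide; the function name, the toy graph, and μ = 0.15 are illustrative.

```python
def divrank(P0, mu=0.15, iters=100):
    """Pointwise DivRank sketch.

    P0 is the initial row-stochastic transition matrix (self-loops included).
    At each iteration the transition u -> v is reweighted by the current
    score of v, so vertices that are already popular attract more of the walk.
    """
    n = len(P0)
    prior = 1.0 / n  # uniform jump distribution
    score = [1.0 / n] * n
    for _ in range(iters):
        new = [0.0] * n
        for u in range(n):
            # normalizer of the reinforced transitions leaving u
            norm = sum(P0[u][v] * score[v] for v in range(n))
            for v in range(n):
                reinforced = P0[u][v] * score[v] / norm if norm > 0 else prior
                new[v] += score[u] * (mu * prior + (1 - mu) * reinforced)
        score = new
    return score


# Toy star graph (assumption): hub 0 connected to leaves 1, 2, 3, with self-loops.
P0 = [
    [0.25, 0.25, 0.25, 0.25],
    [0.75, 0.25, 0.0, 0.0],
    [0.75, 0.0, 0.25, 0.0],
    [0.75, 0.0, 0.0, 0.25],
]
scores = divrank(P0)
```

The scores remain a probability distribution at every iteration because each reinforced row is renormalized before it is used.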
CoRank: Figure
Actual Steps
• Step 1
• Step 2
Walk from the author
Walk from the tweet
Ensuring convergence
Co-Ranking Algorithm
• Coupling parameter λ
• If λ=0, no coupling between Tweet graph
and Author graph
• In the experiments, λ = 0.6
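The role of λ can be illustrated with a simple linear coupling of the two walks. The update below is an assumed form chosen for illustration: it mixes the walk inside each graph with the walk arriving from the other graph, and it leaves out the jump and diversity terms. The function names and toy matrices are hypothetical.

```python
def step(P, x):
    """One random-walk step: returns P^T x for a row-stochastic matrix P."""
    rows, cols = len(P), len(P[0])
    return [sum(P[i][j] * x[i] for i in range(rows)) for j in range(cols)]


def co_rank(M, U, MU, UM, lam=0.6, iters=200):
    """Couple tweet scores m and author scores u with strength lam (assumed update)."""
    m = [1.0 / len(M)] * len(M)
    u = [1.0 / len(U)] * len(U)
    for _ in range(iters):
        m_walk, u_walk = step(M, m), step(U, u)       # walks inside each graph
        m_from_u, u_from_m = step(UM, u), step(MU, m)  # walks across the graphs
        m = [(1 - lam) * a + lam * b for a, b in zip(m_walk, m_from_u)]
        u = [(1 - lam) * a + lam * b for a, b in zip(u_walk, u_from_m)]
    return m, u


# Toy data (assumption): tweets 0 and 1 are written by author 0, tweet 2 by author 1.
M = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]  # tweet graph
U = [[0.0, 1.0], [1.0, 0.0]]                              # author graph
MU = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]                 # tweet -> author
UM = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]                   # author -> tweet

m, u = co_rank(M, U, MU, UM)                  # coupled ranking, lam = 0.6
m0, u0 = co_rank(M, U, MU, UM, lam=0.0)       # lam = 0: the two graphs are ranked independently
```

With lam = 0.0 the cross-graph terms vanish, so the tweet scores depend only on the Tweet graph, which matches the "no coupling" case.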
Transition Matrix in Author Graph
• It is defined as
Transition Matrix in Tweet Graph
• Tweet Graph is defined as
• m_i: a term vector weighted by tf-idf
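A minimal sketch of how tf-idf term vectors could be turned into a transition matrix, assuming that edge weights are cosine similarities between tweets and that each row is normalized to sum to 1. The similarity measure, the function names, and the toy tweets are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized tweet to a {term: tf * idf} dictionary."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    return [
        {term: count * math.log(n / df[term]) for term, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dictionaries."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def tweet_transition_matrix(docs):
    """Row-normalized cosine similarities; no self-transitions."""
    vecs = tfidf_vectors(docs)
    n = len(vecs)
    M = []
    for i in range(n):
        sims = [cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
        total = sum(sims)
        # a tweet similar to nothing falls back to a uniform row
        M.append([s / total if total > 0 else 1.0 / n for s in sims])
    return M


# Toy tokenized tweets (assumption).
docs = [
    ["graph", "ranking", "tweets"],
    ["graph", "ranking", "authors"],
    ["coffee", "morning"],
    ["coffee", "tweets"],
]
M = tweet_transition_matrix(docs)
```

Tweets 0 and 2 share no terms, so the walk never moves directly between them, while tweet 0 is most likely to move to tweet 1, with which it shares two terms.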
Transition Matrix in Tweet-Author Graph
• MU:
• UM:
• : tweet m_i is authored by u_j
Data Set
• 9,449,542 users
– Tracing the edges of 23 users’ followers and
followees until no new user is added
• 3/25/2011 to 5/30/2011
• 364,287,744 tweets
Evaluation
• Automatic evaluation
– Gold standard: whether a tweet is retweeted or not
• Human judgement
– 23 users
– Whether they would retweet the tweet or not
– Calculating the mean
Baselines
• Randomly ranked (Random)
• Longer tweets ranked higher (Length)
• Many retweets ranked higher (RTnum)
• RankSVM algorithm (RSVM) [Duan et al.
2010]
• Decision Tree Classifier (DTC) [Uysal and
Croft 2011]
• Weighted Linear Combination (WLC)
[Huang et al. 2011]
Criteria
• Normalized Discounted Cumulative Gain
• Mean Average Precision
Normalized Discounted Cumulative Gain
• Highly relevant documents are more valuable
• The lower the ranked position of a relevant document, the less valuable it is for the user
• Discount: gradually reduces the document score
• Normalization parameter: obtained from the ideal ranking
Normalized Discounted
Cumulative Gain
Rank Tweet
1 A
2 B
3 C
4 D
5 E
6 F
When A and F are both retweeted, F is penalized because it is ranked lower.
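The A-to-F example can be worked through in code. This sketch assumes the common gain (2^rel - 1) and a logarithmic position discount, for a single user; the function names are illustrative, and averaging over users is left out.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain: gains shrink as the rank position grows."""
    rels = relevances[:k] if k is not None else relevances
    return sum(
        (2 ** rel - 1) / math.log2(1 + i)
        for i, rel in enumerate(rels, start=1)
    )


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (best possible) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Ranking A..F where only A (rank 1) and F (rank 6) are retweeted.
a_and_f = [1, 0, 0, 0, 0, 1]
# Ideal case: the two retweeted tweets occupy ranks 1 and 2.
ideal_order = [1, 1, 0, 0, 0, 0]

score_a_and_f = ndcg(a_and_f)
score_ideal = ndcg(ideal_order)
```

Here `score_ideal` is 1.0, while `score_a_and_f` is lower because the gain of F is discounted at rank 6.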
Mean Average Precision
• Average of the precision over the top k documents
• Terms of the formula: the number of reposted tweets, the precision at the i-th tweet, and whether the tweet is retweeted or not
Mean Average Precision
Rank Tweet
1 A
2 B
3 C
4 D
5 E
6 F
If F is retweeted, precision increases. If not, precision decreases.
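A minimal sketch of the computation, using the terms named on the slide: the precision at the i-th tweet, an indicator of whether that tweet is retweeted, and the number of reposted tweets as the normalizer. The function names are illustrative.

```python
def precision_at(retweeted, i):
    """Fraction of the top-i tweets that were retweeted."""
    return sum(retweeted[:i]) / i


def average_precision(retweeted):
    """Mean of precision_at(i) over the positions i that hold a retweeted tweet."""
    n_reposted = sum(retweeted)
    if n_reposted == 0:
        return 0.0
    return sum(
        precision_at(retweeted, i)
        for i, hit in enumerate(retweeted, start=1)
        if hit
    ) / n_reposted


def mean_average_precision(rankings):
    """Average of the per-user average precision."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Ranking A..F: A is retweeted; compare F retweeted against F not retweeted.
f_retweeted = [1, 0, 0, 0, 0, 1]
f_not_retweeted = [1, 0, 0, 0, 0, 0]

p6_hit = precision_at(f_retweeted, 6)       # 2 of the top 6 are retweeted
p6_miss = precision_at(f_not_retweeted, 6)  # 1 of the top 6 is retweeted
ap = average_precision(f_retweeted)
```

The precision at rank 6 is 2/6 when F is retweeted and 1/6 when it is not, which is the effect described on the slide.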
Results
• Automatic Evaluation
• Manual Evaluation (up to the top 5 ranked tweets)
Evaluation of Components
• Automatic Evaluation
• Manual Evaluation
Conclusion
• Relative improvement of 18.3% in DCG and 7.8% in MAP over the best baseline
• The improvement comes from using both the tweets and their authors
• Succeeded in recommending interesting information that lies outside the user’s followers
• Future work: include credibility and recency