Posted on 19-Dec-2015
Recsplorer: Recommendation Algorithms Based on Precedence Mining
Aditya Parameswaran, Stanford University
(Joint work with G. Koutrika, B. Bercovitz & H. Garcia-Molina)
Applications (Far too many!)
What's New?
- Collaborative filtering suffers from a lack of "similar people"
- Pattern extraction has existed for ~10 years, but has not been used for recommendations. Challenge: aggregation & sparsity
- Sets, not sequences
- Won't need ratings!
Motivating Example

User        q1    q2    q3    q4
u1          A:5   B:5   D:5   -
u2          A:1   E:2   D:4   F:3
u3          G:4   H:2   E:3   F:3
u4          B:2   G:4   H:4   E:4
u (target)  A:5   G:4   E:4   -

Collaborative filtering: u3 and u4 look similar to the target user u (they share G:4 and E:3/E:4), so CF recommends H, averaging u3's H:2 and u4's H:4 into a predicted rating of H:3.

Drawbacks:
- Ignores potentially useful information
- Exploits patterns only among similar users
- Sparsity of ratings leads to few recommendations
Motivating Example (contd.)

(Same user table as above.)

Precedence mining instead looks across all users: A precedes D (u1: A:5 then D:5; u2: A:1 then D:4), E precedes F (u2: E:2 then F:3; u3: E:3 then F:3), and G precedes H (u3 and u4). Since the target user has taken A, G, and E, we can recommend D, F, and H.

- Mine a larger portion of user histories
- Exploit patterns across all users
- More and better recommendations
- Captures user preferences, logical orders, and interest evolution

How to assign scores?
Goals
- Quality of recommendations:
  - Coverage (not enough on its own!)
  - Goodness
  - Unexpectedness, Predictability (not covered in this talk)
- Efficiency
Precedence Model

A prediction problem using conditional probabilities: given A, what is the probability that X will follow, P[X | A]?

Using the raw P[X | A] is incorrect: a user's history may already contain X before A. E.g., with five user histories where u1 contains A then X, u2 contains X then A, u3 contains only A, and u4, u5 contain neither, P[X | A] = 1/3, but P[X | A with no X preceding] = 1/2. We write this corrected quantity as P[X | AX̄].
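To make the distinction concrete, here is a minimal Python sketch. The function names and the toy histories are illustrative (mine, not from the talk); the probabilities are read as simple per-user frequencies, which matches the 1/3 vs. 1/2 example above.

```python
from typing import List

def p_x_after_a(histories: List[List[str]], a: str, x: str) -> float:
    """Naive P[X | A]: among users whose history contains a,
    the fraction in which x appears somewhere after (the first) a."""
    has_a = [h for h in histories if a in h]
    if not has_a:
        return 0.0
    follows = sum(1 for h in has_a if x in h[h.index(a) + 1:])
    return follows / len(has_a)

def p_x_after_a_no_prior_x(histories: List[List[str]], a: str, x: str) -> float:
    """Corrected P[X | A, no X preceding]: restrict the denominator to
    users in whose history no x occurs before the first a."""
    eligible = [h for h in histories if a in h and x not in h[:h.index(a)]]
    if not eligible:
        return 0.0
    follows = sum(1 for h in eligible if x in h[h.index(a) + 1:])
    return follows / len(eligible)

# Five toy histories: A-then-X, X-then-A, A alone, and two empty ones
histories = [["A", "X"], ["X", "A"], ["A"], [], []]
print(p_x_after_a(histories, "A", "X"))             # 1/3
print(p_x_after_a_no_prior_x(histories, "A", "X"))  # 1/2
```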
Algorithm 1: Single Item Max-Confidence

Given the current user's history U = {D1, D2, … Dm} and a candidate item X:
- Consider only items Di with support sup(Di, X) ≥ θ
- score(X) = max_i P[X | DiX̄]
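A sketch of the max-confidence scoring in Python. This is my reading of the slide, not the paper's exact estimator: support counts users in whose history Di appears before X, and the confidence denominator counts users containing Di with no X before it. The toy histories echo the motivating example (ratings omitted).

```python
from typing import List

def max_confidence_score(histories: List[List[str]],
                         user_history: List[str],
                         x: str, theta: int) -> float:
    """Algorithm 1 sketch: score(X) = max over Di in the user's history
    of P[X | Di with no X preceding], keeping only pairs whose
    support sup(Di, X) meets the threshold theta."""
    best = 0.0
    for d in user_history:
        # sup(d, x): users in whose history d appears before x
        sup = sum(1 for h in histories
                  if d in h and x in h[h.index(d) + 1:])
        if sup < theta:
            continue  # prune low-support pairs
        # users containing d with no x before it ("no X preceding")
        denom = sum(1 for h in histories
                    if d in h and x not in h[:h.index(d)])
        if denom:
            best = max(best, sup / denom)
    return best

# Toy histories modeled on the motivating example
histories = [["A", "B", "D"], ["A", "E", "D", "F"],
             ["G", "H", "E", "F"], ["B", "G", "H", "E"]]
print(max_confidence_score(histories, ["A", "G", "E"], "D", theta=2))  # 1.0
```

With θ = 2, D scores 1.0 (A precedes D for both users who took A), while F scores 2/3 via E.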
Algorithm 2: Joint Probabilities

Given the current user's history U = {D1, D2, … Dm} and a candidate item X:
score(X) = P[X | UX̄]
Algorithm 2: Joint Probabilities (contd.)

score(X) = P[X | UX̄], with the current user's history U = {D1, D2, … Dm}

Expanding and approximating (treating the DiX̄ as conditionally independent given X):
score(X) = P[X | D1X̄ D2X̄ … DmX̄] ≈ P[X] × Π_{Di in U} P[DiX̄ | X]
Algorithm 3: Hybrid

As in Joint Probabilities, but the product ranges only over the top items Di in U:
score(X) ≈ P[X] × Π_{top Di in U} P[DiX̄ | X]
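Algorithms 2 and 3 can be sketched together, since Hybrid only restricts the product to the strongest factors. This is an illustrative frequency-based estimate (function name and data are mine): P[Di before X | X] is the fraction of users containing X in whose history Di occurs before X.

```python
from math import prod
from typing import List, Optional

def joint_score(histories: List[List[str]], user_history: List[str],
                x: str, top_m: Optional[int] = None) -> float:
    """Sketch of Algorithm 2 (Joint Probabilities):
        score(X) ~= P[X] * prod over Di in U of P[Di before X | X].
    With top_m set, only the top_m largest factors enter the product,
    giving Algorithm 3 (Hybrid)."""
    with_x = [h for h in histories if x in h]
    if not with_x:
        return 0.0
    p_x = len(with_x) / len(histories)
    # Frequency estimate of P[Di appears before X | X in history]
    factors = [sum(1 for h in with_x if d in h[:h.index(x)]) / len(with_x)
               for d in user_history]
    if top_m is not None:
        factors = sorted(factors, reverse=True)[:top_m]
    return p_x * prod(factors)

histories = [["A", "B", "D"], ["A", "E", "D", "F"],
             ["G", "H", "E", "F"], ["B", "G", "H", "E"]]
print(joint_score(histories, ["A", "G", "E"], "F"))           # 0.125
print(joint_score(histories, ["A", "G", "E"], "F", top_m=2))  # 0.25
```

Note how the Hybrid variant scores F higher: dropping the weakest factor counteracts the sparsity penalty that the full product imposes on long histories.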
Evaluation: Methodology

Dataset: 7,500 student transcripts from CourseRank

Methodology: each transcript is split; a portion x is given as input, and the remaining courses r are hidden.

Metrics:
- precision@k = fraction of the top-k recommendations that appear in r
- coverage@k = number of users for whom an algorithm generates at least k recommendations

System: CourseRank (an educational social site for Stanford)
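The two metrics are easy to state in code. A minimal sketch (my own helper names and toy data); here precision@k divides by k, assuming at least k recommendations were produced.

```python
from typing import List, Sequence, Set

def precision_at_k(recommended: Sequence[str], hidden: Set[str], k: int) -> float:
    """precision@k: fraction of the top-k recommendations found in the hidden set r."""
    return sum(1 for item in recommended[:k] if item in hidden) / k

def coverage_at_k(all_recommendations: List[Sequence[str]], k: int) -> int:
    """coverage@k: number of users for whom at least k recommendations were produced."""
    return sum(1 for recs in all_recommendations if len(recs) >= k)

# Hypothetical toy data: one user's ranked recommendations vs. their hidden courses
print(precision_at_k(["D", "F", "H"], {"D", "H"}, k=2))  # 0.5
print(coverage_at_k([["D", "F", "H"], ["D"], []], k=2))  # 1
```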
Evaluation: Algorithms
- Popularity
- Reranked
- Hybrid
- Joint Probabilities
- Single Item Max-Confidence
- Collaborative Filtering
- Joint Probabilities with Support (not covered in this talk)
Evaluation: support θ = 30, I = 3 samples, x = 14 [results chart not transcribed]

Evaluation: support θ = 30, I = 3 samples, k = 2 recommendations [results chart not transcribed]

Evaluation: support θ = 30, I = 3 samples, k = 10 recommendations [results chart not transcribed]
Summary of Contributions
- Finer-grained precedence model to leverage collective wisdom
- Higher coverage and precision@k
- More in the paper: other algorithms, goodness / unexpectedness, optimal thresholds, user study