Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Zeno Gantner, Lucas Drumond, Christoph Freudenthaler, Lars Schmidt-Thieme
University of Hildesheim
21 August 2011

Upload: zeno-gantner

Post on 21-May-2015


DESCRIPTION

The slide set describing our approach to the KDD Cup 2011, presented at the KDD Cup workshop in San Diego, California.

TRANSCRIPT

Page 1: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Zeno Gantner, Lucas Drumond, Christoph Freudenthaler, Lars Schmidt-Thieme

University of Hildesheim

21 August 2011

Page 2: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Questions (and Answers)

Who? Which? How? Why? Where? What?

Page 3: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Which problem to solve?

Rating Prediction (Track 1) vs. Item Prediction (Track 2)

Page 4: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

How did we tackle the problem?

Bayesian Personalized Ranking:

$$\mathrm{BPR}(D_S) = \arg\max_{\Theta} \sum_{(u,i,j) \in D_S} \ln \sigma\bigl(s_{u,i}(\Theta) - s_{u,j}(\Theta)\bigr) - \lambda \|\Theta\|^2$$

• $D_S$ contains all pairs of positive and negative items for each user,
• $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the logistic function,
• $\Theta$ represents the model parameters,
• $s_{u,i}(\Theta)$ is the predicted score for user $u$ and item $i$, and
• $\lambda \|\Theta\|^2$ is a regularization term to prevent overfitting.

Interpretation 1: reduce ranking to pairwise classification [Balcan et al. 2008]

Interpretation 2: optimize for smoothed area under the ROC curve (AUC)

Model: matrix factorization

Learning: stochastic gradient ascent

[Rendle et al., UAI 2009]
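The criterion above can be illustrated with a minimal Python sketch of a single stochastic gradient ascent step for a matrix factorization model. It is not the authors' implementation: the score s_{u,i} = <p_u, q_i>, the names `P`, `Q`, `bpr_step`, and the learning rate and regularization values are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def bpr_step(P, Q, u, i, j, lr=0.05, reg=0.01):
    """One gradient ascent step on ln sigma(s_ui - s_uj) - reg * ||Theta||^2
    for a single triple (u, i, j), with s_ui = P[u] . Q[i] (assumed model).
    The factor 2 from the squared norm is absorbed into `reg`."""
    p_u, q_i, q_j = P[u].copy(), Q[i].copy(), Q[j].copy()
    x_uij = p_u @ (q_i - q_j)          # score difference s_ui - s_uj
    g = 1.0 - sigmoid(x_uij)           # d/dx ln sigma(x) = 1 - sigma(x)
    P[u] += lr * (g * (q_i - q_j) - reg * p_u)
    Q[i] += lr * (g * p_u - reg * q_i)
    Q[j] += lr * (-g * p_u - reg * q_j)
    return x_uij

# Toy usage: 3 users, 5 items, 4 factors (hypothetical sizes).
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(3, 4))
Q = rng.normal(scale=0.1, size=(5, 4))
for _ in range(200):
    bpr_step(P, Q, u=0, i=1, j=2)
```

Repeating the step on the same triple pushes the score of the positive item i above that of the negative item j for user u, which is the pairwise classification view named in interpretation 1.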

Page 5: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

How did we tackle the problem?

$$\mathrm{BPR}(D_S) = \arg\max_{\Theta} \sum_{(u,i,j) \in D_S} \ln \sigma(s_{u,i} - s_{u,j}) - \lambda \|\Theta\|^2$$

Problem: all negative items $j$ are given the same weight.

Solution: adapt the weights in the optimization criterion (and the sampling probabilities in the learning algorithm):

$$\mathrm{WBPR}(D_S) = \arg\max_{\Theta} \sum_{(u,i,j) \in D_S} w_u w_i w_j \ln \sigma(s_{u,i} - s_{u,j}) - \lambda \|\Theta\|^2,$$

where

$$w_j = \sum_{u \in U} \delta(j \in I_u^+). \tag{1}$$
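As a rough illustration of equation (1) and of the adapted sampling probabilities, the following Python sketch counts, for each item j, the users that have j among their positive items, and then draws negative items with probability proportional to that count. Only w_j is defined on the slide, so the sketch covers only w_j; the function names and the dictionary-of-sets data layout are assumptions, not the authors' code.

```python
import numpy as np

def item_weights(user_pos_items, n_items):
    """w_j = number of users u whose positive item set I_u^+ contains j."""
    w = np.zeros(n_items)
    for items in user_pos_items.values():
        for j in set(items):
            w[j] += 1
    return w

def sample_negative(u, user_pos_items, w, rng):
    """Draw a negative item for user u with probability proportional to w_j,
    excluding the user's own positive items (assumed sampling scheme)."""
    p = w.copy()
    p[list(user_pos_items[u])] = 0.0
    total = p.sum()
    if total == 0:
        raise ValueError("no candidate negative item with non-zero weight")
    return int(rng.choice(len(w), p=p / total))

# Toy usage: three users, four items (hypothetical data).
pos = {0: {0, 1}, 1: {1, 2}, 2: {1}}
w = item_weights(pos, n_items=4)
rng = np.random.default_rng(0)
j = sample_negative(0, pos, w, rng)
```

Sampling triples this way, instead of multiplying each term by w_j, is one common way to realize a weighted criterion in a stochastic gradient learner.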


Page 7: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Why did we not win? But also: Why did we perform better than others?

Why did we perform better than others?

• straightforward model that matches the prediction task pretty well
• scalability (e.g. k = 480 factors per user/item)
• integration of rating information (see paper)
• ensembles (see paper)

Why did we not win?

• ... two possible answers ...

Page 8: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Why did we not win?

Taxonomy

Page 9: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Why did we not win?

Learn the right contrast

[Figure: three candidate contrasts over the item classes "rating >= 80", "rating < 80", and "no rating"]

• liked? (rating >= 80, rating < 80, no rating)
• rated? (rating >= 80, rating < 80, no rating)
• rating >= 80 vs. no rating?
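One possible reading of the three contrasts, sketched in Python. The slide only shows a diagram, so the partitions below are assumptions: "liked" is taken to mean rating >= 80 against everything else, "rated" to mean any rating against no rating, and the third contrast to mean rating >= 80 against no rating. The function name `split_items` and the contrast labels are hypothetical.

```python
def split_items(ratings, all_items, contrast):
    """Split one user's items into (positive, negative) sets for a contrast.

    ratings:   dict mapping item -> rating for the items the user rated
    all_items: iterable of all item ids
    contrast:  "liked", "rated", or "liked_vs_unrated" (hypothetical labels)
    """
    rated = set(ratings)
    liked = {i for i, r in ratings.items() if r >= 80}
    unrated = set(all_items) - rated
    if contrast == "liked":
        # rating >= 80 vs. (rating < 80 or no rating)
        return liked, (rated - liked) | unrated
    if contrast == "rated":
        # any rating vs. no rating
        return rated, unrated
    if contrast == "liked_vs_unrated":
        # rating >= 80 vs. no rating; items rated below 80 are left out
        return liked, unrated
    raise ValueError(f"unknown contrast: {contrast}")

# Toy usage: a user who rated item 1 with 90 and item 2 with 50.
positives, negatives = split_items({1: 90, 2: 50}, range(5), "rated")
```

Each choice changes which (u, i, j) triples enter D_S, and therefore what the pairwise classifier learns to separate.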


Page 13: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Where next?

• classification → ranking → pairwise classification
• pairwise classification: try other losses, e.g. soft margin (hinge) loss
• Bayesian² Personalized Ranking
• beyond KDD Cup: consider different sampling schemes ...
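To make the "other losses" idea concrete, here is a small Python sketch that compares the per-pair logistic loss implied by BPR with a pairwise hinge loss, both as functions of the score difference x = s_{u,i} - s_{u,j}. The margin of 1 and the function names are assumptions for illustration; the slides do not specify a particular hinge formulation.

```python
import math

def logistic_pair_loss(x):
    """-ln sigma(x) = ln(1 + exp(-x)), computed in a numerically stable way."""
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))

def hinge_pair_loss(x, margin=1.0):
    """Soft margin (hinge) loss: zero once the positive item leads by `margin`."""
    return max(0.0, margin - x)

# Compare both losses on a few score differences.
for x in (-2.0, 0.0, 0.5, 2.0):
    print(x, round(logistic_pair_loss(x), 4), hinge_pair_loss(x))
```

The logistic loss keeps a small non-zero gradient for every pair, while the hinge loss ignores pairs that are already ranked correctly with enough margin.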

Page 14: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Summary

• Use matrix factorization optimized for Bayesian Personalized Ranking (BPR) to solve the item ranking problem.
• BPR reduces ranking (in this case: binary variables) to pairwise classification.
• Extend BPR to use a different sampling scheme: Weighted BPR (WBPR).
• Open question: Learn a different contrast?
• Details can be found in the paper.
• Code: http://ismll.de/mymedialite/examples/kddcup2011.html

Advertisement: Contribute to http://recsyswiki.com!

Page 15: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Questions

Page 16: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Acknowledgements

Thank you

• The organizers, for hosting a great competition.
• The participants, for sharing their insights.

Funding

• German Research Council (Deutsche Forschungsgemeinschaft, DFG) project Multirelational Factorization Models.
• Development of the MyMediaLite software was co-funded by the European Commission FP7 project MyMedia under the grant agreement no. 215006.

Picture credits

• by Michael Sauers, under Creative Commons by-nc-sa 2.0: http://www.flickr.com/photos/travelinlibrarian/223839049/
• by Rob Starling, under Creative Commons by-sa 2.0: http://en.wikipedia.org/wiki/File:Air_New_Zealand_B747-400_ZK-SUI_at_LHR.jpg

Page 17: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Numbers?

contrast   k     error in %
“liked”    320   5.52
“liked”    480   5.08
“rated”    320   5.15
“rated”    480   4.87

Estimated error on validation split (not leaderboard).

Page 18: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Advertisement

MyMediaLite: Recommender System Algorithm Library

functionality

• rating prediction
• item recommendation from implicit feedback
• group recommendation

target groups

• researchers, educators and students
• application developers

development

• written in C#, runs on Mono
• GNU General Public License (GPL)
• regular releases (ca. 1 per month)

• simple
• free
• scalable
• well-documented
• well-tested
• choice

http://ismll.de/mymedialite

Page 19: Bayesian Personalized Ranking for Non-Uniformly Sampled Items

Advertisement

RecSys Wiki is looking for contributions

Alan

Zeno

http://recsyswiki.com