Ranking and Diversity in Recommendations - RecSys Stammtisch at SoundCloud, Berlin
DESCRIPTION
Slides from my talk at the RecSys Stammtisch at SoundCloud in Berlin. The presentation is split into two parts: one focusing on ranking and relevance, and one on diversity and how to achieve it using genres. We introduce a novel diversity metric called Binomial Diversity.
TRANSCRIPT
RANKING AND DIVERSITY IN RECOMMENDATIONS Alexandros Karatzoglou @alexk_z
Thanks
Linas Baltrunas Telefonica
Saul Vargas Universidad Autonoma de Madrid
Yue Shi TU Delft
Pablo Castells Universidad Autonoma de Madrid
Telefonica Research in Barcelona
• User Modeling: Recommender Systems
• Data Mining, Machine Learning
• Multimedia Indexing and Analysis
• HCI
• Mobile Computing
• Systems and Networking
• http://www.tid.es
Recommendations in Telefonica
People You May Know@Tuenti
Recommendations in Telefonica
Firefox OS Marketplace
Collaborative Filtering
f_{ij} = \langle U_i, M_j \rangle

f_{ij} = \langle U_i, M_j \rangle + \sum_{k \in F_i} \frac{\alpha_{ik}}{|F_i|} \langle U_k, M_j \rangle
Tensor Factorization: HOSVD, CP-Decomposition

F_{ijk} = S \times_U U_i \times_M M_j \times_C C_k

F_{ijk} = \langle U_i, M_j, C_k \rangle

Karatzoglou et al. @ RecSys 2010
Publications in Ranking
• CIKM 2013: GAPfm: Optimal Top-N Recommendations for Graded Relevance Domains
• RecSys 2013: xCLiMF: Optimizing Expected Reciprocal Rank for Data with Multiple Levels of Relevance
• RecSys 2012: CLiMF: Learning to Maximize Reciprocal Rank with Collaborative Less-is-More Filtering (* Best Paper Award)
• SIGIR 2012: TFMAP: Optimizing MAP for Top-N Context-aware Recommendation
• Machine Learning Journal, 2008: Improving Maximum Margin Matrix Factorization (* Best Machine Learning Paper Award at ECML PKDD 2008)
• RecSys 2010: List-wise Learning to Rank with Matrix Factorization for Collaborative Filtering
• NIPS 2007: CoFiRank - Maximum Margin Matrix Factorization for Collaborative Ranking
Recommendations are ranked lists
Popular Ranking Methods
• In order to generate the ranked item list, we need some relative utility score for each item
• Popularity is the obvious baseline
• Score could depend on the user (personalized)
• Score could also depend on the other items in the list (list-wise)
• One popular way to rank the items in RS is to sort the items according to the rating prediction
• Works for domains with ratings
• Wastes modeling power on the irrelevant items
Graphical Notation
[Figure: graphical notation for relevant and irrelevant items]
Ranking using latent representation
• If user = [-100, -100] (a 2d latent factor)
• We get the corresponding ranking
Matrix Factorization (for ranking)
• Randomly initialize item vectors
• Randomly initialize user vectors
• While not converged:
• Compute rating prediction error
• Update user factors
• Update item factors
• Let's say the user is [-100, -100]
• Compute the squared error:

(5 - \langle [-100, -100], [0.180, 0.19] \rangle)^2 = 1764

• Update the user and item factors in the direction where the error is reduced (according to the gradient of the loss)
8 items with ratings and random factors
Learning: Stochastic Gradient Descent with Square Error Loss
Square Loss User: [3, 1], RMSE=6.7
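The training loop sketched above can be written in a few lines of plain Python. This is a minimal sketch with toy data, not the talk's exact setup; the function name, learning rate, factor size and epoch count are all illustrative choices:

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.0, epochs=500):
    """SGD matrix factorization on squared error, following the slide's loop."""
    random.seed(0)
    U = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(U[u][f] * V[i][f] for f in range(k))  # rating prediction error
            for f in range(k):
                uf = U[u][f]
                U[u][f] += lr * (err * V[i][f] - reg * uf)  # update user factors
                V[i][f] += lr * (err * uf - reg * V[i][f])  # update item factors
    return U, V

# 8 items with ratings for a single user, as on the slide
ratings = [(0, i, r) for i, r in enumerate([5, 4, 5, 1, 2, 1, 4, 1])]
U, V = train_mf(ratings, n_users=1, n_items=8)
ranked = sorted(range(8), key=lambda i: -sum(U[0][f] * V[i][f] for f in range(2)))
```

Sorting items by the learned dot product gives the ranking: the 5-rated items end up at the top and the 1-rated ones at the bottom.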
Learning to Rank for Top-k RecSys
• Usually we care about accurate ranking and not rating prediction
• Squared error loss optimizes to accurately predict 1s and 5s
• RS should get the top items right -> Ranking problem
• Why not learn how to rank directly?
• Learning to Rank methods provide up to 30% performance improvements in off-line evaluations
• It is possible, but a more complex task
Example: average precision (AP)
AP = \frac{1}{|S|} \sum_{k=1}^{|S|} P(k)

• AP: we compute the precision at each relevant position and average them

\frac{P@1 + P@2 + P@4}{3} = \frac{1/1 + 2/2 + 3/4}{3} \approx 0.92
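That worked example can be reproduced with a short sketch (the function name and toy item ids are mine):

```python
def average_precision(ranked, relevant):
    """AP: average of precision@k over the ranks k where a relevant item appears."""
    hits, total = 0, 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / k  # precision at this relevant position
    return total / len(relevant) if relevant else 0.0

# relevant items at ranks 1, 2 and 4, as in the slide's example
ap = average_precision(["a", "b", "x", "c"], {"a", "b", "c"})  # (1/1 + 2/2 + 3/4) / 3
```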
Why is it hard? Non-smoothness. Example: AP
[Plot: AP vs RMSE as the user factors move from u = [-20, -20] to u = [20, 20]]
The Non-smoothness of Average Precision
AP_m = \frac{1}{\sum_{i=1}^{N} y_{mi}} \sum_{i=1}^{N} \frac{y_{mi}}{r_{mi}} \sum_{j=1}^{N} y_{mj} \, I(r_{mj} \le r_{mi})

AP = \frac{1}{|S|} \sum_{k=1}^{|S|} P(k)

where:
• y_{mi}: 1 if item i is relevant for user m and 0 otherwise
• I(\cdot): indicator function (1 if its argument is true, 0 otherwise)
• r_{mi}: rank of item i for user m
How can we get a smooth AP?
• We replace the non-smooth parts of MAP with a smooth approximation

g(x) = 1 / (1 + e^{-x})

\frac{1}{r_{mi}} \approx g(f_{mi}) = g(\langle U_m, V_i \rangle)

How can we get a smooth MAP?
• We replace the non-smooth parts of MAP with a smooth approximation

I(r_{mj} \le r_{mi}) \approx g(f_{mj} - f_{mi}) = g(\langle U_m, V_j - V_i \rangle)
Smooth version of MAP
[Plot: smoothed MAP as the user factors move from u = [-20, -20] to u = [20, 20]]
Sometimes the approximation is not very good…
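The two substitutions above can be combined into a differentiable surrogate of AP. A minimal sketch (function names are mine; only the sigmoid replacements from the slides are assumed):

```python
import math

def g(x):
    """Logistic sigmoid g(x) = 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def smooth_ap(scores, relevant):
    """Smoothed AP: 1/r_mi -> g(f_mi) and I(r_mj <= r_mi) -> g(f_mj - f_mi)."""
    rel = list(relevant)
    total = 0.0
    for i in rel:
        total += g(scores[i]) * sum(g(scores[j] - scores[i]) for j in rel)
    return total / len(rel)

# pushing the relevant item's score up increases the smooth objective
low = smooth_ap([-5.0, 5.0], {0})
high = smooth_ap([5.0, -5.0], {0})
```

Because every term is a sigmoid of the model scores f = ⟨U, V⟩, the objective is smooth in the factors and can be optimized with gradient methods.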
Ranking Inconsistencies • Achieving a perfect ranking for all users is not possible
• Two Sources of Inconsistencies:
• 1) Factor Models (all models) have limited expressive power and cannot learn the perfect ranking for all users
• 2) Ranking function approximations are inconsistent, e.g. A > B and B > C but C > A
Summary on Ranking 101
Area Under the ROC Curve (AUC)
AUC := \frac{1}{|S^+| |S^-|} \sum_{i \in S^+} \sum_{j \in S^-} I(R_i < R_j)
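A direct reading of the formula as code, counting relevant/irrelevant pairs in the right order (names and toy ranks are mine):

```python
def auc(ranks_pos, ranks_neg):
    """AUC: fraction of (relevant, irrelevant) pairs where the relevant item has the smaller rank R."""
    pairs = [(ri, rj) for ri in ranks_pos for rj in ranks_neg]
    return sum(1 for ri, rj in pairs if ri < rj) / len(pairs)

perfect = auc([1, 2], [3, 4])  # all relevant items above all irrelevant ones
half = auc([1, 4], [2, 3])     # 2 of the 4 pairs are in the right order
```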
Reciprocal Rank (RR)
RR := \frac{1}{R_i}
Average Precision (AP)
AP = \frac{1}{|S|} \sum_{k=1}^{|S|} P(k)
AP vs RR
DCG = \sum_i \frac{2^{score(i)} - 1}{\log_2(i + 2)}
Normalized Discounted Cumulative Gain (nDCG)
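nDCG divides DCG by the DCG of the ideal ordering, so a perfectly sorted list scores 1.0. A sketch using the gain and discount from the formula above (function names are mine):

```python
import math

def dcg(gains):
    """DCG with gain 2^score - 1 and discount log2(i + 2) for zero-based position i."""
    return sum((2 ** s - 1) / math.log2(i + 2) for i, s in enumerate(gains))

def ndcg(gains):
    """Normalize by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```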
Relevance solved! Is that all?
• Ranking “solves” the relevance problem
• Can we be happy with the results?
Relevance solved! Is that all?
• Coverage
• Diversity
• Novelty
• Serendipity
Diversity in Recommendations
• Diversity using Genres
• Movies, Music, Books
• Diversity should fulfill:
• Coverage • Redundancy • Size Awareness
Diversity methods for RecSys
• Topic List Diversification, or
• Maximal Marginal Relevance (MMR)

f_{MMR}(i; S) = (1 - \lambda) \, rel(i) + \lambda \min_{j \in S} dist(i, j)
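A greedy MMR re-ranker following the formula above; the items, relevance scores and 0/1 genre distance are illustrative, not from the talk:

```python
def mmr_rerank(candidates, rel, dist, lam=0.5, k=3):
    """Greedily pick the item maximizing (1 - lam) * rel(i) + lam * min_{j in S} dist(i, j)."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def score(i):
            novelty = min(dist(i, j) for j in selected) if selected else 0.0
            return (1 - lam) * rel[i] + lam * novelty
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

genres = {0: "action", 1: "action", 2: "western"}
rel = {0: 1.0, 1: 0.9, 2: 0.5}
order = mmr_rerank([0, 1, 2], rel, lambda i, j: 0.0 if genres[i] == genres[j] else 1.0)
# the western jumps ahead of the second, redundant action movie
```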
Diversity methods for RecSys
• Intent-Aware IR metrics
ERR\text{-}IA = \sum_{s} p(s) \, ERR_s
Example
• Action, Comedy, Sci-Fi, Western
• Action, Thriller, Sci-Fi, Western
• Adventure, Western
Genres
[Figure: genre overlap counts, with totals Action (517), Comedy (1267), Drama (1711), Romance (583), Thriller (600)]
Genres and Popularity
Binomial Diversity
• We base a new Diversity Metric on the Binomial Distribution
P(X = k) = \binom{N}{k} p^k (1 - p)^{N - k}
User Genre Relevance
• p''_g: fraction of items of genre g the user interacted with
• p'_g: global fraction of items of genre g
• p_g: a mix of the two

p''_g = \frac{k^u_g}{|I_u|}

p'_g = \frac{\sum_u k^u_g}{\sum_u |I_u|}

p_g = (1 - \alpha) \, p'_g + \alpha \, p''_g
Coverage
Coverage(R) = \prod_{g \notin G(R)} P(X_g = 0)^{1/|G|}

• Product of the probabilities that the genres not represented in the list would not be picked at random
Non-Redundancy
P(X_g \ge k \mid X_g > 0) = 1 - \sum_{l=1}^{k-1} P(X_g = l \mid X_g > 0)

NonRed(R) = \prod_{g \in G(R)} P(X_g \ge k^R_g \mid X_g > 0)^{1/|G(R)|}
Non-Redundancy
Binomial Diversity
BinomDiv(R) = Coverage(R) \cdot NonRed(R)
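Putting coverage and non-redundancy together, a minimal sketch of the metric; the genre probabilities p_g are assumed given, and all names are mine:

```python
from math import comb

def binom_pmf(N, k, p):
    """Binomial P(X = k) = C(N, k) p^k (1 - p)^(N - k)."""
    return comb(N, k) * p ** k * (1 - p) ** (N - k)

def binomial_diversity(item_genres, p, all_genres):
    """BinomDiv(R) = Coverage(R) * NonRed(R), with X_g ~ Binomial(|R|, p_g)."""
    N = len(item_genres)
    counts = {}
    for gs in item_genres:
        for g in gs:
            counts[g] = counts.get(g, 0) + 1
    # Coverage: penalize absent genres that were likely to appear by chance
    coverage = 1.0
    for g in all_genres:
        if g not in counts:
            coverage *= binom_pmf(N, 0, p[g]) ** (1 / len(all_genres))
    # Non-redundancy: P(X_g >= k_g | X_g > 0) for each genre present with k_g items
    nonred = 1.0
    for g, k_g in counts.items():
        p_pos = 1 - binom_pmf(N, 0, p[g])
        tail = 1 - sum(binom_pmf(N, l, p[g]) for l in range(1, k_g)) / p_pos
        nonred *= max(tail, 0.0) ** (1 / len(counts))
    return coverage * nonred

p = {"action": 0.5, "western": 0.5}
diverse = binomial_diversity([{"action"}, {"western"}], p, p.keys())
redundant = binomial_diversity([{"action"}, {"action"}], p, p.keys())
```

A list covering both genres once scores higher than a list that repeats one genre and misses the other, which is exactly the coverage/redundancy trade-off the metric encodes.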
Re-Ranking • Re-Rank based on Relevance and Binomial Diversity
f_{BinomDiv}(i; S) = (1 - \lambda) \, norm_{rel}(rel(i)) + \lambda \, norm_{div}(div(i; S))

norm_X(x) = \frac{x - \mu_X}{\sigma_X}
Example
Thanks! • Questions?