A Study of Smoothing Methods for Relevance-Based Language Modelling of Recommender Systems
Daniel Valcarce, Javier Parapar, Álvaro Barreiro
{daniel.valcarce, javierparapar, barreiro}@udc.es – http://www.irlab.org
Information Retrieval Lab, Computer Science Department, University of A Coruña
Overview
Language Models have traditionally been used in several fields such as speech recognition and document retrieval. Recently, Relevance-Based Language Models have been extended to Collaborative Filtering Recommender Systems [1]. In this setting, a Relevance Model is estimated for each user based on the probabilities of the items. As has been thoroughly studied, smoothing plays a key role in the estimation of a Language Model [2]. Our aim in this work is to study smoothing methods in the context of Collaborative Filtering Recommender Systems.
RM for Recommendation
The IR concepts map onto RecSys as follows: query → target user, document → neighbour, term → item.

RM1: p(i|R_u) ∝ ∑_{v∈V_u} p(v) p(i|v) ∏_{j∈I_u} p(j|v)   (1)

RM2: p(i|R_u) ∝ p(i) ∏_{j∈I_u} ∑_{v∈V_u} (p(i|v) p(v) / p(i)) p(j|v)   (2)

• I_u is the set of items rated by the user u
• V_u is the set of neighbours of the user u
• p(i) and p(v) are considered uniform
• p(i|u) is computed by smoothing the maximum-likelihood estimate p_ml(i|u) = r_{u,i} / ∑_{j∈I_u} r_{u,j}
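As a rough sketch of how the RM2 estimate (Eq. 2) could be computed, the snippet below scores unseen items for one user, using Jelinek-Mercer smoothing for the conditional probabilities; the dict-based ratings representation, function name and λ value are illustrative assumptions, not the authors' implementation:

```python
import math
from collections import defaultdict

def rm2_scores(ratings, user, neighbours, lam=0.5):
    """Score unseen items for `user` with RM2 (Eq. 2).
    `ratings` maps user -> {item: rating}; neighbours is V_u (hypothetical names)."""
    # Background model p(i|C): each item's share of all rating mass.
    item_totals, grand = defaultdict(float), 0.0
    for profile in ratings.values():
        for i, r in profile.items():
            item_totals[i] += r
            grand += r
    p_c = {i: t / grand for i, t in item_totals.items()}

    def p_jm(i, v):
        # Jelinek-Mercer smoothing (Eq. 3): (1 - lam) * p_ml(i|v) + lam * p(i|C)
        prof = ratings[v]
        p_ml = prof.get(i, 0.0) / sum(prof.values())
        return (1 - lam) * p_ml + lam * p_c.get(i, 0.0)

    p_v = 1.0 / len(neighbours)   # uniform p(v)
    p_i = 1.0 / len(item_totals)  # uniform p(i)
    rated = set(ratings[user])    # I_u
    scores = {}
    for i in p_c:
        if i in rated:
            continue  # only rank items the user has not rated
        # Product over j in I_u, computed in log space for stability.
        log_score = math.log(p_i)
        for j in rated:
            s = sum(p_jm(i, v) * p_v / p_i * p_jm(j, v) for v in neighbours)
            if s <= 0.0:
                log_score = float("-inf")
                break
            log_score += math.log(s)
        scores[i] = log_score
    return scores
```

Log-space accumulation avoids underflow when I_u is large, since each factor in the product of Eq. 2 is a small probability.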
Smoothing methods
Smoothing deals with data sparsity and plays a role similar to the IDF, using a background model:

p(i|C) = ∑_{v∈U} r_{v,i} / ∑_{j∈I, v∈U} r_{v,j}
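The background model above can be sketched as a short function; the name `background_model` and the dict-based ratings layout are assumptions for illustration:

```python
from collections import defaultdict

def background_model(ratings):
    """p(i|C): each item's fraction of the total rating mass in the collection.
    `ratings` maps user -> {item: rating} (hypothetical representation)."""
    totals, grand = defaultdict(float), 0.0
    for profile in ratings.values():
        for i, r in profile.items():
            totals[i] += r
            grand += r
    return {i: t / grand for i, t in totals.items()}
```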
Jelinek-Mercer (JM) Linear interpolation. Parameter λ.

p_λ(i|u) = (1 − λ) p_ml(i|u) + λ p(i|C)   (3)
Dirichlet Priors (DP) Bayesian analysis. Parameter µ.

p_µ(i|u) = (r_{u,i} + µ p(i|C)) / (µ + ∑_{j∈I_u} r_{u,j})   (4)
Absolute Discounting (AD) Subtract a constant δ.

p_δ(i|u) = (max(r_{u,i} − δ, 0) + δ |I_u| p(i|C)) / ∑_{j∈I_u} r_{u,j}   (5)
Experiments
[Figure: P@5 curves for RM1 and RM2 with AD, JM and DP; x-axes λ/δ (0–1) and µ (0–1000).]
Precision at 5 of the RM1 and the RM2 algorithms using Absolute Discounting (AD), Jelinek-Mercer (JM) and Dirichlet Priors (DP) smoothing methods for the MovieLens 100k dataset.

[Figure: P@5 surface for RM2 with AD; axes δ (0–1) and #ratings (0–1000).]
Precision at 5 of the RM2 algorithm using AD when varying the smoothing intensity and considering different numbers of ratings in the user profiles for the MovieLens 1M dataset.
Conclusions
• There are no big differences in terms of optimal precision among the studied smoothing techniques.
• Dirichlet Priors and, especially, Jelinek-Mercer suffer a significant decrease in precision when a high amount of smoothing is applied.
• Absolute Discounting behaves almost as a parameter-free smoothing method.
Bibliography
[1] J. Parapar, A. Bellogín, P. Castells, and A. Barreiro. Relevance-based language modelling for recommender systems. IPM, 49(4):966–980, July 2013.
[2] C. Zhai and J. Lafferty. A study of smoothing methods for language models applied to information retrieval. ACM TOIS, 22(2):179–214, Apr. 2004.
ECIR 2015, 37th European Conference on Information Retrieval. 29 March - 2 April, 2015, Vienna, Austria