Poster for the ECIR 2015 short paper:
Daniel Valcarce, Javier Parapar, Alvaro Barreiro: A Study of Smoothing Methods for Relevance-Based Language Modelling of Recommender Systems. ECIR 2015: 346-351
http://dx.doi.org/10.1007/978-3-319-16354-3_38
A Study of Smoothing Methods for Relevance-Based Language Modelling of Recommender Systems
Daniel Valcarce, Javier Parapar, Álvaro Barreiro
{daniel.valcarce, javierparapar, barreiro}@udc.es – http://www.irlab.org
Information Retrieval Lab, Computer Science Department, University of A Coruña
Overview
Language Models have traditionally been used in several fields such as speech recognition and document retrieval. Recently, Relevance-Based Language Models have been extended to Collaborative Filtering Recommender Systems [1]. In this setting, a Relevance Model is estimated for each user based on the probabilities of the items. As has been thoroughly studied, smoothing plays a key role in the estimation of a Language Model [2]. Our aim in this work is to study smoothing methods in the context of Collaborative Filtering Recommender Systems.
RM for Recommendation
IR       →  RecSys
Query    →  Target user
Document →  Neighbour
Term     →  Item
RM1: p(i|R_u) \propto \sum_{v \in V_u} p(v)\, p(i|v) \prod_{j \in I_u} p(j|v)    (1)

RM2: p(i|R_u) \propto p(i) \prod_{j \in I_u} \sum_{v \in V_u} \frac{p(i|v)\, p(v)}{p(i)}\, p(j|v)    (2)
• I_u is the set of items rated by the user u
• V_u is the set of neighbours of the user u
• p(i) and p(v) are considered uniform
• p(i|u) is computed by smoothing the maximum-likelihood estimate p_{ml}(i|u) = r_{u,i} / \sum_{j \in I_u} r_{u,j}
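The RM2 scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: p(v) and p(i) are taken uniform as stated above, and the smoothed neighbour models p(i|v) are hypothetical values supplied by hand.

```python
from math import prod

# Hypothetical smoothed neighbour models p(i|v); values are illustrative only.
p_item_given_v = {
    "v1": {"i1": 0.5, "i2": 0.3, "i3": 0.2},
    "v2": {"i1": 0.2, "i2": 0.6, "i3": 0.2},
}

def rm2_score(i, rated_items, neighbours, p_iv, n_items):
    """RM2 relevance score (Eq. 2) with uniform p(v) and p(i)."""
    p_v = 1.0 / len(neighbours)   # uniform neighbour prior
    p_i = 1.0 / n_items           # uniform item prior
    return p_i * prod(
        sum(p_iv[v].get(i, 0.0) * p_v / p_i * p_iv[v].get(j, 0.0)
            for v in neighbours)
        for j in rated_items
    )

# Rank unseen items for a user whose profile is I_u = {"i2"}.
scores = {i: rm2_score(i, ["i2"], ["v1", "v2"], p_item_given_v, 3)
          for i in ("i1", "i3")}
```

Candidate items would then be ranked by decreasing score; here "i1" outranks "i3" because both neighbours assign it more probability mass.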
Smoothing methods
Smoothing deals with data sparsity and plays a role similar to the IDF by using a background model:

p(i|C) = \frac{\sum_{v \in U} r_{v,i}}{\sum_{j \in I} \sum_{v \in U} r_{v,j}}
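The background model is just the ratings mass of an item over the total ratings mass in the collection. A minimal sketch, assuming a hypothetical user → {item: rating} dictionary:

```python
# Hypothetical ratings matrix: user -> {item: rating}; values are illustrative.
ratings = {
    "u1": {"i1": 5, "i2": 3},
    "u2": {"i2": 4, "i3": 2},
}

def p_background(item, ratings):
    """Background model p(i|C): total rating mass of the item
    divided by the total rating mass of the collection."""
    num = sum(profile.get(item, 0) for profile in ratings.values())
    den = sum(sum(profile.values()) for profile in ratings.values())
    return num / den

print(p_background("i2", ratings))  # (3 + 4) / (5 + 3 + 4 + 2) = 0.5
```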
Jelinek-Mercer (JM) Linear interpolation. Parameter λ.

p_\lambda(i|u) = (1 - \lambda)\, p_{ml}(i|u) + \lambda\, p(i|C)    (3)
Dirichlet Priors (DP) Bayesian analysis. Parameter µ.

p_\mu(i|u) = \frac{r_{u,i} + \mu\, p(i|C)}{\mu + \sum_{j \in I_u} r_{u,j}}    (4)
Absolute Discounting (AD) Subtract a constant δ.

p_\delta(i|u) = \frac{\max(r_{u,i} - \delta, 0) + \delta\, |I_u|\, p(i|C)}{\sum_{j \in I_u} r_{u,j}}    (5)
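The three estimates (3)-(5) translate directly into code. A minimal sketch, with illustrative argument names (r_ui is the rating of item i by user u, user_total is \sum_{j \in I_u} r_{u,j}, p_c is the background probability p(i|C)):

```python
def p_jm(r_ui, user_total, p_c, lam):
    """Jelinek-Mercer (Eq. 3): linear interpolation, parameter lambda."""
    return (1 - lam) * (r_ui / user_total) + lam * p_c

def p_dp(r_ui, user_total, p_c, mu):
    """Dirichlet priors (Eq. 4): Bayesian smoothing, parameter mu."""
    return (r_ui + mu * p_c) / (mu + user_total)

def p_ad(r_ui, user_total, p_c, delta, profile_size):
    """Absolute discounting (Eq. 5): subtract delta from each rating
    and redistribute the discounted mass via the background model."""
    return (max(r_ui - delta, 0) + delta * profile_size * p_c) / user_total

# Example: r_ui = 5, user's ratings sum to 10, p(i|C) = 0.1.
print(p_jm(5, 10, 0.1, lam=0.5))              # 0.3
print(p_dp(5, 10, 0.1, mu=100))               # 15/110 ≈ 0.136
print(p_ad(5, 10, 0.1, delta=1.0, profile_size=3))  # 0.43
```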
Experiments
[Figure] Precision at 5 (P@5) of the RM1 and RM2 algorithms using Absolute Discounting (AD), Jelinek-Mercer (JM) and Dirichlet priors (DP) smoothing methods on the MovieLens 100k dataset, as a function of the smoothing parameter (λ/δ in [0, 1]; µ in [0, 1000]).

[Figure] Precision at 5 (P@5) of the RM2 algorithm using AD on the MovieLens 1M dataset, varying the smoothing intensity δ and the number of ratings in the user profiles.
Conclusions
• There are no big differences in optimal precision among the studied smoothing techniques.
• Dirichlet priors and, especially, Jelinek-Mercer suffer a significant decrease in precision when a high amount of smoothing is applied.
• Absolute Discounting behaves almost as a parameter-free smoothing method.
Bibliography
[1] J. Parapar, A. Bellogín, P. Castells, and A. Barreiro. Relevance-based language modelling for recommender systems. IPM, 49(4):966–980, July 2013.
[2] C. Zhai and J. Lafferty. A study of smoothing methods for language models applied to information retrieval. ACM TOIS, 22(2):179–214, Apr. 2004.
ECIR 2015, 37th European Conference on Information Retrieval. 29 March - 2 April, 2015, Vienna, Austria