Recommender systems are crucial components of most commercial websites, helping to keep users satisfied and to increase revenue. Thus, a lot of effort is made to improve recommendation accuracy. But when is the best possible performance of the recommender reached? The magic barrier refers to an unknown upper limit on the prediction accuracy a recommender system can attain. It reveals whether there is still room for improving prediction accuracy, or indicates that further improvement is meaningless. In this work, we present a mathematical characterization of the magic barrier based on the assumption that user ratings are afflicted with inconsistencies, i.e. noise. In a case study with a commercial movie recommender, we investigate the inconsistencies of the user ratings and estimate the magic barrier in order to assess the actual quality of the recommender system.
1. Users and Noise: The Magic Barrier of Recommender Systems
Alan Said, Brijnesh J. Jain, Sascha Narr, Till Plumbaum
Competence Center Information Retrieval & Machine Learning
@alansaid, @saschanarr, @matip
2. Outline
► The Magic Barrier
► Empirical Risk Minimization
► Deriving the Magic Barrier
► User Study
► Conclusion
20 July 2012 The Magic Barrier 2
4. The Magic Barrier
► No magic involved...
► Coined by Herlocker et al. in 2004
“...an algorithm cannot be more accurate than the variance in
a user’s ratings for the same item.”
The maximum level of prediction accuracy that a recommender
algorithm can attain.
► What does this mean?
6. The Magic Barrier
► Even a “perfect” recommender should not reach RMSE = 0 or
Precision @ N = 1
► Why?
People are inconsistent and noisy in their ratings
“perfect” accuracy is not perfect
► So?
Knowing the highest possible level of accuracy, we can stop
optimizing our algorithms at “perfect” (before overfitting)
7. The Magic Barrier
So – how do we find the magic barrier?
We employ the Empirical Risk Minimization principle and a
statistical model for user inconsistencies
8. The Magic Barrier – User Inconsistencies
Assumption:
If a user were to re-rate all previously rated items, keeping in
mind the inconsistency, the ratings would differ, i.e.
$r_{ui} = \mu_{ui} + \varepsilon_{ui}$
where
$\mu_{ui}$ is the expected rating, and
$\varepsilon_{ui}$ is the rating error (with zero mean)
9. Empirical Risk Minimization
► … is a principle in statistical learning theory which defines a
family of learning algorithms and is used to give theoretical
bounds on the performance of learning
algorithms. [Wikipedia]
10. Empirical Risk Minimization
► We formulate our risk function as
$R(f) = \sum_{u,i,r} p(u,i,r)\,\bigl(f(u,i) - r\bigr)^2$
where $p(u,i,r)$ is the probability of user u rating item i with score r,
and $\bigl(f(u,i) - r\bigr)^2$ is the (squared) prediction error
► Keeping the assumption in mind, we formulate the risk for a
true, unknown, rating function as the sum of the noise
variance, i.e.
$R(f^*) = \sum_{u,i} p(u,i)\,\mathbb{V}(\varepsilon_{ui})$
where $\mathbb{V}(\varepsilon_{ui})$ is the noise variance
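The step from the general risk to the noise variance can be spelled out as follows. This is a sketch under two assumptions that the slides do not state explicitly: the joint probability factorizes as $p(u,i,r) = p(u,i)\,p(r \mid u,i)$, and the true rating function predicts the expected rating, $f^*(u,i) = \mu_{ui}$.

```latex
\begin{align*}
R(f^*) &= \sum_{u,i} p(u,i) \sum_{r} p(r \mid u,i)\,\bigl(f^*(u,i) - r\bigr)^2 \\
       &= \sum_{u,i} p(u,i)\,\mathbb{E}\bigl[(\mu_{ui} - r_{ui})^2\bigr]
          && \text{assuming } f^*(u,i) = \mu_{ui} \\
       &= \sum_{u,i} p(u,i)\,\mathbb{E}\bigl[\varepsilon_{ui}^2\bigr]
          && \text{since } r_{ui} = \mu_{ui} + \varepsilon_{ui} \\
       &= \sum_{u,i} p(u,i)\,\mathbb{V}(\varepsilon_{ui})
          && \text{since } \mathbb{E}[\varepsilon_{ui}] = 0
\end{align*}
```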
11. Deriving the Magic Barrier
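► We want to express the risk function in terms of a magic barrier
for RMSE – we take the root of the risk function
$\mathcal{B}_{\mathcal{U}\times\mathcal{I}} = \sqrt{\sum_{u,i} p(u,i)\,\mathbb{V}(\varepsilon_{ui})}$
RMSE = 0 iff $\varepsilon_{ui} = 0$ for all users and items
► In terms of RMSE we can express this as
$E_{\mathrm{RMSE}}(f) = \mathcal{B}_{\mathcal{U}\times\mathcal{I}} + E_f > \mathcal{B}_{\mathcal{U}\times\mathcal{I}}$
where $E_f$ is the error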
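A small numeric illustration may help here. The sketch below evaluates the root of the risk function on a hypothetical toy distribution (one user, one item, scores 6 and 8 equally likely, so the expected rating is 7 and the noise variance is 1). The names `rmse` and `distribution` are illustrative and do not come from the slides.

```python
import math

def rmse(predict, distribution):
    """Root of the risk R(f) = sum over (u, i, r) of p(u, i, r) * (f(u, i) - r)^2."""
    risk = sum(p * (predict(u, i) - r) ** 2
               for (u, i, r), p in distribution.items())
    return math.sqrt(risk)

# Hypothetical toy distribution p(u, i, r): user "u1" rates item "i1"
# with score 6 or 8, each with probability 0.5 (mean 7, variance 1).
distribution = {("u1", "i1", 6): 0.5, ("u1", "i1", 8): 0.5}

# A predictor that returns the expected rating attains the barrier.
barrier = rmse(lambda u, i: 7.0, distribution)

# Any other predictor has a strictly larger RMSE.
off_by_one = rmse(lambda u, i: 6.0, distribution)
```

In this toy case the predictor that returns the expected rating has an RMSE equal to the root of the noise variance, and the predictor that is off by one star ends up above it.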
12. Estimating the Magic Barrier
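1. For each user-item pair in our population
a) Sample ratings on a regular basis, i.e. re-ratings
b) Estimate the expected value of ratings
$\hat{\mu}_{ui} = \frac{1}{m} \sum_{t=1}^{m} r^{t}_{ui}$
c) Estimate the rating variance
$\hat{\varepsilon}^{2}_{ui} = \frac{1}{m} \sum_{t=1}^{m} \bigl(\hat{\mu}_{ui} - r^{t}_{ui}\bigr)^2$
2. Estimate the magic barrier by taking the average
$\hat{\mathcal{B}} = \sqrt{\frac{1}{|\mathcal{X}|} \sum_{(u,i) \in \mathcal{X}} \hat{\varepsilon}^{2}_{ui}}$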
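The estimation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name `estimate_magic_barrier`, the `(user, item, rating)` tuple layout, and the sample data are all assumptions. The sketch takes the root of the averaged variance so that the result is on the RMSE scale, in line with the RMSE formulation on the "Deriving the Magic Barrier" slide.

```python
import math
from collections import defaultdict

def estimate_magic_barrier(opinions):
    """Estimate the magic barrier from repeated ratings.

    opinions: iterable of (user, item, rating) tuples; a user-item pair
    appears once per rating or re-rating.
    """
    # Step 1a: group the sampled ratings by user-item pair.
    by_pair = defaultdict(list)
    for user, item, rating in opinions:
        by_pair[(user, item)].append(rating)
    if not by_pair:
        raise ValueError("no ratings given")

    variances = []
    for ratings in by_pair.values():
        m = len(ratings)
        mu_hat = sum(ratings) / m                                       # step 1b
        variances.append(sum((mu_hat - r) ** 2 for r in ratings) / m)   # step 1c

    # Step 2: average the per-pair variances, then take the root (RMSE scale).
    return math.sqrt(sum(variances) / len(variances))

# Hypothetical data: one inconsistent user and one consistent user.
opinions = [
    ("alice", "movie_a", 8), ("alice", "movie_a", 6),  # mean 7, variance 1
    ("bob", "movie_b", 5), ("bob", "movie_b", 5),      # mean 5, variance 0
]
barrier = estimate_magic_barrier(opinions)
```

With only two ratings per pair, as in this example, the variance estimates are very coarse. More re-ratings per pair would give a steadier estimate.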
14. A User Study
► We teamed up with moviepilot.de
Germany’s largest online movie recommendation community
Rating scale: 1-10 stars (Netflix: 1-5 stars)
► Created a re-rating UI
Users were asked to re-rate at least 20 movies
1 new rating (a so-called opinion) per movie
Collected data:
306 users
6,299 new opinions
2,329 movies
15. A User Study
[Figure: user study on moviepilot]
16. A User Study
[Figure: comparison of "Predictions vs Ratings" with the "Overall Magic Barrier". Series: ratings above user's average, ratings below user's average, opinions above user's average, opinions below user's average. Annotations: "~4 rating steps", "~1 rating step", "Room for improvement".]
17. Conclusion
► We created a mathematical characterization of the magic
barrier
► We performed a user study on a commercial movie
recommendation website and estimated its magic barrier
► We concluded the commercial recommender engine still has
room for improvement
► No magic
18. More?
► Estimating the Magic Barrier of Recommender Systems: A User Study
SIGIR 2012
► Magic Barrier explained
http://irml.dailab.de
► Movie rating and explanation user study
http://j.mp/ratingexplain
► Recommender Systems Wiki
www.recsyswiki.com
► Recommender Systems Challenge
www.recsyschallenge.com
19. Questions?
► Thank You for Listening!