X. Amatriain et al.
Rate It Again
Rate it Again
Increasing Recommendation Accuracy by User Re-rating
Xavier Amatriain (with J.M. Pujol, N. Tintarev, N. Oliver)
Telefonica Research
RecSys '09
The Recommender Problem
● Two ways to address it:
1. Improve the algorithm
2. Improve the input data (time for data cleaning!)
User Feedback is Noisy
● See our UMAP '09 Publication:
“I like it... I like it not” (Amatriain et al. '09)
Natural Noise Limits our User Model ...and Our Prediction Accuracy
Experimental setup
● 118 participants rated movies in 3 trials:
T1 (rand) <-> 24 h <-> T2 (pop.) <-> 15 days <-> T3 (rand)
● 100 movies from the Netflix dataset, stratified random sampling on popularity
● Ratings on a 1-to-5 star scale, with a special “not seen” symbol
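The stratified sampling step above can be sketched as follows; the decile strata, per-stratum counts, and the synthetic popularity data are illustrative assumptions (the slide only says the 100 movies were sampled with popularity stratification):

```python
import random

def stratified_sample_by_popularity(movie_counts, n_strata=10, per_stratum=10, seed=42):
    """Sample movies so every popularity stratum is represented.

    movie_counts: dict mapping movie id -> number of ratings (popularity).
    Returns up to n_strata * per_stratum movie ids.
    """
    rng = random.Random(seed)
    # Rank movies from most to least popular, then cut into equal strata.
    ranked = sorted(movie_counts, key=movie_counts.get, reverse=True)
    stratum_size = max(1, len(ranked) // n_strata)
    sample = []
    for i in range(n_strata):
        stratum = ranked[i * stratum_size:(i + 1) * stratum_size]
        sample.extend(rng.sample(stratum, min(per_stratum, len(stratum))))
    return sample

# Synthetic popularity counts for 1000 "movies".
counts = {m: 1000 - m for m in range(1000)}
movies = stratified_sample_by_popularity(counts)
print(len(movies))  # → 100 movies, as in the study
```

This guarantees unpopular titles appear in the sample instead of letting head items dominate, which is the point of stratifying on popularity.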
Users are Inconsistent
● What is the probability of an inconsistency, given an original rating?
– Mild ratings are noisier
– Negative ratings are noisier
Prediction Accuracy
● Pairwise RMSE between trials, computed over the intersection (∩) and union (∪) of both rating sets:

Trials    #Ti    #Tj    #∩     #∪     RMSE (∩)  RMSE (∪)
T1, T2    2185   1961   1838   2308   0.573     0.707
T1, T3    2185   1909   1774   2320   0.637     0.765
T2, T3    1969   1909   1730   2140   0.557     0.694

● Maximum error between the trials most distant in time (T1, T3)
● Significantly less error whenever the second trial is involved
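The pairwise comparison above can be reproduced in a few lines; the sketch below computes RMSE only over the (user, item) pairs present in both trials, since the slide does not spell out how unmatched pairs were scored for the union column. The function name and toy data are illustrative:

```python
import math

def trial_rmse(trial_a, trial_b):
    """Compare two rating trials for the same users and items.

    trial_a, trial_b: dicts mapping (user, item) -> rating (1-5).
    Returns (RMSE over the intersection, |intersection|, |union|).
    """
    common = trial_a.keys() & trial_b.keys()
    union = trial_a.keys() | trial_b.keys()
    mse = sum((trial_a[k] - trial_b[k]) ** 2 for k in common) / len(common)
    return math.sqrt(mse), len(common), len(union)

# Toy trials: two pairs rated in both trials, one pair unique to each.
t1 = {("u1", "m1"): 4, ("u1", "m2"): 2, ("u2", "m1"): 5}
t2 = {("u1", "m1"): 3, ("u1", "m2"): 2, ("u2", "m3"): 1}
rmse, n_common, n_union = trial_rmse(t1, t2)
print(round(rmse, 3), n_common, n_union)  # → 0.707 2 4
```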
Algorithm Robustness to Natural Noise (NN)

Alg. / Trial     T1      T2      T3      Tworst/Tbest
User Average     1.2011  1.1469  1.1945  4.7%
Item Average     1.0555  1.0361  1.0776  4%
User-based kNN   0.9990  0.9640  1.0171  5.5%
Item-based kNN   1.0429  1.0031  1.0417  4%
SVD              1.0244  0.9861  1.0285  4.3%

● RMSE for different recommendation algorithms when predicting each of the trials
● Trial 2 is consistently the least noisy
Algorithm Robustness to Natural Noise (2)

Training-Testing  T1-T2   T1-T3   T2-T3
User Average      1.1585  1.2095  1.2036
Movie Average     1.0305  1.0648  1.0637
User-based kNN    0.9693  1.0143  1.0184
Item-based kNN    1.0009  1.0406  1.0590
SVD               0.9741  1.0491  1.0118

● RMSE for different recommendation algorithms when predicting ratings in one trial (testing) from ratings in another (training)
● Noise is minimized when we predict Trial 2
Let's recap
● Users are inconsistent
● Inconsistencies can depend on many things
including how the items are presented
● Inconsistencies produce natural noise
● Natural noise reduces our prediction accuracy
independently of the algorithm
Hypothesis
● If we can somehow reduce the natural noise due to user inconsistencies, we could greatly improve recommendation accuracy.
● We can reduce natural noise by taking advantage of user inconsistencies when re-rating items.
Algorithm
● Given a rating dataset where (some) items have been re-rated,
● Two fairness conditions:
1. The algorithm should remove as few ratings as possible (i.e. only when there is some certainty that the rating is only adding noise)
2. The algorithm should not make up new ratings, but decide which of the existing ones are valid
Algorithm
● One-source re-rating case, given a milding function (defined as a figure in the original slides)
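The milding function itself appears only as a figure in the original slides, so the sketch below substitutes a simple stand-in rule: keep identical re-ratings, keep the milder rating (here, the one closer to the scale midpoint) on a small disagreement, and drop the rating on a large one. The threshold and the "closer to 3" rule are assumptions for illustration; the sketch does respect the two fairness conditions, removing ratings only under clear disagreement and never inventing new values:

```python
def milder(r1, r2, midpoint=3):
    """Stand-in milding function: return the rating closer to the
    scale midpoint (ties keep the original rating r1)."""
    return r2 if abs(r2 - midpoint) < abs(r1 - midpoint) else r1

def denoise_one_source(original, rerated, max_disagreement=1):
    """One-source re-rating denoising sketch.

    original, rerated: dicts mapping (user, item) -> rating (1-5).
    Output values always come from the inputs (no invented ratings);
    a rating is removed only when the two ratings disagree by more
    than max_disagreement stars.
    """
    clean = {}
    for key, r1 in original.items():
        if key not in rerated:
            clean[key] = r1                  # never re-rated: keep as-is
            continue
        r2 = rerated[key]
        if r1 == r2:
            clean[key] = r1                  # consistent: keep
        elif abs(r1 - r2) <= max_disagreement:
            clean[key] = milder(r1, r2)      # mild disagreement: keep milder
        # strong disagreement: drop the rating entirely
    return clean

orig_ratings = {("u1", "m1"): 5, ("u1", "m2"): 4, ("u1", "m3"): 1}
re_ratings = {("u1", "m1"): 5, ("u1", "m2"): 3, ("u1", "m3"): 4}
print(denoise_one_source(orig_ratings, re_ratings))
```

In the toy run, m1 is kept unchanged, m2 keeps the milder re-rating (3), and m3 is dropped because the two ratings disagree by three stars.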
Results
● One-source re-rating (denoised ⊚ denoising):

Datasets         T1⊚T2   ΔT1     T1⊚T3   ΔT1     T2⊚T3   ΔT2
User-based kNN   0.8861  11.3%   0.8960  10.3%   0.8984  6.8%
SVD              0.9121  11.0%   0.9274  9.5%    0.9159  7.1%

● Two-source re-rating (denoising T1 with the other two trials):

Datasets         T1 (⊚ T2, T3)   ΔT1
User-based kNN   0.8647          13.4%
SVD              0.8800          14.1%

● Best results (above 10%!) when denoising a noisy trial with a less noisy one
● Smaller (yet still important) improvement when denoising the less noisy set
● Improvements of up to 14% with two re-ratings
But...
● We can't expect all users to re-rate all items once or twice to improve accuracy!
● We need methods to selectively choose which ratings to denoise:
– Random selection
– Data-dependent (select ratings based on their values)
– User-dependent (select ratings based on how “noisy” the user is)
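The three selection strategies listed above can be sketched as follows; the concrete criteria (distance from the scale midpoint for "extreme" ratings, and a caller-supplied per-user noise score) are illustrative assumptions, not the paper's exact definitions:

```python
import random

def select_for_rerating(ratings, strategy, fraction=0.2,
                        user_noise=None, seed=0):
    """Pick which (user, item) ratings to ask users to re-rate.

    ratings: dict mapping (user, item) -> rating (1-5).
    strategy: "random", "extreme" (data-dependent), or
              "noisy_user" (user-dependent; needs user_noise,
              a dict mapping user -> noise score).
    """
    keys = list(ratings)
    n = max(1, int(len(keys) * fraction))
    if strategy == "random":
        return set(random.Random(seed).sample(keys, n))
    if strategy == "extreme":
        # Data-dependent: ratings farthest from the scale midpoint first.
        keys.sort(key=lambda k: abs(ratings[k] - 3), reverse=True)
        return set(keys[:n])
    if strategy == "noisy_user":
        # User-dependent: ratings from the noisiest users first.
        keys.sort(key=lambda k: user_noise[k[0]], reverse=True)
        return set(keys[:n])
    raise ValueError(strategy)

r = {("u1", "m1"): 5, ("u1", "m2"): 3, ("u2", "m1"): 1, ("u2", "m2"): 4}
print(select_for_rerating(r, "extreme", fraction=0.5))
```

With the toy data, the "extreme" strategy picks the 5-star and 1-star ratings, which matches the deck's later finding that extreme ratings are the most valuable to re-rate.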
Random re-rating
● Improvement in RMSE when doing one-source (left) and two-source (right) re-rating, as a function of the percentage of randomly selected denoised ratings (T1⊚T3)
Denoise Extreme Ratings
● Improvement in RMSE when doing one-source (left) and two-source (right) re-rating, as a function of the percentage of denoised ratings, selecting only extreme ratings
Denoise Outliers
● Improvement in RMSE when doing one-source (left) and two-source (right) re-rating, as a function of the percentage of denoised ratings and users, selecting only noisy users and extreme ratings
Value of Rating
● Is it worth adding new ratings, or re-rating existing items? RMSE improvement as a function of new ratings added in each case.
● An extreme re-rating improves RMSE 10 times more than adding a new rating!
Conclusions
● Improving the data can be more beneficial than improving the algorithm
● Natural noise limits the accuracy of recommender systems
● We can reduce natural noise by asking users to re-rate items
● There are strategies to minimize the impact of the re-rating process
● The value of a re-rating may be higher than that of a new rating
Rate it Again
Increasing Recommendation Accuracy by User Re-rating
Thanks!