Recommender Systems Challenges
Best Practices
Tutorial & Panel

ACM RecSys 2012
Dublin
September 10, 2012
About us
•   Alan Said - PhD Student @ TU-Berlin
    o   Topics: RecSys Evaluation
    o   @alansaid
    o   URL: www.alansaid.com


•   Domonkos Tikk - CEO @ Gravity R&D
    o   Topics: Machine Learning methods for RecSys
    o   @domonkostikk
    o   http://www.tmit.bme.hu/tikk.domonkos


•   Andreas Hotho - Prof. @ Uni. Würzburg
    o   Topics: Data Mining, Information Retrieval, Web Science
    o   http://www.is.informatik.uni-wuerzburg.de/staff/hotho
General Motivation
"RecSys is nobody's home conference. We
  come from CHI, IUI, SIGIR, etc."
  Joe Konstan - RecSys 2010


RecSys is our home conference - we
should evaluate accordingly!
Outline
•   Tutorial
    o Introduction to concepts in challenges
    o Execution of a challenge
    o Conclusion

•   Panel
      Experiences of participating in and
      organizing challenges
         Yehuda Koren
         Darren Vengroff
         Torben Brodt
What is the motivation
for RecSys Challenges?
          Part 1
Setup - information overload
[Figure: a recommender sits between the users and the content of the service provider]
Motivation of stakeholders
•   user
    o   find relevant content
    o   easy navigation
    o   serendipity, discovery
•   service
    o   increase revenue
    o   target user with the right content
    o   engage users
•   recommender
    o   facilitate goals of stakeholders
    o   get recognized
Evaluation in terms of the business
•   business reporting
•   online evaluation (A/B test)
•   casting into a research problem
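
As a rough illustration of the online evaluation (A/B test) step, the sketch below compares the click-through rates of two recommender variants with a two-proportion z-test. The function name, the choice of click-through rate as the business metric, and the example counts are assumptions for illustration; they do not come from the slides.

```python
import math
from typing import Tuple


def two_proportion_z_test(clicks_a: int, views_a: int,
                          clicks_b: int, views_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in click-through rate.

    Variant A is the baseline recommender, variant B the candidate.
    A positive z means B has the higher observed click-through rate.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical traffic: 1000 impressions per variant.
z, p = two_proportion_z_test(clicks_a=100, views_a=1000, clicks_b=150, views_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A test like this only tells whether the observed difference is likely to be noise; which metric to compare (clicks, conversions, revenue) remains a business decision.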
Context of the contest
•   Selection of metrics
•   Domain dependent
•   Offline vs. online evaluation


•   IR centric evaluation
     o RMSE
     o MAP
     o F1
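
To make the MAP measure concrete, here is a minimal sketch of mean average precision truncated at rank k. The normalisation by min(number of relevant items, k) is one common convention and is an assumption here, as are the function names and the toy data; a given challenge may define the measure differently.

```python
from typing import Dict, List, Set


def average_precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """AP@k for one user; assumes the ranked list contains no duplicates."""
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a hit occurs.
            precision_sum += hits / rank
    # Assumed convention: normalise by the best achievable number of hits.
    return precision_sum / min(len(relevant), k)


def map_at_k(rankings: Dict[str, List[str]],
             relevant: Dict[str, Set[str]], k: int) -> float:
    """Mean of AP@k over all users that received a ranking."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision_at_k(items, relevant.get(user, set()), k)
        for user, items in rankings.items()
    )
    return total / len(rankings)


# Toy example with two hypothetical users.
rankings = {"u1": ["a", "b", "c", "d"], "u2": ["x", "y"]}
relevant = {"u1": {"a", "c"}, "u2": {"z"}}
print(map_at_k(rankings, relevant, k=4))
```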
Latent user needs
Recsys Competition Highlights
•   Large scale
•   Organization
•   RMSE
•   Prize

•   3-stage setup
•   selection by review
•   runtime limits
•   real traffic
•   revenue increase

•   offline
•   MAP@500
•   metadata available
•   larger in dimensions
•   no ratings
Recurring Competitions
•   ACM KDD Cup (2007, 2011, 2012)
•   ECML/PKDD Discovery Challenge (2008
    onwards)
    o 2008 and 09: tag recommendation in social
      bookmarking (incl. online evaluation task)
    o 2011: video lectures
•   CAMRa (2010, 2011, 2012)
Does size matter?
•   Yes! – real world users
•   In research – to some extent
Research & Industry
Important for both
• Industry has the data and research needs
  data
• Industry needs better approaches, but
  developing them is costly
• Research has ideas but has no systems
  and/or data to do the evaluation

Don't exploit participants
Don't be too greedy
Running a Challenge
       Part 2
Standard Challenge Setting
•   organizer defines the recommender setting e.g.
    tag recommendation in BibSonomy
•   provide data
    o   with features or
    o   raw data
    o   construct your own data
•   fix the way to do the evaluation
•   define the goal e.g. reach a certain
    improvement (F1)
•   motivate people to participate:
    e.g. promise a lot of money ;-)
Typical contest settings
 •   offline
     o   everyone gets access to the dataset
     o   in principle it is a prediction task; the user can't be influenced
     o   privacy of the user within the data is a big issue
     o   results from offline experimentation have limited predictive power
         for online user behavior

 •   online
     o   after a first learning phase the recommender is plugged into a real
         system
     o   user can be influenced but only by the selected system
     o   comparison of different systems is not completely fair

 •   further ways
     o   user study
Example online setting
(BibSonomy)

Balby Marinho, L.; Hotho, A.; Jäschke, R.; Nanopoulos, A.; Rendle, S.; Schmidt-Thieme, L.; Stumme, G.; Symeonidis, P.:
Recommender Systems for Social Tagging Systems. Springer, 2012 (SpringerBriefs in Electrical and Computer Engineering). ISBN 978-1-4614-1893-1
Which evaluation measures?
•   Root Mean Squared Error (RMSE)
•   Mean Absolute Error (MAE)
•   Typical IR measures
    o   precision @ n-items
    o   recall @ n-items
    o   False Positive Rate
    o   F1 @ n-items
    o   Area Under the ROC Curve (AUC)
•   non-quality measures
    o   server answer time
    o   understandability of the results
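
A minimal sketch of some of the quality measures listed above, using only the Python standard library. The function names and toy inputs are illustrative assumptions. Precision here divides by n even when fewer than n items are returned, which is one of several conventions in use.

```python
import math
from typing import List, Sequence, Set, Tuple


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root Mean Squared Error between predicted and actual ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean Absolute Error between predicted and actual ratings."""
    absolute = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(absolute) / len(absolute)


def precision_recall_f1_at_n(ranked: List[str], relevant: Set[str],
                             n: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 over the top-n items of a ranked list."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    precision = hits / n if n > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy rating predictions and a toy ranked list.
print(rmse([3.0, 4.0, 5.0], [3.0, 5.0, 3.0]))
print(mae([3.0, 4.0, 5.0], [3.0, 5.0, 3.0]))
print(precision_recall_f1_at_n(["a", "b", "c", "d"], {"a", "d", "e"}, n=2))
```

RMSE and MAE score every predicted rating, while the top-n measures only look at the head of the ranked list, so the two families can rank the same systems differently.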
Discussion of measures?
    RMSE - Precision
• RMSE is not necessarily the king of metrics,
    even though it is easy to optimize for
•   What about Top-n?
•   but RMSE is less influenced by item popularity
    than top-n measures are

• What about user-centric stuff?
• Ranking-based measure in KDD Cup 2011,
    Track 2
Results influenced by ...

•   target of the recommendation (user, resources, etc...)
•   evaluation methodology (leave-one-out, time based split, random
    sample, cross validation)
•   evaluation measure
•   design of the application (online setting)
•   the selected part of the data and its preprocessing (e.g.
    p-core vs. long tail)
•   scalability vs. quality of the model
•   feature and content accessible and usable for the
    recommendation
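
Two of the evaluation methodologies named above, sketched under assumptions: interactions are (user, item, timestamp) tuples, and leave-one-out holds out each user's most recent interaction (other variants hold out a random one). The function names and toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interaction = Tuple[str, str, int]  # (user, item, timestamp), an assumed layout


def time_based_split(interactions: List[Interaction],
                     cutoff: int) -> Tuple[List[Interaction], List[Interaction]]:
    """Train on everything before the cutoff timestamp, test on the rest."""
    train = [x for x in interactions if x[2] < cutoff]
    test = [x for x in interactions if x[2] >= cutoff]
    return train, test


def leave_one_out_split(
        interactions: List[Interaction]) -> Tuple[List[Interaction], List[Interaction]]:
    """Hold out the latest interaction of each user with at least two interactions."""
    by_user: Dict[str, List[Interaction]] = defaultdict(list)
    for interaction in interactions:
        by_user[interaction[0]].append(interaction)
    train: List[Interaction] = []
    test: List[Interaction] = []
    for events in by_user.values():
        ordered = sorted(events, key=lambda e: e[2])
        if len(ordered) < 2:
            # Users with a single interaction stay entirely in the training set.
            train.extend(ordered)
            continue
        train.extend(ordered[:-1])
        test.append(ordered[-1])
    return train, test


data = [("u1", "a", 1), ("u1", "b", 5), ("u2", "c", 3), ("u2", "d", 7), ("u3", "e", 4)]
print(time_based_split(data, cutoff=4))
print(leave_one_out_split(data))
```

On this toy data the two splits already disagree about user u3, whose single interaction is test data under the time-based split but training data under leave-one-out, which is one way the methodology influences results.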
Don't forget..
• the effort to organize a challenge is very big
• preparing data takes time
• answering questions takes even more time
• participants are creative, so be ready to react
• time to compute the evaluation and check the
    results
•   prepare proceedings with the outcome
•   ...
What have we learnt?
    Conclusion
        Part 3
Challenges are good since they...
•   ... are focused on solving a single problem
•   ... have many participants
•   ... create common evaluation criteria
•   ... have comparable results
•   ... bring real-world problems to research
•   ... make it easy to crown a winner
•   ... are cheap (even with a 1M$ prize)
Is that the complete truth?
No!
•   Why?
Because using standard information retrieval metrics we
cannot evaluate recommender system concepts like:
    • user interaction
    • perception
    • satisfaction
    • usefulness
    • any metric not based on accuracy/rating prediction
      and negative predictions
    • scalability
    • engineering
We can't catch everything offline
•   Scalability
•   Presentation
•   Interaction
The difference between IR and RS
•   Information retrieval systems answer a need: a query
•   Recommender systems identify the user's needs
Should we organize more
challenges?
•   Yes - but before we do that, think about:
    o What is the utility of Yet Another Dataset - aren't
      there enough already?
    o How do we create a real-world-like challenge?
    o How do we get real user feedback?
Take home message
•   Real needs of users and content providers are better
    reflected in online evaluation

•   Consider technical limitations as well

•   Challenges advance the field a lot
    o Matrix factorization & ensemble methods in the
      Netflix Prize
    o Evaluation measure and objective in the KDD Cup
      2011
Related events at RecSys
•   Workshops
    o   Recommender Utility Evaluation
    o   RecSys Data Challenge
•   Paper Sessions
    o Multi-Objective Recommendation and Human
      Factors - Mon. 14:30
    o Implicit Feedback and User Preference - Tue. 11:00
    o Top-N Recommendation - Wed. 14:30

•   More challenges:
    o   www.recsyswiki.com/wiki/Category:Competition
Panel
Part 4
Panel
•   Torben Brodt
    o   Plista
    o   Organizing Plista Contest

•   Yehuda Koren
    o   Google
    o   Member of winning team of the Netflix Prize


•   Darren Vengroff
    o   RichRelevance
    o   Organizer of RecLab Prize
Questions
•   How does recommendation influence the
    user and system?
•   How can we quantify the effects of the UI?
•   How should we translate what we've
    presented into an actual challenge?
•   Should we focus on the long tail or the short
    head?
•   Evaluation measures, click rate, wtf@k
•   How to evaluate conversion rate?

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

Best Practices in Recommender System Challenges

  • 1. Recommender Systems Challenges Best Practices Tutorial & Panel ACM RecSys 2012 Dublin September 10, 2012
  • 2. About us • Alan Said - PhD Student @ TU-Berlin o Topics: RecSys Evaluation o @alansaid o URL: www.alansaid.com • Domonkos Tikk - CEO @ Gravity R&D o Topics: Machine Learning methods for RecSys o @domonkostikk o http://www.tmit.bme.hu/tikk.domonkos • Andreas Hotho - Prof. @ Uni. Würzburg o Topics: Data Mining, Information Retrieval, Web Science o http://www.is.informatik.uni-wuerzburg.de/staff/hotho
  • 3. General Motivation "RecSys is nobody's home conference. We come from CHI, IUI, SIGIR, etc." Joe Konstan - RecSys 2010 RecSys is our home conference - we should evaluate accordingly!
  • 4. Outline • Tutorial o Introduction to concepts in challenges o Execution of a challenge o Conclusion • Panel o Experiences of participating in and organizing challenges: Yehuda Koren, Darren Vengroff, Torben Brodt
  • 5. What is the motivation for RecSys Challenges? Part 1
  • 6. Setup - information overload users content of service provider recommender
  • 7. Motivation of stakeholders • user: find relevant content; easy navigation; serendipity, discovery • service: increase revenue; target user with the right content; engage users • recommender: facilitate goals of stakeholders; get recognized
  • 8. Evaluation in terms of the business • business reporting • online evaluation (A/B test) • casting into a research problem
  • 9. Context of the contest • Selection of metrics • Domain dependent • Offline vs. online evaluation • IR centric evaluation o RMSE o MAP o F1
  • 11. RecSys Competition Highlights • Large scale • Organization • RMSE • 3-stage setup • Prize • selection by review • runtime limits • real traffic • revenue increase • offline • MAP@500 • metadata available • larger in dimensions • no ratings
  • 12. Recurring Competitions • ACM KDD Cup (2007, 2011, 2012) • ECML/PKDD Discovery Challenge (2008 onwards) o 2008 and 2009: tag recommendation in social bookmarking (incl. online evaluation task) o 2011: video lectures • CAMRa (2010, 2011, 2012)
  • 13. Does size matter? • Yes! – real world users • In research – to some extent
  • 14. Research & Industry: important for both • Industry has the data and research needs data • Industry needs better approaches but this costs • Research has ideas but has no systems and/or data to do the evaluation • Don't exploit participants • Don't be too greedy
  • 16. Standard Challenge Setting • organizer defines the recommender setting e.g. tag recommendation in BibSonomy • provide data o with features or o raw data o construct your own data • fix the way to do the evaluation • define the goal e.g. reach a certain improvement (F1) • motivate people to participate: e.g. promise a lot of money ;-)
  • 17. Typical contest settings • offline o everyone gets access to the dataset o in principle it is a prediction task, the user can't be influenced o privacy of the users within the data is a big issue o results from offline experimentation have limited predictive power for online user behavior • online o after a first learning phase the recommender is plugged into a real system o the user can be influenced, but only by the selected system o comparison of different systems is not completely fair • further ways o user study
  • 18. Example online setting (BibSonomy) BALBY MARINHO, L. ; HOTHO, A. ; JÄSCHKE, R. ; NANOPOULOS, A. ; RENDLE, S. ; SCHMIDT-THIEME, L. ; STUMME, G. ; SYMEONIDIS, P.: Recommender Systems for Social Tagging Systems : SPRINGER, 2012 (SpringerBriefs in Electrical and Computer Engineering). - ISBN 978-1-4614-1893-1
  • 19. Which evaluation measures? • Root Mean Squared Error (RMSE) • Mean Absolute Error (MAE) • Typical IR measures o precision @ n-items o recall @ n-items o False Positive Rate o F1 @ n-items o Area Under the ROC Curve (AUC) • non-quality measures o server answer time o understandability of the results
  • 20. Discussion of measures? RMSE vs. Precision • RMSE is not necessarily the king of metrics, as RMSE is easy to optimize on • What about top-n? • but RMSE is not influenced by popularity the way top-n is • What about user-centric stuff? • Ranking-based measure in KDD Cup 2011, Track 2
  • 21. Results influenced by ... • target of the recommendation (user, resources, etc.) • evaluation methodology (leave-one-out, time-based split, random sample, cross-validation) • evaluation measure • design of the application (online setting) • the selected part of the data and its preprocessing (e.g. p-core vs. long tail) • scalability vs. quality of the model • features and content accessible and usable for the recommendation
  • 22. Don't forget... • the effort to organize a challenge is very big • preparing data takes time • answering questions takes even more time • participants are creative, so organizers need to react • time to compute the evaluation and check the results • prepare proceedings with the outcome • ...
  • 23. What have we learnt? Conclusion Part 3
  • 24. Challenges are good since they... • ... are focused on solving a single problem • ... have many participants • ... create common evaluation criteria • ... have comparable results • ... bring real-world problems to research • ... make it easy to crown a winner • ... are cheap (even with a $1M prize)
  • 25. Is that the complete truth? No!
  • 26. Is that the complete truth? • Why? Because using standard information retrieval metrics we cannot evaluate recommender system concepts like: • user interaction • perception • satisfaction • usefulness • any metric not based on accuracy/rating prediction and negative predictions • scalability • engineering
  • 27. We can't catch everything offline Scalability Presentation Interaction
  • 28. The difference between IR and RS • Information retrieval systems answer a need (a query) • Recommender systems identify the user's needs
  • 29. Should we organize more challenges? • Yes - but before we do that, think of o What is the utility of Yet Another Dataset - aren't there enough already? o How do we create a real-world-like challenge? o How do we get real user feedback?
  • 30. Take home message • Real needs of users and content providers are better reflected in online evaluation • Consider technical limitations as well • Challenges advance the field a lot o Matrix factorization & ensemble methods in the Netflix Prize o Evaluation measure and objective in the KDD Cup 2011
  • 31. Related events at RecSys • Workshops o Recommender Utility Evaluation o RecSys Data Challenge • Paper Sessions o Multi-Objective Recommendation and Human Factors - Mon. 14:30 o Implicit Feedback and User Preference - Tue. 11:00 o Top-N Recommendation - Wed. 14:30 • More challenges: o www.recsyswiki.com/wiki/Category:Competition
  • 33. Panel • Torben Brodt o Plista o Organizing Plista Contest • Yehuda Koren o Google o Member of winning team of the Netflix Prize • Darren Vengroff o RichRelevance o Organizer of RecLab Prize
  • 34. Questions • How does recommendation influence the user and system? • How can we quantify the effects of the UI? • How should we translate what we've presented into an actual challenge? • Should we focus on the long tail or the short head? • Evaluation measures, click rate, wtf@k • How to evaluate conversion rate?
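
The offline measures listed on slides 9 and 19 (RMSE, MAE, precision/recall/F1 @ n-items, MAP) can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation code of any particular challenge. The function names and toy inputs are hypothetical. The average precision variant shown here normalizes by min(number of relevant items, k), which is one common convention and an assumption on our part; individual contests may define it differently.

```python
import math


def rmse(predicted, actual):
    """Root Mean Squared Error over paired rating predictions."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, actual):
    """Mean Absolute Error over paired rating predictions."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def precision_recall_f1_at_n(ranked, relevant, n):
    """Precision, recall and F1 computed on the top-n recommended items."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    precision = hits / n if n else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def average_precision_at_k(ranked, relevant, k):
    """Average precision at k for one user.

    Assumption: normalized by min(len(relevant), k).
    """
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    denom = min(len(relevant), k)
    return score / denom if denom else 0.0


def mean_average_precision_at_k(rankings, relevants, k):
    """MAP@k: mean of the per-user average precision values."""
    scores = [average_precision_at_k(r, rel, k)
              for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)
```

RMSE and MAE need explicit rating predictions, while the top-n measures only need a ranked list per user, which is why contests without ratings (such as the MAP@500 setting on slide 11) use the latter.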