Comparing State-of-the-Art Collaborative Filtering Systems

    Laurent Candillier, Frank Meyer, Marc Boullé
    France Telecom R&D Lannion
    MLDM 2007

    1 Introduction
    2 Collaborative approaches
    3 Experiments
    4 Conclusions

Recommender systems

    Help users find items they should appreciate from huge catalogues [Adomavicius and Tuzhilin, 2005]

    ⇒ Collaborative filtering: based on the user-to-item rating matrix

              i1   i2   i3   i4   i5
        u1     4    4    -    -    1
        u2     4    3    -    -    -
        u3     5    -    -    2    1
        u4     -    -    -    4    5
        u5     -    -    5    4    -
        u6     -    5    -    3    -
        u7     4    ?    -    -    1

    (- : no rating, ? : rating to predict)

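Because most cells of the rating matrix are empty, a sparse representation is a natural way to hold it. The following is a minimal Python sketch, not taken from the slides: the name `ratings`, the dict-of-dicts layout, and the two helper functions are illustrative choices, and the cell positions are read from the slide layout, so they should be treated as an assumption.

```python
# Sparse user -> {item: rating} view of the example rating matrix.
# Assumption: cell positions are reconstructed from the slide layout.
ratings = {
    "u1": {"i1": 4, "i2": 4, "i5": 1},
    "u2": {"i1": 4, "i2": 3},
    "u3": {"i1": 5, "i4": 2, "i5": 1},
    "u4": {"i4": 4, "i5": 5},
    "u5": {"i3": 5, "i4": 4},
    "u6": {"i2": 5, "i4": 3},
    "u7": {"i1": 4, "i5": 1},  # the rating of u7 on i2 is the one to predict
}


def items_rated(user):
    """S_u: the set of items rated by `user`."""
    return set(ratings[user])


def users_who_rated(item):
    """All users u such that `item` belongs to S_u."""
    return {u for u, row in ratings.items() if item in row}


print(sorted(users_who_rated("i2")))
```

Only the users returned by `users_who_rated("i2")` can contribute to a user-based prediction of the missing rating.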
User-based approaches

    Recommend items appreciated by users whose tastes are similar to those of the given user [Resnick et al., 1994]

    ⇒ need a similarity measure between users

    ex: Pearson similarity: cosine of the deviations from the mean

    $$ w(a,u) = \frac{\sum_{i \in S_a \cap S_u} (v_{ai} - \bar{v}_a)(v_{ui} - \bar{v}_u)}{\sqrt{\sum_{i \in S_a \cap S_u} (v_{ai} - \bar{v}_a)^2 \; \sum_{i \in S_a \cap S_u} (v_{ui} - \bar{v}_u)^2}} $$

        v_{ui}: rating of user u on item i
        S_u: set of items rated by user u
        \bar{v}_u: mean rating of user u

    $$ \bar{v}_u = \frac{\sum_{i \in S_u} v_{ui}}{|S_u|} $$

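The Pearson similarity defined above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function names and the dict-based rows (item mapped to rating) are assumptions, and so are the two fallback rules that return 0 when there is no co-rated item or when a denominator term is zero.

```python
from math import sqrt


def mean_rating(row):
    """Mean of all ratings in `row` (a dict item -> rating), i.e. over S_u."""
    return sum(row.values()) / len(row)


def pearson(row_a, row_u):
    """w(a, u): cosine of the deviations from each user's mean, over co-rated items."""
    common = set(row_a) & set(row_u)
    if not common:
        return 0.0  # assumption: no co-rated item means no evidence of similarity
    mean_a, mean_u = mean_rating(row_a), mean_rating(row_u)
    num = sum((row_a[i] - mean_a) * (row_u[i] - mean_u) for i in common)
    den = sqrt(
        sum((row_a[i] - mean_a) ** 2 for i in common)
        * sum((row_u[i] - mean_u) ** 2 for i in common)
    )
    return num / den if den else 0.0  # assumption: zero denominator gives similarity 0


# Hypothetical rows for two users who share items i1 and i5.
active = {"i1": 4, "i5": 1}
other = {"i1": 4, "i2": 4, "i5": 1}
print(pearson(active, other))
```

Note that each mean is taken over all items the user rated, while the sums run only over the co-rated items, as in the formula.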
User-based approaches

    Which rating for the active user a on item i?

    Prediction using weighted sum

    $$ p_{ai} = \frac{\sum_{\{u \mid i \in S_u\}} w(a,u) \times v_{ui}}{\sum_{\{u \mid i \in S_u\}} |w(a,u)|} $$

    Prediction using weighted sum of deviations from the mean

    $$ p_{ai} = \bar{v}_a + \frac{\sum_{\{u \mid i \in S_u\}} w(a,u) \times (v_{ui} - \bar{v}_u)}{\sum_{\{u \mid i \in S_u\}} |w(a,u)|} $$

    How many neighbors should be considered?

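Both prediction schemes can be sketched as follows. This is a hedged illustration, not the authors' code: neighbors are assumed to be given as `(w, row)` pairs with `w = w(a, u)` already computed, the function names are hypothetical, and the `top_k` helper assumes that the neighborhood is formed from the K most similar users, which the slides do not spell out.

```python
def top_k(neighbors, k):
    """Keep the k neighbors with the highest similarity (assumed selection rule)."""
    return sorted(neighbors, key=lambda pair: pair[0], reverse=True)[:k]


def predict_weighted_sum(item, neighbors):
    """p_ai as the similarity-weighted mean of the neighbors' ratings on `item`."""
    rated = [(w, row) for w, row in neighbors if item in row]
    norm = sum(abs(w) for w, _ in rated)
    if norm == 0:
        return None  # assumption: no usable neighbor, no prediction
    return sum(w * row[item] for w, row in rated) / norm


def predict_deviation(item, mean_a, neighbors):
    """p_ai as the active user's mean plus the weighted mean deviation of the neighbors."""
    rated = [(w, row) for w, row in neighbors if item in row]
    norm = sum(abs(w) for w, _ in rated)
    if norm == 0:
        return mean_a  # assumption: fall back to the active user's mean
    dev = sum(w * (row[item] - sum(row.values()) / len(row)) for w, row in rated)
    return mean_a + dev / norm


# Hypothetical neighbors: (similarity to the active user, ratings of that neighbor).
neighbors = [(1.0, {"i1": 2, "i2": 4}), (0.5, {"i1": 5, "i2": 3})]
print(predict_weighted_sum("i2", neighbors))
print(predict_deviation("i2", 3.0, neighbors))
```

The deviation-based scheme compensates for users who rate systematically high or low, since each neighbor contributes only the distance of the rating from that neighbor's own mean.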
Cluster-based approaches

    Recommend items appreciated by users that belong to the same group as the given user [Breese et al., 1998]

    ⇒ need
        a clustering method, ex: K-means
        a distance measure, ex: Euclidean distance

    Then the rating of a user on an item is the mean rating given to that item by the users that belong to the same cluster

    How many clusters should be considered?

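The prediction step of the cluster-based approach can be sketched as below. The sketch assumes that a cluster assignment already exists (for example from K-means) and is passed in as a dict. The names `cluster_prediction` and `cluster_of` are hypothetical, and excluding the target user from the mean is an assumption.

```python
def cluster_prediction(user, item, ratings, cluster_of):
    """Mean rating on `item` among the other users of `user`'s cluster.

    ratings: dict user -> {item: rating}
    cluster_of: dict user -> cluster label (assumed to be computed beforehand)
    """
    peers = [
        row[item]
        for u, row in ratings.items()
        if u != user and cluster_of[u] == cluster_of[user] and item in row
    ]
    # Assumption: no prediction when nobody in the cluster rated the item.
    return sum(peers) / len(peers) if peers else None


# Hypothetical toy data: users a, b and d share cluster 0, user c is alone in cluster 1.
toy_ratings = {"a": {"x": 4}, "b": {"x": 2, "y": 5}, "c": {"y": 1}, "d": {"y": 3}}
toy_clusters = {"a": 0, "b": 0, "c": 1, "d": 0}
print(cluster_prediction("a", "y", toy_ratings, toy_clusters))
```

Once the clusters are built, a prediction only needs the ratings of one cluster, which is consistent with the low prediction time reported for this approach later in the slides.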
Item-based approaches

    Recommend items similar to those appreciated by the given user [Karypis, 2001]

    ⇒ dual of the user-based approach

    $$ p_{ai} = \bar{v}_i + \frac{\sum_{\{j \in S_a \mid j \neq i\}} sim(i,j) \times (v_{aj} - \bar{v}_j)}{\sum_{\{j \in S_a \mid j \neq i\}} |sim(i,j)|} $$

        sim(i,j): similarity measure between items i and j
        S_a: set of items rated by user a
        \bar{v}_i: mean rating on item i

    How many neighbors should be considered?

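The item-based prediction formula can be sketched in the same way. This is an illustrative sketch under stated assumptions: the similarity is passed in as a function `sim(i, j)`, item means are precomputed in a dict, the names are hypothetical, and falling back to the item mean when no neighbor carries weight is an assumption.

```python
def predict_item_based(user_row, item, item_means, sim):
    """p_ai: mean rating of `item` plus the weighted mean deviation of the
    user's ratings on the other items j in S_a (j != item)."""
    others = [j for j in user_row if j != item]
    norm = sum(abs(sim(item, j)) for j in others)
    if norm == 0:
        return item_means[item]  # assumption: fall back to the item mean
    dev = sum(sim(item, j) * (user_row[j] - item_means[j]) for j in others)
    return item_means[item] + dev / norm


# Hypothetical data: the active user rated j1 and j2; we predict item "i".
user_row = {"j1": 5, "j2": 2}
item_means = {"i": 3.0, "j1": 4.0, "j2": 3.0}
toy_sim = {("i", "j1"): 1.0, ("i", "j2"): 0.5}
print(predict_item_based(user_row, "i", item_means, lambda i, j: toy_sim[(i, j)]))
```

Restricting `others` to the K items most similar to `item` gives the neighborhood size parameter that the slide asks about.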
Experiments

    For user- and item-based approaches, choose
        similarity measure
        prediction scheme
        neighborhood size K

    For cluster-based approaches, choose
        distance measure
        prediction scheme
        number of clusters

    Evaluation protocol [Herlocker et al., 2004]
        movie rating dataset: MovieLens (6040 users × 3706 movies)
        10-fold cross validation (10 runs, each using 9/10 of the data for learning)
        Mean Absolute Error (MAE) on test set T = {(u, i, r)}

    $$ MAE = \frac{1}{|T|} \sum_{(u,i,r) \in T} |p_{ui} - r| $$

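The evaluation protocol can be sketched as a MAE function plus a 10-fold split over rating triples. This is a minimal sketch, not the authors' harness: the function names are hypothetical, and building the folds by shuffling individual ratings with a fixed seed is an assumption, since the slides do not say how the folds were drawn.

```python
import random


def mae(test_set, predict):
    """Mean absolute error over a test set of (user, item, rating) triples."""
    return sum(abs(predict(u, i) - r) for u, i, r in test_set) / len(test_set)


def k_fold_splits(triples, k=10, seed=0):
    """Yield (train, test) pairs; each fold is used exactly once as the test set."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)  # assumption: folds drawn at random
    folds = [shuffled[f::k] for f in range(k)]
    for f in range(k):
        train = [t for g in range(k) if g != f for t in folds[g]]
        yield train, folds[f]


# Hypothetical usage with a constant predictor on 20 made-up ratings.
data = [("u", n, 3 + (n % 2)) for n in range(20)]
scores = [mae(test, lambda u, i: 3.5) for _, test in k_fold_splits(data)]
print(sum(scores) / len(scores))
```

Averaging the per-fold MAE, as in the last line, gives one number per configuration, which is what makes the different approaches comparable under the same conditions.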
User-based approaches, similarity measures

    [Figure: MAE (y-axis, ticks from 0.68 to 0.8) versus neighborhood size K (x-axis, 0 to 2500) for the Pearson, Constraint, Cosine, Adjusted and Proba similarity measures]

User-based approaches, prediction schemes

    [Figure: MAE (y-axis, ticks from 0.68 to 0.8) versus neighborhood size K (x-axis, 0 to 2500) for PearsonWeighted, PearsonDeviation, ProbaWeighted and ProbaDeviation]

Item-based approaches, similarity measures

    [Figure: MAE (y-axis, ticks from 0.64 to 0.76) versus neighborhood size K (x-axis, 0 to 1400) for the Pearson, Constraint, Cosine, Adjusted and Proba similarity measures]

Summary of experiments

                                           BestDefault   BestUser   BestItem   BestCluster
    model construction time (in sec.)           1           730        170         254
    prediction time (in sec.)                   1            31          3           1
    MAE                                      0.6829       0.6688     0.6382      0.6736

        BestDefault: Bayes minimizing MAE
        BestUser: Pearson similarity, 1500 neighbors, prediction using deviation from the mean
        BestItem: probabilistic similarity, 400 neighbors, prediction using deviation from the mean
        BestCluster: K-means, Euclidean distance, 4 clusters, prediction using Bayes minimizing MAE

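To make the MAE figures of the summary easier to compare, they can be expressed as relative reductions with respect to BestDefault. The short sketch below only does arithmetic on the four MAE values reported in the summary; the variable names are illustrative.

```python
# MAE values as reported in the summary of experiments.
mae_by_model = {
    "BestDefault": 0.6829,
    "BestUser": 0.6688,
    "BestItem": 0.6382,
    "BestCluster": 0.6736,
}

baseline = mae_by_model["BestDefault"]

# Relative MAE reduction of each model with respect to BestDefault.
gain = {name: (baseline - value) / baseline for name, value in mae_by_model.items()}

for name, g in gain.items():
    print(f"{name}: {100 * g:.1f}% lower MAE than BestDefault")
```

Read this way, the item-based configuration lowers the error by roughly 6.5% relative to the default predictor, against about 2% for the user-based one.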
Conclusions

    All approaches, and all their possible options, are tested under exactly the same conditions
    Bayes is a good compromise: low error rate, low execution time, incremental
    Deviation from the mean: better results, new for item-based approaches
    Similarity measures: Pearson for user-based, probabilistic for item-based

Conclusions

    The item-based approach
        gets the best performance in the experiments
        seems to need fewer neighbors than the user-based approach
        is also appropriate for navigating item catalogues, even with no user information
        may naturally use content data about items to improve its results (likewise for the user-based approach with demographic data)
        open question: do its results depend on the number of items compared to the number of users?

Next

    Need to scale well even when faced with huge datasets
    ex: Netflix Prize: 100,480,507 ratings from 480,189 users on 17,770 movies
        select most relevant users [Yu et al., 2002]
        reduce dimensionality with PCA or SVD [Goldberg et al., 2001; Vozalis and Margaritis, 2005]
        create a set of super-users [Rashid et al., 2006]
        sampling? stochastic? bagging?

    Combine approaches ⇒ ensemble methods [Polikar, 2006]

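A quick back-of-the-envelope calculation shows why scale is the issue with the Netflix Prize figures quoted above. The sketch uses only the three counts from the slide. The one-byte-per-cell storage figure and the pair counts are illustrative assumptions about a naive implementation, not something the slides state.

```python
n_ratings, n_users, n_movies = 100_480_507, 480_189, 17_770

# Size and fill rate of the full user-to-item matrix.
cells = n_users * n_movies
density = n_ratings / cells

# Assumption: a naive dense matrix storing one byte per cell.
dense_gib = cells / 2**30

# Number of distinct pairs a naive all-pairs similarity computation would visit.
user_pairs = n_users * (n_users - 1) // 2
item_pairs = n_movies * (n_movies - 1) // 2

print(f"cells: {cells:,}  density: {100 * density:.2f}%")
print(f"dense storage at 1 byte per cell: {dense_gib:.1f} GiB")
print(f"user pairs: {user_pairs:,}  item pairs: {item_pairs:,}")
```

With far fewer movies than users, there are several hundred times fewer item pairs than user pairs, which is one concrete way the ratio of items to users can matter for the cost of each approach.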
References

    P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom and J. Riedl (1994)
    GroupLens: an open architecture for collaborative filtering of netnews
    In Conference on Computer Supported Cooperative Work, pages 175–186. ACM

    J. Breese, D. Heckerman and C. Kadie (1998)
    Empirical analysis of predictive algorithms for collaborative filtering
    In 14th Conference on Uncertainty in Artificial Intelligence, pages 43–52. Morgan Kaufmann

    G. Karypis (2001)
    Evaluation of item-based top-N recommendation algorithms
    In 10th International Conference on Information and Knowledge Management, pages 247–254

    K. Goldberg, T. Roeder, D. Gupta and C. Perkins (2001)
    Eigentaste: a constant time collaborative filtering algorithm
    Information Retrieval, 4(2):133–151

    K. Yu, X. Xu, J. Tao, M. Ester and H. Kriegel (2002)
    Instance selection techniques for memory-based collaborative filtering
    In SIAM Data Mining

    J. Herlocker, J. Konstan, L. Terveen and J. Riedl (2004)
    Evaluating collaborative filtering recommender systems
    ACM Transactions on Information Systems, 22(1):5–53

    G. Adomavicius and A. Tuzhilin (2005)
    Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions
    IEEE Transactions on Knowledge and Data Engineering, 17(6):734–749

    M. Vozalis and K. Margaritis (2005)
    Applying SVD on item-based filtering
    In 5th International Conference on Intelligent Systems Design and Applications, pages 464–469

    A.M. Rashid, S.K. Lam, G. Karypis and J. Riedl (2006)
    ClustKNN: a highly scalable hybrid model- & memory-based CF algorithm
    In KDD Workshop on Web Mining and Web Usage Analysis

    R. Polikar (2006)
    Ensemble systems in decision making
    IEEE Circuits & Systems Magazine, 6(3):21–45
