WE KNOW YOU WILL LIKE THIS
                                 Introduction to Recommendation Engines




Monday, January 14, 13
ML
   Supervised
      Regression:     Y = Turnout (numeric), e.g. 30, 12, 25
      Classification: Y = Class (Categorical), e.g. Spam, Not Spam, Spam
   Unsupervised
      Clustering
      Hierarchical Clustering
                                      Maraboo   Karnaf   Ima Adama   Liv
                         Idan            5         ?         3        ?
                         Shahar          4         3         ?        2
                         Gadi            ?         1         ?        5




                         Recommendation
                            Content/Model-Based (predicting the rating)
                            (Agnostic, Behavioural)
Preference Problem (Ads)




                         Rating Problem (Movies)




Related problem: Ranking




                                      Maraboo   Karnaf   Ima Adama   Liv
                         Idan            1         ?         1        ?
                         Shahar          1         1         ?        1
                         Gadi            ?         1         ?        1




                                      Maraboo   Karnaf   Ima Adama   Liv
                         Idan            5         ?         3        ?
                         Shahar          4         3         ?        2
                         Gadi            ?         1         ?        5




                                      Maraboo   Karnaf   Ima Adama   Liv
                         Idan            1         ?         1        ?
                         Shahar          1         1         ?        1
                         Gadi            ?         1         ?        1




                                      Maraboo   Karnaf   Ima Adama   Liv
                         Idan            5         ?         3        ?
                         Shahar          4         3         ?        2
                         Gadi            ?         1         ?        5




User-based Collaborative Filtering




                         Jaccard Distance                    “We share 5 preferences out of 7!”

                         Euclidean Distance

                         Cosine Similarity

                         Pearson’s Correlation Distance      “Our preferences go in the same direction!”
                         (1 - correlation)                   (but only 2 such preferences do...)

                         Log-Likelihood Ratio                Measure of “Surprise” at correlation
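
The first four measures above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the talk: the function names are hypothetical, the ratings are the ones from the example table, and the choices to compare only co-rated items (Euclidean, Pearson) and to treat a missing rating as 0 (cosine) are assumptions. The log-likelihood ratio is left out.

```python
import math

# Ratings from the example table; a "?" entry is simply absent.
ratings = {
    "Idan":   {"Maraboo": 5, "Ima Adama": 3},
    "Shahar": {"Maraboo": 4, "Karnaf": 3, "Liv": 2},
    "Gadi":   {"Karnaf": 1, "Liv": 5},
}

def jaccard_distance(a, b):
    """1 - (items both users touched) / (items either user touched)."""
    items_a, items_b = set(a), set(b)
    union = items_a | items_b
    if not union:
        return 0.0
    return 1.0 - len(items_a & items_b) / len(union)

def euclidean_distance(a, b):
    """Euclidean distance over co-rated items only (assumption)."""
    shared = set(a) & set(b)
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in shared))

def cosine_similarity(a, b):
    """Cosine of the angle between rating vectors, missing treated as 0."""
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def pearson_distance(a, b):
    """1 - Pearson correlation over co-rated items; None when undefined."""
    shared = sorted(set(a) & set(b))
    n = len(shared)
    if n < 2:
        return None
    xs = [a[i] for i in shared]
    ys = [b[i] for i in shared]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return 1.0 - cov / math.sqrt(var_x * var_y)
```

Shahar and Gadi share only two rated items, so their Pearson correlation can only be +1 or -1. That is the caveat the slide raises about preferences that "go in the same direction" when only two of them overlap.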

Item-Based Collaborative Filtering

          Usually bounded




Case study: Amazon
                         100,000,000 users

                         2,000,000 items

                         Each user expresses preference for 10 items

                         Each item has 500 reviews
                         User-Based CF:                                  Item-Based CF:

                         100,000,000 x 100,000,000 similarity matrix     2,000,000 x 2,000,000 similarity matrix

                         2,000,000 x 500 sum terms                       2,000,000 x 10 sum terms
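
The arithmetic behind this comparison can be checked with a short Python sketch. The variable names are hypothetical; the figures are the ones given on the slide.

```python
# Figures from the case study slide.
users = 100_000_000
items = 2_000_000
prefs_per_user = 10
reviews_per_item = 500

# Both ways of counting the observed preferences give the same total.
total_prefs_by_user = users * prefs_per_user      # 1,000,000,000
total_prefs_by_item = items * reviews_per_item    # 1,000,000,000

# Number of entries in a dense similarity matrix.
user_sim_entries = users * users                  # 10^16
item_sim_entries = items * items                  # 4 * 10^12

# Sum terms needed to score every item for one user.
user_based_sum_terms = items * reviews_per_item   # 1,000,000,000
item_based_sum_terms = items * prefs_per_user     # 20,000,000

print(user_sim_entries // item_sim_entries)          # similarity matrix is 2500x larger
print(user_based_sum_terms // item_based_sum_terms)  # 50x more sum terms
```

With these numbers the item-based similarity matrix is 2,500 times smaller than the user-based one, and scoring needs 50 times fewer sum terms.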

Interpretability




                         “People who go to La Colombe Torrefaction & FourSquare HQ tend to go here”

                         “Coffee Shop connoisseurs tend to come here”


Evaluation
                         Rating Problem: Predictive accuracy (regression) metrics

                            RMSE, MAE, etc.

                         Preference (Binary) Problem: Classification accuracy (IR) metrics

                            Accuracy, Precision, Recall, F-1, ROC, etc.

                            Benchmark vs. ‘random’ and ‘popular’

                         Ranking accuracy metrics: Similarity of permutations

                            Pearson’s correlation, Spearman’s rho, Kendall’s tau
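
One metric from each family above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the Kendall's tau shown is the tau-a variant, which assumes both rankings cover the same items with no ties.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and true ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def mae(predicted, actual):
    """Mean absolute error between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def precision_recall_f1(recommended, relevant):
    """IR metrics for a binary preference problem, computed on item sets."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a: (concordant - discordant) pairs / all pairs."""
    names = list(rank_a)
    concordant = discordant = 0
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            x, y = names[i], names[j]
            sign = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    pairs = len(names) * (len(names) - 1) / 2
    return (concordant - discordant) / pairs
```

To benchmark against 'random' and 'popular', the same functions can be applied to the output of a recommender that picks items at random and one that always returns the most popular items.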

Challenges

                         Cold-start problems (new item, new user)

                         “Black” and “Grey” sheep

                         Exploration-exploitation and reinforcement learning

                         Scale




Advanced Topics

                         Dimensionality Reduction

                         Map-Reducible calculations

                         Content-based (feature-based)

                         Multiple models




MapReduce Similarity Calculation
                         “User-based”

                         A (users x items):
                                      Maraboo   Karnaf   Ima Adama   Liv
                         Idan            1         ?         1        ?
                         Shahar          1         1         ?        1
                         Gadi            ?         1         ?        1

                         u_i (Gadi):
                         Maraboo      ?
                         Karnaf       1
                         Ima Adama    ?
                         Liv          1

                         A * u_i = A u_i (user similarity vector):
                         Idan         0
                         Shahar       2
                         Gadi         2

                         A^T (items x users):
                                      Idan   Shahar   Gadi
                         Maraboo       1       1       ?
                         Karnaf        ?       1       1
                         Ima Adama     1       ?       ?
                         Liv           ?       1       1

                         A^T * (A u_i) = A^T(A u_i):
                         Maraboo      2
                         Karnaf       4
                         Ima Adama    0
                         Liv          4
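
The two products on this slide can be reproduced with a small Python sketch. The helper names are hypothetical, and the sketch assumes a "?" entry counts as 0, which is what the slide's numbers imply.

```python
users = ["Idan", "Shahar", "Gadi"]
items = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Preference matrix A (users x items); "?" is stored as 0 (assumption).
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(r * v for r, v in zip(row, vector)) for row in matrix]

def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]

# u_i: Gadi's own row of preferences over the items.
u_gadi = A[users.index("Gadi")]

# Step 1: A u_i counts the items each user shares with Gadi.
user_similarity = mat_vec(A, u_gadi)

# Step 2: A^T (A u_i) weights every user's items by that similarity.
item_scores = mat_vec(transpose(A), user_similarity)

print(dict(zip(users, user_similarity)))
print(dict(zip(items, item_scores)))
```

Gadi's own row contributes to the scores here, exactly as on the slide. A real recommender would usually exclude items the user has already expressed a preference for before ranking.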




MapReduce Similarity Calculation
                         “Item-Based”

                         A^T (items x users):
                                      Idan   Shahar   Gadi
                         Maraboo       1       1       ?
                         Karnaf        ?       1       1
                         Ima Adama     1       ?       ?
                         Liv           ?       1       1

                         A (users x items):
                                      Maraboo   Karnaf   Ima Adama   Liv
                         Idan            1         ?         1        ?
                         Shahar          1         1         ?        1
                         Gadi            ?         1         ?        1

                         A^T * A = A^T A (item similarity matrix):
                                      Maraboo   Karnaf   Ima Adama   Liv
                         Maraboo         2         1         1        1
                         Karnaf          1         2         0        2
                         Ima Adama       1         0         1        0
                         Liv             1         2         0        2

                         u_i (Gadi):
                         Maraboo      ?
                         Karnaf       1
                         Ima Adama    ?
                         Liv          1

                         (A^T A) * u_i = (A^T A) u_i:
                         Maraboo      2
                         Karnaf       4
                         Ima Adama    0
                         Liv          4

                         Similarity of item x to item y is <i_x, i_y>
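
A short Python sketch of the item-based route: build the item similarity matrix from inner products of item columns, then apply it to one user's preference vector. The names are hypothetical and a "?" entry is assumed to count as 0.

```python
items = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Preference matrix A (users x items); "?" is stored as 0 (assumption).
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]

def dot(a, b):
    """Inner product <a, b> of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Item vector i_x: the column of A for item x, i.e. which users chose it.
item_vectors = [[row[j] for row in A] for j in range(len(items))]

# (A^T A)[x][y] = <i_x, i_y>: how many users chose both x and y.
item_similarity = [[dot(ix, iy) for iy in item_vectors] for ix in item_vectors]

# Score every item for Gadi: (A^T A) u_i.
u_gadi = A[2]
item_scores = [dot(row, u_gadi) for row in item_similarity]

for name, row in zip(items, item_similarity):
    print(name, row)
print(dict(zip(items, item_scores)))
```

The scores match the user-based result A^T(A u_i), because matrix multiplication is associative. The two routes differ only in which intermediate result is computed and stored.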

MapReduce Similarity Calculation
                         Recall row outer-product matrix multiplication:

                         A^T A = u_Idan u_Idan^T + u_Shahar u_Shahar^T + u_Gadi u_Gadi^T

                         A^T A:
                                      Maraboo   Karnaf   Ima Adama   Liv
                         Maraboo         2         1         1        1
                         Karnaf          1         2         0        2
                         Ima Adama       1         0         1        0
                         Liv             1         2         0        2

                         u_Idan u_Idan^T:
                                      Maraboo   Karnaf   Ima Adama   Liv
                         Maraboo         1         0         1        0
                         Karnaf          0         0         0        0
                         Ima Adama       1         0         1        0
                         Liv             0         0         0        0

                         u_Shahar u_Shahar^T:
                                      Maraboo   Karnaf   Ima Adama   Liv
                         Maraboo         1         1         0        1
                         Karnaf          1         1         0        1
                         Ima Adama       0         0         0        0
                         Liv             1         1         0        1

                         u_Gadi u_Gadi^T:
                                      Maraboo   Karnaf   Ima Adama   Liv
                         Maraboo         0         0         0        0
                         Karnaf          0         1         0        1
                         Ima Adama       0         0         0        0
                         Liv             0         1         0        1

                         Only one user’s list of items is used every time!
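
Because each outer product needs only one user's item list, the sum maps naturally onto a map step and a reduce step. The following is an in-memory Python sketch of that idea, not an actual Hadoop or Mahout job; `map_user` and `reduce_counts` are hypothetical names.

```python
from collections import defaultdict
from itertools import product

# Each user's list of preferred items (the 1-entries of that user's row of A).
user_items = {
    "Idan":   ["Maraboo", "Ima Adama"],
    "Shahar": ["Maraboo", "Karnaf", "Liv"],
    "Gadi":   ["Karnaf", "Liv"],
}

def map_user(item_list):
    """Map step: emit ((x, y), 1) for every non-zero cell of u u^T.

    Only this one user's item list is needed, so users can be
    processed independently and in parallel.
    """
    for x, y in product(item_list, repeat=2):
        yield (x, y), 1

def reduce_counts(pairs):
    """Reduce step: sum the emitted values per (x, y) key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

emitted = [pair for item_list in user_items.values() for pair in map_user(item_list)]
cooccurrence = reduce_counts(emitted)  # the non-zero entries of A^T A

print(cooccurrence[("Karnaf", "Liv")])
print(cooccurrence.get(("Ima Adama", "Karnaf"), 0))
```

Zero entries of A^T A are never emitted, so the result stays sparse. A pair such as (Ima Adama, Karnaf) that no user shares is simply absent from the output.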

MapReduce Similarity Calculation

                         All of the classic similarity functions are
                         made up of 3 stages:

                            Preprocess (uses only one ELEMENT)

                            Norm (can be done in reduce on one VECTOR)

                            Similarity utilizes the A^T A matrix joined
                            with norm entries
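
The three stages can be sketched for cosine similarity between items. This is an illustration under assumptions: the talk does not say which similarity function or which vectors it has in mind, so the sketch takes the vectors to be item columns of A, uses a hypothetical `preprocess` function, and treats "?" as 0.

```python
import math

items = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Preference matrix A (users x items); "?" is stored as 0 (assumption).
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]

def preprocess(value):
    """Stage 1: looks at a single ELEMENT (here: binarize it)."""
    return 1.0 if value else 0.0

# Item vectors after preprocessing: one column of A per item.
columns = [[preprocess(row[j]) for row in A] for j in range(len(items))]

# Stage 2: one norm per VECTOR, computable from that vector alone.
norms = [math.sqrt(sum(v * v for v in column)) for column in columns]

# Stage 3: entries of A^T A joined with the two matching norm entries.
dot_products = [[sum(x * y for x, y in zip(cx, cy)) for cy in columns] for cx in columns]
cosine = [
    [
        dot_products[x][y] / (norms[x] * norms[y]) if norms[x] and norms[y] else 0.0
        for y in range(len(items))
    ]
    for x in range(len(items))
]

for name, row in zip(items, cosine):
    print(name, [round(value, 3) for value in row])
```

Other measures would fit the same pattern by swapping the per-element and per-vector steps. For example, centering each element before the norm step turns the final stage into a Pearson correlation.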


Bibliography
                         Google News Personalization: Scalable Online Collaborative Filtering - Das, Datar, Garg, Rajaram, WWW2007

                         Logistic Regression and Collaborative Filtering for Sponsored Search Term Recommendation - Bartz, Murthi, Sebastian, EC2006

                         Evaluating Collaborative Filtering Recommender Systems - Herlocker, Konstan, Terveen, Riedl, ACM TOIS2004

                         A Survey of Collaborative Filtering Techniques - Su, Khoshgoftaar, AAI2009

                         An Introduction to Information Retrieval - Manning, Raghavan, Schutze, Cambridge Press

                         Mahout in Action - Friedman, Dunning, Anil, Owen, Manning Publications

                         Lessons from the Netflix Prize Challenge - Bell, Koren, KDD2009

                         Factorization meets the Neighbourhood: a Multifaceted Collaborative Filtering Model - Koren, KDD2008

                         Accurate Methods for the Statistics of Surprise and Coincidence - Dunning, ACL1993

                         Item-Based Collaborative Filtering Recommendation Algorithms - Sarwar, Konstan, Karypis, Riedl, WWW2001

                         Matrix Factorization Techniques for Recommender Systems - Koren, Bell, Volinsky, IEEE2009

                         recommenderlab: A Framework for Developing and Testing Recommendation Algorithms - Hahsler, 2011

                         Scalable Similarity-Based Neighbourhood Methods with MapReduce - Schelter, Boden, Markl, RecSys2012



Thanks!


                         Nimrod Priell
                         nimrod.priell@gmail.com
                         @nimrodpriell
                         http://www.educated-guess.com




Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Collaborative filtering intro - Full

  • 1. WE KNOW YOU WILL LIKE THIS Introduction to Recommendation Engines Monday, January 14, 13
  • 2. ML: Supervised vs. Unsupervised. Supervised: Regression (Y numeric, e.g. Turnout: 30, 12, 25) and Classification (Y categorical, e.g. Class: Spam, Not Spam, Spam). Unsupervised: Clustering, Hierarchical Clustering
  • 3. Recommendation: Content/Model-Based (predicting the rating); (Agnostic, Behavioural). Example rating matrix ("?" marks an unknown rating):
                 Maraboo  Karnaf  Ima Adama  Liv
       Idan         5       ?        3        ?
       Shahar       4       3        ?        2
       Gadi         ?       1        ?        5
  • 12. Preference Problem (Ads); Rating Problem (Movies)
  • 15. Preference (binary) matrix:
                 Maraboo  Karnaf  Ima Adama  Liv
       Idan         1       ?        1        ?
       Shahar       1       1        ?        1
       Gadi         ?       1        ?        1
       Rating matrix:
                 Maraboo  Karnaf  Ima Adama  Liv
       Idan         5       ?        3        ?
       Shahar       4       3        ?        2
       Gadi         ?       1        ?        5
  • 16. Preference (binary) matrix:
                 Maraboo  Karnaf  Ima Adama  Liv
       Idan         1       ?        1        ?
       Shahar       1       1        ?        1
       Gadi         ?       1        ?        1
  • 17. Rating matrix:
                 Maraboo  Karnaf  Ima Adama  Liv
       Idan         5       ?        3        ?
       Shahar       4       3        ?        2
       Gadi         ?       1        ?        5
  • 20. Jaccard Distance (“We share 5 preferences out of 7!”); Euclidean Distance; Cosine Similarity; Pearson’s Correlation (“Our preferences go in the same direction!”, but only 2 such preferences do...); 1 - Distance; Log-Likelihood Ratio (measure of “surprise” at correlation)
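The measures named on slide 20 can be sketched in a few lines. This is a minimal illustration, not the deck's own code: the function names are hypothetical, unknown entries ("?") are simply left out, and the example users come from the preference and rating matrices on slides 15 to 17.

```python
from math import sqrt


def jaccard_similarity(a: set, b: set) -> float:
    """Shared preferences divided by all preferences held by either user."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_distance(a: set, b: set) -> float:
    """Distance form: 1 minus the similarity."""
    return 1.0 - jaccard_similarity(a, b)


def cosine_similarity(x: list, y: list) -> float:
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(p * q for p, q in zip(x, y))
    norm = sqrt(sum(p * p for p in x)) * sqrt(sum(q * q for q in y))
    return dot / norm if norm else 0.0


def pearson_correlation(x: list, y: list) -> float:
    """Cosine similarity of the mean-centred vectors."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return cosine_similarity([p - mean_x for p in x], [q - mean_y for q in y])


# Binary preferences of Shahar and Gadi (items marked 1 in the matrix).
shahar = {"Maraboo", "Karnaf", "Liv"}
gadi = {"Karnaf", "Liv"}
shared_over_all = jaccard_similarity(shahar, gadi)  # 2 shared out of 3

# Ratings on the two items both users rated: Karnaf and Liv.
shahar_ratings = [3, 2]
gadi_ratings = [1, 5]
direction = pearson_correlation(shahar_ratings, gadi_ratings)

print(shared_over_all, direction)
```

With only two co-rated items, Pearson's correlation is forced to +1 or -1, which is the slide's caveat about a correlation resting on very few shared preferences.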
  • 21. Item-Based Collaborative Filtering: Usually bounded
  • 22. Case study: Amazon. 100,000,000 users; 2,000,000 items; each user expresses preference for 10 items; each item has 500 reviews
       User-Based CF: 100,000,000 x 100,000,000 similarity matrix; 2,000,000 x 500 sum terms
       Item-Based CF: 2,000,000 x 2,000,000 similarity matrix; 2,000,000 x 10 sum terms
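The sizes on slide 22 are easier to compare once multiplied out. The sketch below only redoes the slide's arithmetic; the variable names are made up for illustration.

```python
users = 100_000_000
items = 2_000_000
prefs_per_user = 10
reviews_per_item = 500

# Entries in a full pairwise similarity matrix.
user_similarity_entries = users * users   # user-based CF
item_similarity_entries = items * items   # item-based CF

# "Sum terms" as given on the slide.
user_based_sum_terms = items * reviews_per_item
item_based_sum_terms = items * prefs_per_user

print(f"user-based matrix: {user_similarity_entries:.1e} entries")
print(f"item-based matrix: {item_similarity_entries:.1e} entries")
print(f"matrix size ratio: {user_similarity_entries // item_similarity_entries}x")
print(f"sum terms ratio:   {user_based_sum_terms // item_based_sum_terms}x")
```

Under these numbers the item-based similarity matrix is 2,500 times smaller and needs 50 times fewer sum terms.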
  • 23. Interpretability: “People who go to La Colombe Torrefaction & FourSquare HQ tend to come here”; “Coffee Shop connoisseurs tend to go here”
  • 24. Evaluation
       Rating Problem: predictive accuracy (regression) metrics: RMSE, MAE, etc.
       Preference (Binary) Problem: classification accuracy (IR) metrics: Accuracy, Precision, Recall, F-1, ROC, etc. Benchmark vs. ‘random’ and ‘popular’
       Ranking accuracy metrics: similarity of permutations: Pearson’s correlation, Spearman’s rho, Kendall’s tau
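A few of the metrics listed on slide 24 can be written directly from their definitions. This is a sketch with hypothetical function names and made-up sample data, not an evaluation of any real recommender.

```python
from math import sqrt


def rmse(actual: list, predicted: list) -> float:
    """Root mean squared error for the rating problem."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: list, predicted: list) -> float:
    """Mean absolute error for the rating problem."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def precision_recall_f1(relevant: set, recommended: set) -> tuple:
    """Classification (IR) metrics for the binary preference problem."""
    hits = len(relevant & recommended)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up held-out ratings and predictions.
actual_ratings = [5, 3, 4]
predicted_ratings = [4, 3, 2]
print(rmse(actual_ratings, predicted_ratings), mae(actual_ratings, predicted_ratings))

# Made-up held-out preferences and a recommendation list.
liked = {"Karnaf", "Liv"}
recommended = {"Karnaf", "Maraboo"}
print(precision_recall_f1(liked, recommended))
```

The same functions can score a ‘random’ or ‘popular’ baseline on the same held-out data, which gives the benchmark the slide asks for.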
  • 26. Challenges: Cold-start problems (new item, new user); “Black” and “Grey” sheep; Exploration-exploitation and reinforcement learning; Scale
  • 27. Advanced Topics: Dimensionality Reduction; Map-Reducible calculations; Content-based (feature-based); Multiple models
  • 28. MapReduce Similarity Calculation: “User-based”
       A (users x items):
                 Maraboo  Karnaf  Ima Adama  Liv
       Idan         1       ?        1        ?
       Shahar       1       1        ?        1
       Gadi         ?       1        ?        1
       u_i (Gadi): Maraboo ?, Karnaf 1, Ima Adama ?, Liv 1
       A u_i (user similarity vector): Idan 0, Shahar 2, Gadi 2
       A^T (items x users):
                    Idan  Shahar  Gadi
       Maraboo        1     1      ?
       Karnaf         ?     1      1
       Ima Adama      1     ?      ?
       Liv            ?     1      1
       A^T (A u_i): Maraboo 2, Karnaf 4, Ima Adama 0, Liv 4
  • 29. MapReduce Similarity Calculation: “Item-Based”
       A^T (items x users):
                    Idan  Shahar  Gadi
       Maraboo        1     1      ?
       Karnaf         ?     1      1
       Ima Adama      1     ?      ?
       Liv            ?     1      1
       A (users x items):
                 Maraboo  Karnaf  Ima Adama  Liv
       Idan         1       ?        1        ?
       Shahar       1       1        ?        1
       Gadi         ?       1        ?        1
       A^T A (item similarity matrix):
                    Maraboo  Karnaf  Ima Adama  Liv
       Maraboo         2       1        1        1
       Karnaf          1       2        0        2
       Ima Adama       1       0        1        0
       Liv             1       2        0        2
       u_i (Gadi): Maraboo ?, Karnaf 1, Ima Adama ?, Liv 1
       (A^T A) u_i: Maraboo 2, Karnaf 4, Ima Adama 0, Liv 4
       Similarity of item x to item y is <i_x, i_y>
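The user-based and item-based calculations on slides 28 and 29 can be checked with a small single-machine sketch. It assumes unknown entries ("?") are treated as 0, and the helper names are made up for illustration; nothing here is distributed.

```python
USERS = ["Idan", "Shahar", "Gadi"]
ITEMS = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Binary preference matrix A (users x items); "?" is assumed to be 0.
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]


def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def mat_mul(x, y):
    y_cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols] for row in x]


u_gadi = A[USERS.index("Gadi")]

# User-based: similarity of every user to Gadi, then scores per item.
user_similarity = mat_vec(A, u_gadi)
user_based_scores = mat_vec(transpose(A), user_similarity)

# Item-based: item-item similarity matrix, then scores per item.
item_similarity = mat_mul(transpose(A), A)
item_based_scores = mat_vec(item_similarity, u_gadi)

print(dict(zip(USERS, user_similarity)))
print(dict(zip(ITEMS, user_based_scores)))
print(dict(zip(ITEMS, item_based_scores)))
```

Both routes give the same item scores because A^T (A u_i) = (A^T A) u_i; they differ only in which intermediate result is materialised.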
  • 30. MapReduce Similarity Calculation: Recall row outer-product matrix multiplication:
       A^T A = u_Idan u_Idan^T + u_Shahar u_Shahar^T + u_Gadi u_Gadi^T
       A^T A:
                    Maraboo  Karnaf  Ima Adama  Liv
       Maraboo         2       1        1        1
       Karnaf          1       2        0        2
       Ima Adama       1       0        1        0
       Liv             1       2        0        2
       u_Idan u_Idan^T:
                    Maraboo  Karnaf  Ima Adama  Liv
       Maraboo         1       0        1        0
       Karnaf          0       0        0        0
       Ima Adama       1       0        1        0
       Liv             0       0        0        0
       u_Shahar u_Shahar^T:
                    Maraboo  Karnaf  Ima Adama  Liv
       Maraboo         1       1        0        1
       Karnaf          1       1        0        1
       Ima Adama       0       0        0        0
       Liv             1       1        0        1
       u_Gadi u_Gadi^T:
                    Maraboo  Karnaf  Ima Adama  Liv
       Maraboo         0       0        0        0
       Karnaf          0       1        0        1
       Ima Adama       0       0        0        0
       Liv             0       1        0        1
       Only one user’s list of items is used every time!
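The outer-product decomposition on slide 30 maps naturally onto a map step (one user's row at a time) and a reduce step (summing the partial matrices). The sketch below imitates that shape in plain Python with `functools.reduce`; it is an in-memory illustration with hypothetical helper names, with "?" assumed to be 0, not an actual MapReduce job.

```python
from functools import reduce

# Binary preference matrix A (users x items); "?" is assumed to be 0.
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]


def outer(u):
    """Outer product u u^T, built from a single user's item vector."""
    return [[a * b for b in u] for a in u]


def mat_add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


# "Map": each user contributes one partial matrix, using only that user's row.
partials = [outer(row) for row in A]

# "Reduce": summing the partial matrices yields A^T A.
item_similarity = reduce(mat_add, partials)

for row in item_similarity:
    print(row)
```

Because each partial matrix depends on a single row of A, the map step needs no shared state between users.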
  • 31. MapReduce Similarity Calculation: All of the classic similarity functions are made up of 3 stages: Preprocess (uses only one ELEMENT); Norm (can be done in reduce on one VECTOR); Similarity (utilizes the A^T A matrix joined with norm entries)
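As one possible reading of the three stages on slide 31, cosine similarity between items can be split into an element-wise preprocess, a per-vector norm, and a final step that joins the A^T A dot products with the norms. The stage functions below are hypothetical names chosen for this sketch, and "?" is again assumed to be 0.

```python
from math import sqrt

ITEMS = ["Maraboo", "Karnaf", "Ima Adama", "Liv"]

# Binary preference matrix A (users x items); "?" is assumed to be 0.
A = [
    [1, 0, 1, 0],  # Idan
    [1, 1, 0, 1],  # Shahar
    [0, 1, 0, 1],  # Gadi
]


def preprocess(value):
    """Stage 1: looks at one element only (identity for cosine)."""
    return float(value)


def norm(vector):
    """Stage 2: looks at one item vector only (Euclidean length)."""
    return sqrt(sum(v * v for v in vector))


def similarity(dot, norm_x, norm_y):
    """Stage 3: joins one A^T A entry with the two matching norms."""
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


item_vectors = [[preprocess(v) for v in col] for col in zip(*A)]
norms = [norm(vec) for vec in item_vectors]
dots = [[sum(a * b for a, b in zip(x, y)) for y in item_vectors] for x in item_vectors]
cosine = [
    [similarity(dots[i][j], norms[i], norms[j]) for j in range(len(ITEMS))]
    for i in range(len(ITEMS))
]

for name, row in zip(ITEMS, cosine):
    print(name, [round(v, 3) for v in row])
```

Other measures would swap in different stage functions (for example, mean-centring in the preprocess for Pearson's correlation) while keeping the same three-step shape.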
  • 32. Bibliography
       Google News Personalization: Scalable Online Collaborative Filtering - Das, Datar, Garg, Rajaram, WWW 2007
       Logistic Regression and Collaborative Filtering for Sponsored Search Term Recommendation - Bartz, Murthi, Sebastian, EC 2006
       Evaluating Collaborative Filtering Recommender Systems - Herlocker, Konstan, Terveen, Riedl, ACM TOIS 2004
       A Survey of Collaborative Filtering Techniques - Su, Khoshgoftaar, AAI 2009
       An Introduction to Information Retrieval - Manning, Raghavan, Schutze, Cambridge Press
       Mahout in Action - Friedman, Dunning, Anil, Owen, Manning Publications
       Lessons from the Netflix Prize Challenge - Bell, Koren, KDD 2009
       Factorization meets the Neighbourhood: a Multifaceted Collaborative Filtering Model - Koren, KDD 2008
       Accurate Methods for the Statistics of Surprise and Coincidence - Dunning, ACL 1993
       Item-Based Collaborative Filtering Recommendation Algorithms - Sarwar, Konstan, Karypis, Riedl, WWW 2001
       Matrix Factorization Techniques for Recommender Systems - Koren, Bell, Volinsky, IEEE 2009
       recommenderlab: A Framework for Developing and Testing Recommendation Algorithms - Hahsler, 2001
       Scalable Similarity-Based Neighbourhood Methods with MapReduce - Schelter, Boden, Markl, RecSys 2012
  • 33. Thanks! Nimrod Priell nimrod.priell@gmail.com @nimrodpriell http://www.educated-guess.com